Closed by gnott 7 years ago
Volumes are currently hard-set according to the year of publication, so what we really need to know is how to set the publication date for a PoA article if it has not already been set (i.e. if it is not a PoA article that is being pushed through for a resupply or a repopulation of a site).
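A minimal sketch of the "volume follows publication year" rule described above, in Python. The function name and the `FIRST_VOLUME_YEAR` constant are hypothetical and not taken from jats-scraper; the base year of 2012 is an assumption that should be checked against the live site. When no publication date is set (the PoA case), the sketch returns `None` rather than guessing a volume.

```python
from datetime import date
from typing import Optional

# Assumption: volume 1 corresponds to publication year 2012.
# Change this constant if the real numbering starts elsewhere.
FIRST_VOLUME_YEAR = 2012


def volume_from_pub_date(pub_date: Optional[date]) -> Optional[int]:
    """Return the volume implied by the publication year.

    Returns None when no publication date has been set yet,
    which is the situation for a PoA article before publication.
    """
    if pub_date is None:
        return None
    return pub_date.year - FIRST_VOLUME_YEAR + 1


if __name__ == "__main__":
    print(volume_from_pub_date(date(2014, 5, 1)))  # 3
    print(volume_from_pub_date(None))              # None
```

Returning `None` keeps the decision about what to do with an undated PoA article in the caller, which is the open question in this thread.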
See https://github.com/elifesciences/elife-vendor-workflow-config/issues/118
It is now a task for the archive clean-up to add the volume and pub date into the PoA XML. I will update the XML sample to include that.
https://docs.google.com/spreadsheets/d/1nFwQB0USPfJDLPPOYyIe6dTo6XsfJTBrhuPwghD9ZMc/edit#gid=0
See that for url structure
One to-do arising from the Google sheet: the URLs have "v1" or "v2" in them.
The jats-scraper when creating node paths should include this version value in the path, unless the Drupal site will be altering them after ingest.
It is my preference that the Drupal site not alter the paths after ingest. jats-scraper should be capable of supplying the appropriate URLs. I believe it can at the moment. The benefit of jats-scraper being able to generate both the existing format of URLs (those on the current live site) and the desired URLs is that we could use the scraper to help us generate a 301 redirect table for Apache or nginx.
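To illustrate the redirect-table idea, here is a rough sketch that turns a mapping of old paths to new paths into Apache (`Redirect 301`, from mod_alias) or nginx (`location = ... { return 301 ...; }`) lines. The function names and the example paths are hypothetical; in practice the mapping would come from whatever jats-scraper produces for the current and preferred paths.

```python
from typing import Dict, List


def apache_redirects(path_map: Dict[str, str]) -> List[str]:
    """Build one mod_alias `Redirect 301` line per old path."""
    return [f"Redirect 301 {old} {new}" for old, new in sorted(path_map.items())]


def nginx_redirects(path_map: Dict[str, str]) -> List[str]:
    """Build one exact-match nginx location block per old path."""
    return [
        f"location = {old} {{ return 301 {new}; }}"
        for old, new in sorted(path_map.items())
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only; not the real site structure.
    path_map = {"/old/path/e00001": "/new/path/e00001v1"}
    print("\n".join(apache_redirects(path_map)))
    print("\n".join(nginx_redirects(path_map)))
```

Sorting the mapping keeps the generated table stable between runs, which makes diffs of the redirect file easier to review.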
Now that we have confirmed we can generate the current paths, we should preserve that code, but for the generation of the eif-format json we should now be using the preferred paths.
We could create a task for that in Jira and schedule it for the next sprint, perhaps?
It should indeed be able to, and it makes sense that it does, since those paths are then available in the other places fed by the process.
Looking at the code, I can't see any versions in paths being produced, either for the article or for other assets (e.g. images), so we'll need to add this.
We may have been waiting for that confirmation from Scholar?
Google Scholar have indicated that our suggested URL paths are OK. We are aiming to give them preview access to the site about a month before we go live and they have said that they will try to crawl it to make sure there are not problems that they can identify.
@jhroot the versions in paths may still need to be added, but the correct stubs for fragments have been done; they are just not supplied to the eif-format json yet. Graham did some work on this front.
@nlisgo I'm not sure what you mean, sorry. ('correct stubs to fragments'). Maybe a terminology issue
@jhroot I just mean the prefix and ordinal output in the url that is associated with the fragment.
Old style: F6 (6th figure or figure supplement regardless of hierarchy)
New style: figure2
stubs is probably the wrong word.
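To make the difference concrete, here is a small sketch of the two numbering schemes. The old style uses one flat counter across figures and figure supplements; the new style counts top-level figures separately. The function name is hypothetical, and the new-style naming for supplements (`figureN-supplementM`) is purely an assumption, since only the top-level form is given here.

```python
from typing import List, Tuple


def fragment_ids(kinds: List[str]) -> List[Tuple[str, str]]:
    """Return (old_style, new_style) ids for fragments in document order.

    Each entry of `kinds` is "figure" or "supplement"; a supplement is
    taken to belong to the most recent figure.
    """
    pairs = []
    flat = 0        # old style: single counter, hierarchy ignored
    figure = 0      # new style: counter for top-level figures only
    supplement = 0  # new style: counter within the current figure
    for kind in kinds:
        flat += 1
        if kind == "figure":
            figure += 1
            supplement = 0
            new = f"figure{figure}"
        elif kind == "supplement":
            if figure == 0:
                raise ValueError("supplement appears before any figure")
            supplement += 1
            # Assumption: this supplement naming is invented for illustration.
            new = f"figure{figure}-supplement{supplement}"
        else:
            raise ValueError(f"unknown fragment kind: {kind}")
        pairs.append((f"F{flat}", new))
    return pairs


if __name__ == "__main__":
    # One figure with four supplements, followed by a second figure.
    order = ["figure"] + ["supplement"] * 4 + ["figure"]
    print(fragment_ids(order)[-1])  # ('F6', 'figure2')
```

With one figure, four supplements, and then a second figure, the second figure is `F6` in the old style and `figure2` in the new style, matching the example above.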
I am working on jats-scraper to scrape PoA XML. What is the proposed URL or ingest path for PoA articles?
PoA XML has no <volume> element, and no pub date. The existing VoR path expects the article to have a volume to generate the path, for example where 3 is the volume.
PoA paths right now are like this
If we need a path before the pub date is actually set, in order to ingest the article, a possible solution could be