Closed gnott closed 5 years ago
I've worked with systems where the DOI isn't the preferred article ID or is one of multiple. It got messy in the inevitable upgrade to support multiple IDs.
I would suggest including all data we have available and assuming we'll see multiple article-id elements:
OrderedDict([
    ('event_type', 'preprint-publication'),
    ('event_desc', 'This article was originally published as a <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/118356">preprint</ext-link> on bioRxiv.'),
    ('event_desc_html', 'This article was originally published as a <a href="https://doi.org/10.1101/118356">preprint</a> on bioRxiv.'),
    ('uri', 'https://doi.org/10.1101/118356'),
    ('uri_text', 'preprint'),
    # one entry per article-id element found in the event
    ('id_list', [
        OrderedDict([
            ('type', 'doi'),
            ('value', '10.1101/118356'),
            ('assigning-authority', 'crossref'),
        ]),
    ]),
    ('day', '24'),
    ('month', '03'),
    ('year', '2017'),
    ('date', date_struct(2017, 3, 24)),
    ('iso-8601-date', '2017-03-24'),
])
or something similar.
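As a rough illustration of how an id_list like the one above could be collected, here is a minimal sketch using only the standard library. The function name event_id_list, the sample XML, and the second article-id value are made up for illustration, and reading the pub-id-type and assigning-authority attributes is an assumption about how the XML is marked up, not a description of the actual parser code.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET

# Hypothetical sample: the second <article-id> is invented to show the multiple-ID case.
SAMPLE_EVENT = """
<event>
  <event-desc>This article was originally published as a preprint on bioRxiv.</event-desc>
  <article-id pub-id-type="doi" assigning-authority="crossref">10.1101/118356</article-id>
  <article-id pub-id-type="other">example-id-2</article-id>
</event>
"""


def event_id_list(event_xml):
    """Collect every <article-id> directly inside an <event> as a list of OrderedDicts."""
    event = ET.fromstring(event_xml)
    id_list = []
    for tag in event.findall("article-id"):
        id_data = OrderedDict()
        id_data["type"] = tag.get("pub-id-type")
        id_data["value"] = (tag.text or "").strip()
        # Only record the assigning authority when the attribute is present.
        authority = tag.get("assigning-authority")
        if authority:
            id_data["assigning-authority"] = authority
        id_list.append(id_data)
    return id_list


ids = event_id_list(SAMPLE_EVENT)
```

Returning a list even when there is a single ID means consumers never have to special-case the one-ID event.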
from the main ticket, ages ago:
Also, I don't know whether this would happen, but what if someone put a preprint in many locations, and they would all be version 0? This would not work.
which would mean that each location the article lived at prior to publication with eLife would have its own ID, and not necessarily a DOI issued by Crossref
I think in practice an <event> in <pub-history> will not often have multiple <article-id> tags, but I'm happy to change the id data into a list as you described it, thanks! The output of this function is not used yet, so it is an easy time to do it.
We can ignore the other values we get from <article-id> in the article, sub-article, or citations, for now, until we also want to expose an id_list for those data structures.
which would mean that each location the article lived at prior to publication with eLife would have its own ID, and not necessarily a DOI issued by Crossref
I think if an article has multiple preprints or multiple versions, each will get its own <event> tag inside the <pub-history> tag. How these are stored or displayed on journal remains unknown to me, though, which is where I think the concept of a version 0 originated.
Also, I suspect not every preprint would have a DOI; some may have only a URI. I believe the recent XML change was to reflect how bioRxiv specifically will assign DOIs to articles, so eLife XML can additionally specify their location by DOI (as well as by the URI in the <event-desc>).
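For whoever consumes this data, a sketch of handling the "DOI if there is one, otherwise just the URI" case might look like the following. The function name preferred_location and the example.org URI are hypothetical; only the id_list, type, value and uri keys come from the structure discussed in this thread.

```python
def preferred_location(event):
    """Return (kind, value): the first DOI in id_list if present, else the URI, else None."""
    for id_data in event.get("id_list", []):
        if id_data.get("type") == "doi":
            return ("doi", id_data["value"])
    # No DOI recorded for this event, so fall back to the plain URI.
    if event.get("uri"):
        return ("uri", event["uri"])
    return None


# An event with a DOI, as in the bioRxiv example from this thread.
with_doi = {
    "uri": "https://doi.org/10.1101/118356",
    "id_list": [{"type": "doi", "value": "10.1101/118356"}],
}
# A made-up event for a preprint that only has a URI.
uri_only = {"uri": "https://example.org/preprint/1", "id_list": []}

doi_result = preferred_location(with_doi)
uri_result = preferred_location(uri_only)
```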
How these are stored or displayed on journal remains unknown to me, though, which is where I think the concept of a version 0 originated.
It was going to be part of the article's publication history, except with slightly more detail than each item currently has.
@lsh-0 do you have any additional comments before this PR is merged? I think I addressed the possible multiple id values that an element may have.
if the multiple ID values are addressed it should be good to go
👍
Re: issue https://github.com/elifesciences/issues/issues/4284, there's a new XML sample that includes a doi. I added it to the pub_history() output and added a new XML sample for the latest XML. You might want to review this, @lsh-0; I think you're the next possible user of this data.