elifesciences / elife-pubmed-feed

code to support uploading feeds to pubmed for POA articles and VOR articles

replaces tag #10

Closed Melissa37 closed 9 years ago

Melissa37 commented 10 years ago

Does not exist in original PoA deposit, but should be used in V2s and VoRs.

For V2s, it will just update the metadata, as the PubStatus="aheadofprint". When VoRs are submitted, the PubStatus is updated to "epublish", and this is the trigger to PubMed that the article is finalised and that they can move it on to the fixed status on PubMed.

Replacement Files can be used for two purposes: updating an Ahead of Print (AOP) citation or correcting a citation currently in [PubMed - as supplied by publisher] status.

AOP citations eventually become "published" citations by way of the publisher sending a Replacement XML file with completed citation information. These replacement files must use the PubStatus attribute value "ppublish" or "epublish" in the `<PubDate>` tag in order to replace the AOP citation.

Take the following steps to update an AOP citation:

1. Update the AOP citation file, adding the finalized citation information.
2. Add the final publication date to the `<PubDate>` tag, along with the PubStatus="ppublish" or "epublish" attribute. The publication date must be exactly as it appears on the finalized article. See "How should the Publication Date be submitted?"
3. "Move" the existing PubDate with PubStatus="aheadofprint" attribute to the `<History>` tag. This will enable the citation to retain the AOP publishing date in PubMed.
4. Add a single `<Replaces>` tag to each `<Article>` to be updated. The `<Replaces>` tags should be placed after the `<Journal>` tags and before the `<ArticleTitle>` tags, and should contain the IdType attribute with one of the following values: "pubmed" (default), "pii", "doi" of the citation to be updated. The PubMed ID (PMID) can be located in the Loader Report or by searching for the article in PubMed.
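The steps above can be sketched in Python with the standard library. This is only an illustration, not the code eLife runs: the function name `make_replacement`, the sample record and the DOI value are hypothetical, and the placement of `History` relative to other elements in a full record is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ahead-of-print record for illustration only.
AOP_XML = """<Article>
  <Journal>
    <JournalTitle>eLife</JournalTitle>
    <PubDate PubStatus="aheadofprint">
      <Year>2013</Year><Month>10</Month><Day>8</Day>
    </PubDate>
  </Journal>
  <ArticleTitle>Example title</ArticleTitle>
</Article>"""


def make_replacement(article, final_date, pub_status, id_type, id_value):
    """Turn an ahead-of-print Article element into a replacement record, in place."""
    journal = article.find("Journal")
    aop_date = journal.find("PubDate[@PubStatus='aheadofprint']")

    # "Move" the ahead-of-print date into History so it is retained.
    history = article.find("History")
    if history is None:
        history = ET.SubElement(article, "History")
    if aop_date is not None:
        position = list(journal).index(aop_date)
        journal.remove(aop_date)
        history.append(aop_date)
    else:
        position = len(journal)

    # Add the final publication date with the new PubStatus.
    pub_date = ET.Element("PubDate", {"PubStatus": pub_status})
    for tag, value in zip(("Year", "Month", "Day"), final_date):
        ET.SubElement(pub_date, tag).text = str(value)
    journal.insert(position, pub_date)

    # Add a single Replaces element directly after Journal.
    replaces = ET.Element("Replaces", {"IdType": id_type})
    replaces.text = id_value
    article.insert(list(article).index(journal) + 1, replaces)
    return article


article = make_replacement(
    ET.fromstring(AOP_XML),
    final_date=(2013, 10, 8),      # same date as the AOP date, as eLife does
    pub_status="epublish",
    id_type="doi",
    id_value="10.0000/example",    # placeholder identifier, not a real DOI
)
print(ET.tostring(article, encoding="unicode"))
```

Passing the identifier type explicitly keeps the choice between "pubmed", "pii" and "doi" visible at the call site.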

fredolin73 commented 7 years ago

Hi,

What happens if the epublish date and the aheadofprint date are the same? I saw this in some cases here but do not see the sense of it. It seems to remove the "Epub" date from the citation display on PubMed.

Melissa37 commented 7 years ago

Hi there

At eLife, the epublish date and the aheadofprint date are the same: we do not update the date, just the status of the article. Other publishers may update the publication date, so when they send their replacement XML file to PubMed they change the publication date.

I hope this helps? Melissa

fredolin73 commented 7 years ago

Hello :-) Yes, that's interesting, thanks for the info! I saw an XML sheet here, posted by you. It has <PubDate PubStatus="epublish"> <Year>2013</Year> <Month>10</Month> <Day>8</Day>

and <History> <PubDate PubStatus="aheadofprint"> <Year>2013</Year> <Month>10</Month> <Day>8</Day> </PubDate> </History>

here is the link: https://github.com/elifesciences/elife-pubmed-feed/edit/master/elife-2015-10-02-114700.xml

It seems that using a PubStatus both within the PubDate and within the History cancels the latter out. I wonder why, because if the upper PubDate states ppublish, both dates appear!
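To find deposits where this same-date situation occurs, a small check along these lines may help. It is only a sketch: the helper names are hypothetical, the sample record is trimmed to the two dates quoted above, and it says nothing about how PubMed itself treats such records.

```python
import xml.etree.ElementTree as ET

# Trimmed sample built from the two date blocks quoted above.
SAMPLE_XML = """<Article>
  <Journal>
    <PubDate PubStatus="epublish">
      <Year>2013</Year><Month>10</Month><Day>8</Day>
    </PubDate>
  </Journal>
  <History>
    <PubDate PubStatus="aheadofprint">
      <Year>2013</Year><Month>10</Month><Day>8</Day>
    </PubDate>
  </History>
</Article>"""


def date_tuple(pub_date):
    """Return (year, month, day) as strings for a PubDate element."""
    return tuple(pub_date.findtext(tag) for tag in ("Year", "Month", "Day"))


def same_epub_and_aop_date(article):
    """True/False if both dates exist, None if either one is missing."""
    epub = article.find("Journal/PubDate[@PubStatus='epublish']")
    aop = article.find("History/PubDate[@PubStatus='aheadofprint']")
    if epub is None or aop is None:
        return None
    return date_tuple(epub) == date_tuple(aop)


result = same_epub_and_aop_date(ET.fromstring(SAMPLE_XML))
print(result)
```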

And maybe you have an idea how to integrate "eCollection" like PLOS does it? I can find no hint in the PubMed XML help. https://www.ncbi.nlm.nih.gov/pubmed/28235095 Thanks a lot, Guido

gnott commented 7 years ago

Looking at some samples of eLife's pubmed deposits, I don't think we include the "aheadofprint" in the history section on any deposits. This seems to have worked fine, I guess because our "epublish" and "aheadofprint" dates are the same. At least this omission of "aheadofprint" date in the history dates is not too important (@Melissa37 maybe those should be there and I missed that part).

We aren't using ecollection, but I did a quick search. At https://www.ncbi.nlm.nih.gov/books/NBK3828/#publisherhelp.How_do_I_use_the__History ("How do I use the `<History>` tag?") it looks like there is a history date type for ecollection:

<PubDate PubStatus = "ecollection">

PLOS may be using that in the example you mentioned?
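If that reading of the help page is right, a history date with this status could be generated like any other history date. The sketch below is an untested assumption, not something eLife deposits: the helper `add_history_date` is hypothetical and the date is made up.

```python
import xml.etree.ElementTree as ET


def add_history_date(history, pub_status, year, month, day):
    """Append a PubDate with the given PubStatus to a History element."""
    pub_date = ET.SubElement(history, "PubDate", {"PubStatus": pub_status})
    ET.SubElement(pub_date, "Year").text = str(year)
    ET.SubElement(pub_date, "Month").text = str(month)
    ET.SubElement(pub_date, "Day").text = str(day)
    return pub_date


history = ET.Element("History")
add_history_date(history, "ecollection", 2017, 1, 1)  # made-up date
print(ET.tostring(history, encoding="unicode"))
```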

fredolin73 commented 7 years ago

But in the posted code you did use "aheadofprint" within "history". At least here: https://github.com/elifesciences/elife-pubmed-feed/edit/master/elife-2015-10-02-114700.xml But even if it is present, it will be overwritten by the epublish pubdate within the upper section within the tag, as mentioned above.

Oh, I did not see that "ecollection" part in the help section. Yes, maybe PLOS uses it within the "history" tag, but this seems a somewhat strange place for that publication format.

gnott commented 7 years ago

The file https://github.com/elifesciences/elife-pubmed-feed/edit/master/elife-2015-10-02-114700.xml was a specification that was probably to be followed.

It could be that this was implemented in the code, specifically at this line: https://github.com/elifesciences/elife-poa-xml-generation/blob/develop/generatePubMedXml.py#L347

I was looking through some of the automated test cases to see if I could find a history date with "aheadofprint" in them, and I didn't find one. When it is running in our production environment, given the correct data, it probably does add an "aheadofprint" history date for VoR articles that were PoA'ed in the past.

fredolin73 commented 7 years ago

@Melissa37, maybe you know something about the pubstatus=ecollection within the history tag? Is this the "official" way to create the ecollection entry?

@all: maybe someone can answer another question for me. When you send the XML files to PMC in order to deposit the full articles there, then, as far as I know, PMC will submit the citations to PubMed. When submitting to PMC, the tags used are different from the tags and elements for submission to PubMed. I guess there are at least three different XML styles in use, and these must somehow be translated into each other:

1. submissions to PMC (NLM Journal Publishing DTD / NISO JATS Journal Publishing DTD)
2. submissions to PubMed (PubMed DTD 2.8)
3. PubMed citation display (NLMMedlineCitationSet DTD)

I wonder how the pubdates for an XML submission to PMC, with the elements shown here: https://www.ncbi.nlm.nih.gov/pmc/pmcdoc/tagging-guidelines/article/tags.html#el-pubdate will be transformed and interpreted into the PubMed/Medline citation display elements explained here: https://www.nlm.nih.gov/bsd/licensee/elements_article_source.html E.g. how will pub-type="epreprint" correspond to the NLMMedlineCitationSet DTD?

As you can imagine, I am not a professional in this; nevertheless I am very interested in how this work is done and how the different pubdate elements relate to each other, or simply, how PubMed manages to transform them. Maybe one of you has an idea regarding that. Cheers, Guido
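As a starting point for comparing the formats, the publication dates in a JATS file can at least be listed with their `pub-type` values. This sketch makes no claim about how PMC or PubMed map them onwards. The sample XML is made up, the function name is hypothetical, and only the `pub-type` attribute is read (newer JATS versions may use other attributes).

```python
import xml.etree.ElementTree as ET

# Made-up, minimal JATS-style fragment for illustration only.
JATS_XML = """<article>
  <front>
    <article-meta>
      <pub-date pub-type="epub">
        <day>27</day><month>02</month><year>2017</year>
      </pub-date>
      <pub-date pub-type="collection">
        <year>2017</year>
      </pub-date>
    </article-meta>
  </front>
</article>"""


def list_pub_dates(article):
    """Collect every pub-date with its pub-type and date parts (None if absent)."""
    dates = []
    for pub_date in article.iter("pub-date"):
        dates.append({
            "pub_type": pub_date.get("pub-type"),
            "year": pub_date.findtext("year"),
            "month": pub_date.findtext("month"),
            "day": pub_date.findtext("day"),
        })
    return dates


pub_dates = list_pub_dates(ET.fromstring(JATS_XML))
for entry in pub_dates:
    print(entry)
```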

Melissa37 commented 7 years ago

Hi there

I just looked at PLOS XML and it uses

It is not within the history section. You can see the XML by clicking on an article and changing the format from Abstract to XML.

![screen shot 2017-02-28 at 08 55 01](https://cloud.githubusercontent.com/assets/6051876/23398238/a65bc4ba-fd93-11e6-96a5-febe3194de1b.png)

I am really sorry, but I don't know about ecollections; we don't use them.

---

> When you send the XML files to PMC in order to deposit the full articles there, then, as far as I know PMC will submit the citations to Pubmed.

@fredolin73 yes, this can be a workflow. We send our content to PubMed ourselves, though, not via PMC.

> When submitting to PMC, the tags that are used are different from the tags and elements for the Pubmed submission to Pubmed. I guess there are at least three different XML styles in use and these must somehow be interpreted to each other.

@fredolin73 that is correct: PMC uses the JATS DTD (but can convert from a number of DTDs), and the PubMed DTD is different. The best thing is for you to contact PubMed and PMC directly, as they will be able to answer your questions properly without an element of guesswork!

PMC: pmc@ncbi.nlm.nih.gov
PubMed: Data Provider Support Team [publisher@ncbi.nlm.nih.gov]

Good luck! Melissa

fredolin73 commented 7 years ago

Hi Melissa,

I understand that you are not the right address for my questions, no problem :-)

But allow me one more comment: the XML within the PubMed entry is NOT the same XML that is submitted by the publisher to PubMed. The XML used for the citation shown is the NLMMedlineCitationSet DTD, with the PubModels Print, Print-Electronic, Electronic, Electronic-Print and Electronic-eCollection. See here: https://www.nlm.nih.gov/bsd/licensee/elements_article_source.html

So the tags used for the publisher's submission with the DTD PubMed 2.6//EN, tags like "epublish", "ppublish" etc., have to be translated into these. So one cannot directly see what and how the publisher has submitted as their publication model by looking at the XML in the PubMed entry. For eLife entries, for example, the XML shown within PubMed is <Article PubModel="Electronic"> <Journal> <ISSN IssnType="Electronic">2050-084X</ISSN> <JournalIssue CitedMedium="Internet"> <Volume>6</Volume> <PubDate> <Year>2017</Year> <Month>Feb</Month> <Day>27</Day> </PubDate>

and

<ArticleDate DateType="Electronic"> <Year>2017</Year> <Month>02</Month> <Day>27</Day> </ArticleDate>

and not <PubDate PubStatus="epublish"> etc like it is submitted by you.

You see what I mean? One cannot conclude from the given PubMed XML how the pubdates were submitted by the publisher, in this case PLOS. Or am I wrong about this!?

Best wishes Guido
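For comparison, the citation-display fields quoted above can be read back with a few lines of Python. This is a sketch only: the fragment is the one from the comment with closing tags added so that it parses, and `summarise_citation` is a hypothetical helper name. It shows what PubMed displays, not what the publisher submitted.

```python
import xml.etree.ElementTree as ET

# Fragment quoted in the comment above, closed off so that it is well-formed.
CITATION_XML = """<Article PubModel="Electronic">
  <Journal>
    <ISSN IssnType="Electronic">2050-084X</ISSN>
    <JournalIssue CitedMedium="Internet">
      <Volume>6</Volume>
      <PubDate><Year>2017</Year><Month>Feb</Month><Day>27</Day></PubDate>
    </JournalIssue>
  </Journal>
  <ArticleDate DateType="Electronic">
    <Year>2017</Year><Month>02</Month><Day>27</Day>
  </ArticleDate>
</Article>"""


def date_parts(element):
    """Return (year, month, day) strings, or None if the element is missing."""
    if element is None:
        return None
    return tuple(element.findtext(tag) for tag in ("Year", "Month", "Day"))


def summarise_citation(article):
    """Pull out the publication model and the two dates of a display record."""
    return {
        "pub_model": article.get("PubModel"),
        "issue_pub_date": date_parts(article.find("Journal/JournalIssue/PubDate")),
        "article_date": date_parts(article.find("ArticleDate")),
    }


summary = summarise_citation(ET.fromstring(CITATION_XML))
print(summary)
```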

Melissa37 commented 7 years ago

Hi Guido

You are right, sorry I was being lazy! It's really hard to know what goes on in the conversion machinery. Maybe your best bet would be to ask PLOS what they send to PubMed and whether they have any comment?

I am sorry to not be more helpful! M