elifesciences / elife-pubmed-feed

code to support uploading feeds to pubmed for POA articles and VOR articles

Pubmed and PoA v2 #23

Closed Melissa37 closed 9 years ago

Melissa37 commented 10 years ago

Graham: In thinking about Pubmed deposits for a PoA v2, one problem is the PoA XML does not specify a publication date.

So, let's say you published a PoA today, Sept 17th, and then supplied a v2 on Sept 18th: the publication date at PubMed would change from Sept 17th to Sept 18th. Maybe this doesn't matter, but I should check what date the HW site has and also which date TNQ uses as its publication date.

One way to solve this may be to query the Pubmed system, as we discussed, to see if the record already exists and use that publication date.

Any thoughts?

Melissa37 commented 10 years ago

The pub date should not change even on PubMed. When we publish a PoA then the pub date is set. If we do a V2, the other metadata might change, but not the date.

The data that could change is: ArticleTitle; AuthorList (Author: FirstName, LastName, Suffix, Affiliation, Identifier Source="ORCID"); GroupList (Group: GroupName, IndividualName: FirstName, LastName); Abstract; ObjectList (Object Type="keyword", Param Name="value", e.g. "genomic variation").

The PubDate with PubStatus="aheadofprint" should remain as the first PoA date.

That's my understanding. M

Melissa37 commented 10 years ago

Just checked the PubMed info and it seems to indicate there is one PoA date until the VoR is published, so we should not change the PoA date or add a new PoA date. We could test this by trying to validate a file with 2 aheadofprint dates in it?

aheadofprint - electronic-format without final citation information; to be followed later by a version with final citation information. With this value the PubDate must contain a Year, Month and Day tag that gives the exact date the article was first made publicly available. This PubStatus value plays an important part in the process of submitting Ahead of Print citations.
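For illustration, here is a minimal Python sketch of building such a PubDate element with the standard library. The function name `build_aheadofprint_pubdate` is hypothetical and not taken from the elife-pubmed-feed code; the element and attribute names follow the PubMed description quoted above.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Month names spelled out explicitly so the output does not depend on locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def build_aheadofprint_pubdate(first_published: date) -> ET.Element:
    """Build a PubDate element carrying the date the PoA first went public.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    pub_date = ET.Element("PubDate", {"PubStatus": "aheadofprint"})
    ET.SubElement(pub_date, "Year").text = str(first_published.year)
    ET.SubElement(pub_date, "Month").text = MONTHS[first_published.month - 1]
    ET.SubElement(pub_date, "Day").text = str(first_published.day)
    return pub_date


element = build_aheadofprint_pubdate(date(2014, 8, 15))
print(ET.tostring(element, encoding="unicode"))
# <PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>15</Day></PubDate>
```

The date passed in would always be the first PoA date, never the date a later version is deposited.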

Melissa37 commented 10 years ago

Just tested it with the following in the Journal section at the start of the file:

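<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>15</Day></PubDate>
<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>17</Day></PubDate>

This does not validate.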

Melissa37 commented 10 years ago

However, it does validate if you put 2014 August 15 in the History and have 2014 August 17 in the Journal section at the top of the file. But this updates the publication date to the new ahead of print date, which is wrong.

So, we should not worry about versioning the PoA content on PubMed by providing multiple dates and should only update article content metadata.
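A local pre-check along these lines could catch a file with more than one aheadofprint date before it is sent. This is only a sketch: it is not the PubMed validator, the function name `count_aheadofprint_dates` is hypothetical, and it assumes it is handed a single Article element whose Journal child holds the PubDate entries.

```python
import xml.etree.ElementTree as ET


def count_aheadofprint_dates(article_xml: str) -> int:
    """Count PubDate elements with PubStatus="aheadofprint" in the Journal section.

    Assumes article_xml is one Article element (hypothetical input shape).
    """
    article = ET.fromstring(article_xml)
    journal = article.find("Journal")
    if journal is None:
        return 0
    return sum(
        1
        for pub_date in journal.findall("PubDate")
        if pub_date.get("PubStatus") == "aheadofprint"
    )


# Sample input mirroring the test described above: two aheadofprint dates.
TWO_DATES = (
    "<Article><Journal>"
    '<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>15</Day></PubDate>'
    '<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>17</Day></PubDate>'
    "</Journal></Article>"
)

if count_aheadofprint_dates(TWO_DATES) > 1:
    print("more than one aheadofprint date, do not deposit")
```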

Melissa37 commented 10 years ago

Does that make sense and answer the question?

Cheers M

gnott commented 10 years ago

I think it makes sense that the original date remains the same.

The next decision is where we look for this date. There are a few options, either

Melissa37 commented 10 years ago

Ah, so the problem is that for PoA the date is not in the file, so if we do a version 2 you cannot base the date on when the v2 is published, as this will be wrong.

Could you review the previous PubMed file for that article and pull the unchanging XML from there (i.e. the date)?
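As a rough sketch of that idea, the aheadofprint PubDate could be copied out of the previous deposit and dropped into the new one, leaving the rest of the new metadata alone. The function name `carry_over_pubdate` is hypothetical, and the sketch assumes both inputs are single Article elements with a Journal child.

```python
import copy
import xml.etree.ElementTree as ET

# XPath for the ahead of print date inside an Article element.
AOP_PATH = "PubDate[@PubStatus='aheadofprint']"


def carry_over_pubdate(previous_xml: str, new_xml: str) -> str:
    """Return new_xml with its aheadofprint PubDate taken from previous_xml."""
    previous = ET.fromstring(previous_xml)
    new = ET.fromstring(new_xml)

    old_date = previous.find("Journal/" + AOP_PATH)
    if old_date is None:
        raise ValueError("previous deposit has no aheadofprint PubDate")
    journal = new.find("Journal")
    if journal is None:
        raise ValueError("new deposit has no Journal section")

    replacement = copy.deepcopy(old_date)
    stale = journal.find(AOP_PATH)
    if stale is None:
        journal.append(replacement)
    else:
        # Keep the element in the same position so child order is unchanged.
        position = list(journal).index(stale)
        journal.remove(stale)
        journal.insert(position, replacement)
    return ET.tostring(new, encoding="unicode")


PREVIOUS = (
    "<Article><Journal>"
    '<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>15</Day></PubDate>'
    "</Journal><ArticleTitle>Original title</ArticleTitle></Article>"
)
NEW = (
    "<Article><Journal>"
    '<PubDate PubStatus="aheadofprint"><Year>2014</Year><Month>August</Month><Day>17</Day></PubDate>'
    "</Journal><ArticleTitle>Revised title</ArticleTitle></Article>"
)

print(carry_over_pubdate(PREVIOUS, NEW))
```

In this example the output keeps the revised title from the v2 file but carries the August 15 date from the first deposit.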

gnott commented 9 years ago

The method chosen to determine the published date of a PoA article at this time is to use the PubMed deposit history. This is stored in the S3 bucket, in the folders used for the pubmed workflow. The workflow checks each hour for newly published articles, so the history has an accurate date for each article since about September 2014.
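To illustrate the lookup, here is a minimal sketch that picks the earliest deposit per article from a list of history records. The record shape (article id plus deposit timestamp), the example ids, and the function name `first_published_dates` are all assumptions for this sketch; reading the records out of the S3 folders is left out, since the actual folder layout is not described here.

```python
from datetime import date, datetime
from typing import Dict, Iterable, Tuple


def first_published_dates(
    deposits: Iterable[Tuple[str, datetime]]
) -> Dict[str, date]:
    """Map each article id to the date of its earliest recorded deposit.

    deposits: (article_id, deposited_at) pairs, in any order (hypothetical shape).
    """
    earliest: Dict[str, date] = {}
    for article_id, deposited_at in deposits:
        deposit_day = deposited_at.date()
        if article_id not in earliest or deposit_day < earliest[article_id]:
            earliest[article_id] = deposit_day
    return earliest


# Made-up history: article "00001" was deposited twice (PoA, then a v2).
history = [
    ("00001", datetime(2014, 9, 18, 10, 0)),
    ("00001", datetime(2014, 9, 17, 14, 0)),
    ("00002", datetime(2014, 10, 2, 9, 0)),
]

dates = first_published_dates(history)
print(dates["00001"])  # 2014-09-17
```

Taking the earliest deposit means a later v2 deposit never moves the published date.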