elifesciences / package-ejp-raw-output-zip

Transform article raw zip files from EJP to a more consistent output.
MIT License

Question about <related-article> tag #27

Open gnott opened 5 years ago

gnott commented 5 years ago

I'm looking for potential sample data, and in a raw manifest XML file for what I think is an Insight article, I found this tag:

<related-article ext-link-type="doi" id="ra1" related-article-type="research-article" xlink:href="26-03-2018-RA-eLife-37001"/>

I haven't found any instructions for this one yet, @Melissa37. I assume we can convert the xlink:href to a DOI value and use that in the final output?
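A minimal sketch of the conversion being asked about, in Python. It assumes the xlink:href always follows the pattern seen in the sample (date, a type code such as RA, then eLife and the manuscript number) and that the DOI is the prefix 10.7554/eLife. followed by the number zero-padded to five digits, as in 10.7554/eLife.00107. The function name and the regular expression are illustrative, not taken from the repository.

```python
import re

# Assumed href layout: DD-MM-YYYY-<TYPE>-eLife-<manuscript number>
EJP_HREF = re.compile(r"^\d{2}-\d{2}-\d{4}-[A-Z]+-eLife-(\d+)$")


def href_to_doi(href):
    """Return an eLife DOI for an EJP-style href, or None if it does not match."""
    match = EJP_HREF.match(href.strip())
    if match is None:
        return None
    # Zero-pad the manuscript number to five digits (assumption based on 10.7554/eLife.00107)
    return "10.7554/eLife.%05d" % int(match.group(1))


print(href_to_doi("26-03-2018-RA-eLife-37001"))
```

Returning None for anything that does not match keeps unexpected values visible instead of silently producing a wrong DOI.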

Melissa37 commented 5 years ago

Thanks for spotting that!

<related-article ext-link-type="doi" id="ra1" related-article-type="commentary" xlink:href="10.7554/eLife.00107"/>

The related-article-type changes depending on what the article is and what it is linking to. See examples below:

Research to Research: related-article-type="article-reference"
Research to Insight: related-article-type="commentary"
Insight to Research: related-article-type="commentary-article"
Correction to Research: related-article-type="corrected-article"

The related article tagging should only be retained if the article being linked to is in production or published. Otherwise, no related article tagging should be retained.
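These rules could be expressed as a small lookup, sketched here in Python. The kind labels ("research", "insight", "correction"), the function name, and the boolean flag are hypothetical; only the four related-article-type values and the retention rule come from the list above.

```python
# (linking article kind, linked article kind) -> related-article-type
RELATED_ARTICLE_TYPES = {
    ("research", "research"): "article-reference",
    ("research", "insight"): "commentary",
    ("insight", "research"): "commentary-article",
    ("correction", "research"): "corrected-article",
}


def related_article_type(source_kind, target_kind, target_in_production_or_published):
    """Return the type to output, or None when the tag should not be retained."""
    if not target_in_production_or_published:
        # Linked article is neither in production nor published: drop the tag
        return None
    # Combinations not listed above also return None (an assumption of this sketch)
    return RELATED_ARTICLE_TYPES.get((source_kind, target_kind))


print(related_article_type("insight", "research", True))
```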

I will need to ask @JGilbert-eLife to search EJP via SQL for examples of these types of relationships, to see what EJP outputs and whether anything will need to be changed.

I know articles are sometimes linked to themselves, so it would be good to have logic that deletes these links (does not send them to the vendor). Sometimes the same link is also repeated, and those repeats should be removed too.
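A rough Python sketch of that clean-up, assuming the links have already been converted to DOI strings. The function name is illustrative. The comparison is case-insensitive because DOIs are case-insensitive.

```python
def clean_related_dois(article_doi, related_dois):
    """Drop self-links and repeated links, keeping the first occurrence in order."""
    own = article_doi.lower()
    seen = set()
    cleaned = []
    for doi in related_dois:
        key = doi.lower()
        if key == own or key in seen:
            # Skip links to the article itself and repeats of an earlier link
            continue
        seen.add(key)
        cleaned.append(doi)
    return cleaned


print(clean_related_dois(
    "10.7554/eLife.38519",
    ["10.7554/eLife.38472", "10.7554/eLife.38519", "10.7554/eLife.38472"],
))
```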

We'll write a proper ticket once @JGilbert-eLife is back and can shed some more light on this.

Thanks!

JGilbert-eLife commented 5 years ago

Research to Research: related-article-type="article-reference"

eJP output for research article 38519:

<related-article ext-link-type="doi" id="ra1" related-article-type="research-article" xlink:href="17-05-2018-ISRA-eLife-38472" />
<related-article ext-link-type="doi" id="ra2" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />
<related-article ext-link-type="doi" id="ra3" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />
<related-article ext-link-type="doi" id="ra4" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />

This was a case where article 38519 was linked to both initial and full versions of article 38472. As you can see, four links were output containing the full article numbers, sans the R1 and R2 that were affixed to the numbers of the first and second revisions. Getting multiple related research article links is highly likely since Editorial have to link multiple versions of articles to keep them joined up during the review process (and we never unlink them).
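To illustrate how the four links for article 38519 collapse, here is a small Python sketch using the standard library XML parser. The wrapping root element and its xlink namespace declaration are added only so the fragment parses on its own; the real manifest structure may differ. The helper names are illustrative, and taking the text after the last hyphen as the manuscript number is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# The four tags from the EJP output, wrapped in a hypothetical root element
SAMPLE = """<root xmlns:xlink="http://www.w3.org/1999/xlink">
<related-article ext-link-type="doi" id="ra1" related-article-type="research-article" xlink:href="17-05-2018-ISRA-eLife-38472" />
<related-article ext-link-type="doi" id="ra2" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />
<related-article ext-link-type="doi" id="ra3" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />
<related-article ext-link-type="doi" id="ra4" related-article-type="research-article" xlink:href="17-05-2018-RA-eLife-38472" />
</root>"""


def unique_hrefs(xml_text):
    """Collect xlink:href values from related-article tags, without repeats."""
    root = ET.fromstring(xml_text)
    hrefs = []
    for tag in root.iter("related-article"):
        href = tag.get(XLINK_HREF)
        if href is not None and href not in hrefs:
            hrefs.append(href)
    return hrefs


def unique_manuscript_numbers(hrefs):
    """Reduce hrefs to distinct manuscript numbers (the text after the last hyphen)."""
    numbers = []
    for href in hrefs:
        number = href.rsplit("-", 1)[-1]
        if number not in numbers:
            numbers.append(number)
    return numbers


hrefs = unique_hrefs(SAMPLE)
print(hrefs)
print(unique_manuscript_numbers(hrefs))
```

Removing exact repeats leaves the ISRA and RA hrefs, and both point to manuscript 38472, so only one related article link would remain for this example.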

Research to Insight: related-article-type="commentary"

eJP output in XML for research article 38841:

<related-article ext-link-type="doi" id="ra1" related-article-type="commentary" xlink:href="31-08-2018-I-eLife-41633" />

Insight to Research: related-article-type="commentary-article"

eJP output in XML for Insight 42507:

<related-article ext-link-type="doi" id="ra1" related-article-type="research-article" xlink:href="29-04-2018-RA-eLife-37960" />

Correction to Research: related-article-type="corrected-article"

eJP output in XML for Correction notice 43237:

<related-article ext-link-type="doi" id="ra1" related-article-type="research-article" xlink:href="19-09-2017-RA-eLife-32143" />

I think that covers everything you asked for - am I missing anything?