Open Daniel-Mietchen opened 2 years ago
Here is another example (see via https://www.wikidata.org/w/index.php?title=Q98164105&oldid=1369315309 ): In https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:32727882%20AND%20SRC:MED&resulttype=core&format=json , the ORCID https://orcid.org/0000-0001-8412-2889 of Alfredo Molinolo is associated with Napoleone Ferrara, who is listed just after Molinolo in the author list
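A minimal sketch of how such an off-by-one shift could be flagged automatically. It assumes we already have the name from each ORCID profile (here passed in as a plain dict). The function names, the `authors` structure and the truncated two-author list are hypothetical illustrations, not the Europe PMC schema.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and collapse whitespace for loose comparison."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def find_shifted_orcids(authors, orcid_owner_names):
    """Find ORCIDs whose profile name matches a neighbouring author.

    Returns tuples of (position, listed_name, orcid, neighbour_name).
    """
    hits = []
    for i, author in enumerate(authors):
        orcid = author.get("orcid")
        if not orcid or orcid not in orcid_owner_names:
            continue
        owner = normalize(orcid_owner_names[orcid])
        if normalize(author["name"]) == owner:
            continue  # ORCID sits on the right author
        # Check the authors directly before and after this position.
        for j in (i - 1, i + 1):
            if 0 <= j < len(authors) and normalize(authors[j]["name"]) == owner:
                hits.append((i, author["name"], orcid, authors[j]["name"]))
    return hits


# Hypothetical, truncated author list modelled on the case above.
authors = [
    {"name": "Alfredo Molinolo", "orcid": None},
    {"name": "Napoleone Ferrara", "orcid": "0000-0001-8412-2889"},
]
orcid_owner_names = {"0000-0001-8412-2889": "Alfredo Molinolo"}

shifted = find_shifted_orcids(authors, orcid_owner_names)
print(shifted)
```

Exact name matching after normalization is a deliberately strict choice here; real data would probably need fuzzier matching for initials and name variants.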
Here is another interesting case: a published correction due to a valid ORCID having been incorrectly assigned to an author in the original publication.
Wikidata has only correct information in this case:
Nevertheless, there should probably be a deprecation statement about the wrong ORCID.
Side note: As per
NLM does accept author identifiers when they are supplied by the publisher with the citation data.
There is a section "2.2.2 Wrongly-attributed ORCID iDs" in the paper "We Can Make a Better Use of ORCID: Five Observed Misapplications", which also touches upon downstream uses as in "A botanical demonstration of the potential of linking data using unique identifiers for people".
Here is another paper about data quality issues in ORCID, including use of incorrect identifiers: "Abuse of ORCID’s weaknesses by authors who use paper mills".
Another published correction due to an incorrect ORCID: https://doi.org/10.1261/rna.078885.121 .
Here is another case: https://orcid.org/0000-0001-6026-353X is for "Chung-Hsing Lin" but was associated in Wikidata with "Shih-Ann Chen":
Not sure what the precise reason is here, but it could well be one of those metadata mixups of the kind presented above.
Another case: https://orcid.org/0000-0003-0545-2156 is for "YaoTing Chang", yet Wikidata had it for "Shih-Lin Chang".
Next example: https://orcid.org/0000-0002-8547-2992 is for "Eline Vegter" but in https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:32002631%20AND%20SRC:MED&resulttype=core&format=json , it is associated with "Adrian A Voors", which triggered the creation of Adriaan A Voors (Q89459310) with that ORCID ID.
Another example, seen via https://www.wikidata.org/wiki/Q90224273 and https://www.wikidata.org/w/index.php?title=Q63951761&type=revision&diff=1736391410&oldid=1482385309&diffmode=source :
Interestingly, the two ORCIDs differ in two places, each a transposition of two adjacent digits: 0000-0001-9606-4671 vs. 0000-0001-6906-4617 (9606 vs. 6906 and 4671 vs. 4617).
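As a side check, here is a small sketch that validates both identifiers against the ORCID check digit (ISO 7064 MOD 11-2) and lists the positions at which they differ. The function names are my own. Both iDs pass the checksum, so a checksum test alone would not have caught this particular mixup.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check the length, the character set and the final check character."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15]


def differing_positions(a, b):
    """Return 0-based positions (hyphens removed) where two iDs differ."""
    a_compact, b_compact = a.replace("-", ""), b.replace("-", "")
    return [i for i, (x, y) in enumerate(zip(a_compact, b_compact)) if x != y]


first = "0000-0001-9606-4671"
second = "0000-0001-6906-4617"

print(is_valid_orcid(first), is_valid_orcid(second))
print(differing_positions(first, second))
```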
Another example:
https://doi.org/10.1002/JCLA.22235 lists Majid Ghayour Mobarhan with the ORCID 0000-0002-1606-4610
but that ORCID is associated with " samaneh khakpouri":
Another example:
https://api.crossref.org/v1/works/10.1021/ACSSYNBIO.6B00372 lists http://orcid.org/0000-0002-4910-2222 as being associated with Jay D. Keasling
but that ORCID is actually about Jesus F. Barajas:
In contrast, the actual Jay D. Keasling (Q1684333) has the ORCID https://orcid.org/0000-0003-4170-6088 :
Seen via https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=Jay%20D%20Keasling and
https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q87846476 :
Here is another case, with https://api.crossref.org/v1/works/10.1371/JOURNAL.PMED.1002467
{
  "ORCID": "http://orcid.org/0000-0002-7125-9998",
  "authenticated-orcid": true,
  "given": "Sonia",
  "family": "Lewycka",
  "sequence": "additional",
  "affiliation": []
},
listing an author named "Sonia Lewycka" tagged with the ORCID http://orcid.org/0000-0002-7125-9998 , which is associated with someone named "Charles Mwansambo":
The actual paper does not list anyone named "Charles Mwansambo" amongst the authors:
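A rough sketch of the kind of check that would flag this record. It takes Crossref-style author entries (same field names as in the snippet above) and compares the family name against the name on the ORCID profile. The `profile_names` lookup is an assumption: in practice it would have to be filled from ORCID itself. The helper names are hypothetical.

```python
def bare_orcid(value):
    """Strip any URL prefix, e.g. 'http://orcid.org/0000-...' -> '0000-...'."""
    return value.rstrip("/").rsplit("/", 1)[-1]


def crossref_orcid_mismatches(crossref_authors, profile_names):
    """List authors whose family name does not occur in the ORCID profile name."""
    mismatches = []
    for author in crossref_authors:
        if "ORCID" not in author:
            continue
        orcid = bare_orcid(author["ORCID"])
        profile_name = profile_names.get(orcid)
        if profile_name is None:
            continue  # nothing to compare against
        family = author.get("family", "")
        if family and family.casefold() not in profile_name.casefold():
            mismatches.append(
                {
                    "listed_name": f"{author.get('given', '')} {family}".strip(),
                    "orcid": orcid,
                    "profile_name": profile_name,
                    # Kept in the report because the flag does not guarantee correctness.
                    "authenticated": author.get("authenticated-orcid"),
                }
            )
    return mismatches


crossref_authors = [
    {
        "ORCID": "http://orcid.org/0000-0002-7125-9998",
        "authenticated-orcid": True,
        "given": "Sonia",
        "family": "Lewycka",
        "sequence": "additional",
        "affiliation": [],
    }
]
# Assumed lookup of ORCID iD -> name shown on the ORCID profile.
profile_names = {"0000-0002-7125-9998": "Charles Mwansambo"}

mismatches = crossref_orcid_mismatches(crossref_authors, profile_names)
print(mismatches)
```

Note that this record is flagged even though "authenticated-orcid" is true.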
I discovered this via https://author-disambiguator.toolforge.org/names_oauth.php?limit=50&name=Sonia+Lewycka (screenshot):
In there, two candidates are listed for "Sonia Lewycka":
The latter lists two papers, one of which looks suspicious because the author "Sonia Lewycka" appears twice in https://author-disambiguator.toolforge.org/work_item_oauth.php?id=Q46301047 :
It turns out that the author disambiguation was done automatically (by LargeDatasetBot, which used ORCIDs for disambiguation). That workflow was based on Europe PMC data, as per the reference cited for the Sonia Lewycka disambiguation: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:29206833%20AND%20SRC:MED&resulttype=core&format=json .
{
  "fullName": "Lewycka S",
  "firstName": "Sonia",
  "lastName": "Lewycka",
  "initials": "S",
  "authorId": {
    "type": "ORCID",
    "value": "0000-0002-5923-9468"
  },
  "authorAffiliationDetailsList": {
    "authorAffiliation": [
      {
        "affiliation": "Nuffield Department of Medicine, Centre for Tropical Medicine, University of Oxford, Oxford, United Kingdom."
      }
    ]
  }
},
{
  "fullName": "Lewycka S",
  "firstName": "Sonia",
  "lastName": "Lewycka",
  "initials": "S",
  "authorId": {
    "type": "ORCID",
    "value": "0000-0002-7125-9998"
  },
  "authorAffiliationDetailsList": {
    "authorAffiliation": [
      {
        "affiliation": "Nuffield Department of Medicine, Centre for Tropical Medicine, University of Oxford, Oxford, United Kingdom."
      }
    ]
  }
},
Note that the ORCIDs are different: the upper one is correct, the lower one is not. So it seems that Europe PMC did try to sanitize the data but did not succeed in this case.
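Duplicated author entries like the two above can be spotted mechanically. This is a minimal sketch that groups author entries (using the field names from the snippet above) by first and last name and reports names carrying more than one ORCID. The function name is hypothetical, and namesakes on the same paper would of course produce false positives.

```python
from collections import defaultdict


def authors_with_conflicting_orcids(author_list):
    """Map (firstName, lastName) to ORCIDs when a name carries more than one."""
    orcids_by_name = defaultdict(set)
    for author in author_list:
        author_id = author.get("authorId") or {}
        if author_id.get("type") == "ORCID":
            key = (author.get("firstName"), author.get("lastName"))
            orcids_by_name[key].add(author_id["value"])
    return {
        name: sorted(orcids)
        for name, orcids in orcids_by_name.items()
        if len(orcids) > 1
    }


# The two entries from the record above, reduced to the relevant fields.
author_list = [
    {
        "firstName": "Sonia",
        "lastName": "Lewycka",
        "authorId": {"type": "ORCID", "value": "0000-0002-5923-9468"},
    },
    {
        "firstName": "Sonia",
        "lastName": "Lewycka",
        "authorId": {"type": "ORCID", "value": "0000-0002-7125-9998"},
    },
]

conflicts = authors_with_conflicting_orcids(author_list)
print(conflicts)
```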
For the sake of completeness, let's also check the data at the version of record:
So here, the wrong ORCID is present again, and only that.
Another one:
https://doi.org/10.1590/0037-8682-0069-2018 ( https://www.scielo.br/j/rsbmt/a/t9mXhGbTLkYyYgHWHzb7K6m/?lang=en )
has ORCID for Tamires Vital https://orcid.org/0000-0003-3512-7696
associated with Mariana Hecht, as per
https://api.crossref.org/v1/works/10.1590/0037-8682-0069-2018
and
https://orcid.org/0000-0003-3512-7696
For an example where such author misidentification would disrupt workflows, see https://www.wikidata.org/wiki/User:Orcbot .
Another one:
https://api.crossref.org/v1/works/10.1080/23294515.2019.1593257
associates author Kevin P. Weinfurt with the ORCID
https://orcid.org/0000-0002-0046-4353
which is for Flavia Kiweewa Matovu.
For an example where such author misidentification would disrupt workflows, see https://www.wikidata.org/wiki/User:Orcbot .
On that bot's talk page, there is an example for which it is stated that the ORCID in question has been "locked".
Another example case:
https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=H.%20Joosten
https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q93057113
https://author-disambiguator.toolforge.org/work_item_oauth.php?id=Q93057117
https://doi.org/10.1111/plb.13092
The most recent examples all had "authenticated-orcid" set to false, but mishaps happen with true ones too, e.g. as per the entry of Jan 3, 2023.
Another one, again with "authenticated-orcid" set to false:
https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q57912512
https://api.crossref.org/v1/works/10.1016/J.ECOSER.2019.100896
Here is a case where the preprint has the wrong ORCID for one of the authors (Jonathan M. Levine), presumably because of an error in the manuscript submission process to bioRxiv.
The ORCID given there (see bottom left in the above screenshot) is https://orcid.org/0000-0003-1510-6062, which has the following content:
The information has thus propagated, via Crossref, to the ORCID profile of the person associated with the ORCID iD stated in bioRxiv.
The correct ORCID for the author of that manuscript would have been
https://orcid.org/0000-0003-2857-7904, which does not list that preprint (nor the journal version) and currently looks as follows:
Interestingly, the error did not propagate to the journal version of the manuscript, presumably because the journal does not require authors to submit their ORCIDs.
In any case, the authorship assignment has been corrected at Wikidata: https://www.wikidata.org/w/index.php?title=Q56979617&diff=1939235583&oldid=1856036298&diffmode=source.
Adding screenshots of the Wikidata curation pages that triggered the discovery of this incongruency and the Wikidata edit to fix it:
Another curious example: https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=Senjie%20Lin
That comment "researcher 0000-0002-0937-6069 Feng Yang?" indicates doubt as to whether the ORCID https://orcid.org/0000-0002-0937-6069 for someone named Feng Yang should be used on an item for someone named Senjie Lin.
The ORCID profile for that Feng Yang is suspicious in that all the papers are listed with an author string Senjie Lin but only one with Feng Yang:
At Crossref, that paper only lists one ORCID, and that is the correct one https://orcid.org/0000-0001-8831-6111 for Senjie Lin.
The mixup is in the paper
https://api.crossref.org/v1/works/10.1111/jpy.13031
where Senjie Lin is associated with the ORCID https://orcid.org/0000-0002-0937-6069 of Feng Yang:
I have not fixed that Senjie Lin / Feng Yang mixup in Wikidata yet, as I will try to do this via nanopublications.
Did not see a quick way to do this via nanopublications, so only fixed Senjie Lin / Feng Yang on Wikidata. This turned out a bit more complicated, since there was another item for Senjie Lin, which appeared to be about the same person, so I merged the two entries.
As a result, we now have Q95840069 for Feng Yang, as per
https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q95840069
and Q36678842 for Senjie Lin, as per
https://scholia.toolforge.org/author/Q36678842
Next one: mixup between Armando Pacheco and Washington Ipenza, as per
https://author-disambiguator.toolforge.org/names_oauth.php?name=Armando%20J.%20Cabrera%20Pacheco
and
https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q5666452
I could not find the source of the problem, only the edits that brought it to Wikidata:
Fixed it with these two edits:
Europe PMC pulls in ORCIDs from PubMed, which are provided by the publisher. This traces back to the publisher site.
Another example:
This is reflected in https://api.crossref.org/v1/works/https://doi.org/10.2337/dc20-0028 :
Seen via https://wikidata-game.toolforge.org/distributed/#game=88 for https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q90942893 and https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q97527382 :
Another example, which I became aware of via this message:
The same error appears in Europe PMC, from where it had found its way into Wikidata.
I made seven edits to fix this on Wikidata:
In this version, the Wikidata item Repeated Iron-Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles. (Q47638090) corresponding to the paper Repeated Iron–Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles has an author (P50) statement pointing to Q90909359 (which back then was labeled "Kent E Pinkerton") with a "series ordinal (P1545)" of 8. That author item had an ORCID (P496) statement pointing to https://orcid.org/0000-0001-6868-7572, a profile which is largely empty but features the name "Emilia Laing" and no Kent Pinkerton.
Still in that same version of Repeated Iron-Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles. (Q47638090), there is an author name string (P2093) statement "Emilia A Laing" with series ordinal 2.
The statements on Q47638090 point to the Europe PMC API call https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:28914166%20AND%20SRC:MED&resulttype=core&format=json, which has the above ORCID (of Emilia Laing) associated with Kent Pinkerton (in position 8) and no ORCID for the second author (Emilia Laing).
A similar situation existed in Q89716066, whose Europe PMC entry also attaches Emilia Laing's ORCID to Kent Pinkerton's author position and no ORCID to Emilia Laing's position: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:31645209%20AND%20SRC:MED&resulttype=core&format=json