br2s / bug-reports-to-science

An attempt to keep track of how science could be improved

Author associated with someone else's ORCID #16

Open Daniel-Mietchen opened 2 years ago

Daniel-Mietchen commented 2 years ago

In this version, the Wikidata item Repeated Iron-Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles. (Q47638090), corresponding to the paper Repeated Iron–Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles, has an author (P50) statement with a "series ordinal (P1545)" of 8 that points to Q90909359. Back then, Q90909359 was labeled "Kent E Pinkerton" and had an ORCID (P496) statement pointing to https://orcid.org/0000-0001-6868-7572. That ORCID profile is largely empty but features the name "Emilia Laing" and no Kent Pinkerton.

Still in that same version of Repeated Iron-Soot Exposure and Nose-to-brain Transport of Inhaled Ultrafine Particles. (Q47638090), there is an author name string (P2093) statement "Emilia A Laing" with series ordinal 2.

The statements on Q47638090 point to the Europe PMC API call https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:28914166%20AND%20SRC:MED&resulttype=core&format=json, which has the above ORCID (of Emilia Laing) associated with Kent Pinkerton (in position 8) and no ORCID for the second author (Emilia Laing).

A similar situation occurred in Q89716066, whose Europe PMC entry also attaches Emilia Laing's ORCID to Kent Pinkerton's author position and no ORCID to Emilia Laing's position: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:31645209%20AND%20SRC:MED&resulttype=core&format=json .
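
To make checks like the two above repeatable, here is a minimal sketch that lists each author position with its ORCID from an already-parsed Europe PMC `resulttype=core` response. The nesting `resultList` → `result` → `authorList` → `author` is an assumption about the response wrapper; only the per-author fields (`fullName`, `authorId.type`, `authorId.value`) are taken from the excerpts quoted in this thread. The function name and the sample record are hypothetical, and the sample is a hand-written stand-in trimmed to two authors, not real API output.

```python
from typing import Any, Dict, List, Optional, Tuple


def europe_pmc_author_orcids(
    response: Dict[str, Any],
) -> List[Tuple[int, str, Optional[str]]]:
    """Return (position, full name, ORCID or None) for the first search result."""
    # Assumed wrapper: {"resultList": {"result": [{"authorList": {"author": [...]}}]}}
    results = response.get("resultList", {}).get("result", [])
    if not results:
        return []
    authors = results[0].get("authorList", {}).get("author", [])
    rows = []
    for position, author in enumerate(authors, start=1):
        author_id = author.get("authorId") or {}
        # Only report identifiers explicitly typed as ORCID.
        orcid = author_id.get("value") if author_id.get("type") == "ORCID" else None
        rows.append((position, author.get("fullName", ""), orcid))
    return rows


# Hypothetical, trimmed stand-in for a response (name formatting is assumed).
SAMPLE = {
    "resultList": {
        "result": [
            {
                "authorList": {
                    "author": [
                        {"fullName": "Laing EA"},
                        {
                            "fullName": "Pinkerton KE",
                            "authorId": {"type": "ORCID", "value": "0000-0001-6868-7572"},
                        },
                    ]
                }
            }
        ]
    }
}

if __name__ == "__main__":
    for row in europe_pmc_author_orcids(SAMPLE):
        print(row)
```

A listing like this only shows which ORCID sits at which position; deciding whether it is the wrong person still requires comparing against the name on the ORCID record.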

Daniel-Mietchen commented 2 years ago

Here is another example (seen via https://www.wikidata.org/w/index.php?title=Q98164105&oldid=1369315309 ): In https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:32727882%20AND%20SRC:MED&resulttype=core&format=json , the ORCID https://orcid.org/0000-0001-8412-2889 of Alfredo Molinolo is associated with Napoleone Ferrara, who is listed just after Molinolo in the author list.

Screenshot from 2022-01-10 03-15-09

Daniel-Mietchen commented 2 years ago

Another example: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:28661075%20AND%20SRC:MED&resulttype=core&format=json Screenshot from 2022-01-16 04-13-13 Wikidata counterpart: https://www.wikidata.org/w/index.php?title=Q40141441&type=revision&diff=1563020955&oldid=1486970622&diffmode=source

Daniel-Mietchen commented 2 years ago

Here is another interesting case: a published correction due to a valid ORCID having been incorrectly assigned to an author in the original publication.

Wikidata has only correct information in this case:

Nevertheless, there should probably be a deprecation statement about the wrong ORCID.

Daniel-Mietchen commented 2 years ago

Side note: As per

Daniel-Mietchen commented 2 years ago

There is a section "2.2.2 Wrongly-attributed ORCID iDs" in the paper "We Can Make a Better Use of ORCID: Five Observed Misapplications", which also touches upon downstream uses as in "A botanical demonstration of the potential of linking data using unique identifiers for people".

Daniel-Mietchen commented 2 years ago

Here is another paper about data quality issues in ORCID, including use of incorrect identifiers: "Abuse of ORCID’s weaknesses by authors who use paper mills".

Daniel-Mietchen commented 2 years ago

Another published correction due to an incorrect ORCID: https://doi.org/10.1261/rna.078885.121 .

Daniel-Mietchen commented 2 years ago

Here is another case: https://orcid.org/0000-0001-6026-353X is for "Chung-Hsing Lin" but was associated in Wikidata with "Shih-Ann Chen": Screenshot 2022-02-13 at 20-15-53 Author Disambiguator

Not sure what the precise reason is here, but it could well be one of those metadata mixups of the kind presented above.

Daniel-Mietchen commented 2 years ago

Another case: https://orcid.org/0000-0003-0545-2156 is for "YaoTing Chang", yet Wikidata had it for "Shih-Lin Chang".

Screenshot 2022-02-13 at 20-27-05 Author Disambiguator

Daniel-Mietchen commented 2 years ago

Next example: https://orcid.org/0000-0002-8547-2992 is for "Eline Vegter" but in https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:32002631%20AND%20SRC:MED&resulttype=core&format=json , it is associated with "Adrian A Voors", which triggered the creation of Adriaan A Voors (Q89459310) with that ORCID ID.

Screenshot from 2022-02-14 01-15-46

Daniel-Mietchen commented 1 year ago

Another example, seen via https://www.wikidata.org/wiki/Q90224273 and https://www.wikidata.org/w/index.php?title=Q63951761&type=revision&diff=1736391410&oldid=1482385309&diffmode=source :

Interestingly, the two ORCIDs differ in two places, each a swap of two adjacent digits (9606 vs. 6906 and 4671 vs. 4617): 0000-0001-9606-4671 vs. 0000-0001-6906-4617.
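
ORCID iDs end in a check character computed with ISO 7064 MOD 11-2, so one might hope a checksum test would catch such digit swaps. The sketch below implements that check and a simple character-by-character comparison; the function names are illustrative, not from any library. Both identifiers above pass the checksum, so it confirms only that each string is a well-formed iD, not that it belongs to the author it is attached to.

```python
import re
from typing import List


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and check character of a hyphenated or compact ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_char(compact[:15]) == compact[15]


def differing_positions(a: str, b: str) -> List[int]:
    """Return the 0-based string positions at which two equal-length iDs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    first, second = "0000-0001-9606-4671", "0000-0001-6906-4617"
    print(is_valid_orcid(first), is_valid_orcid(second))
    print(differing_positions(first, second))
```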

image

Daniel-Mietchen commented 1 year ago

Another example: https://api.crossref.org/v1/works/https://doi.org/10.1177/0269216319865414 image and https://orcid.org/0000-0002-8855-4176 image

Seen via https://author-disambiguator.toolforge.org/work_item_oauth.php?id=Q90826707 image

Daniel-Mietchen commented 1 year ago

Another example: https://doi.org/10.1002/JCLA.22235 lists Majid Ghayour Mobarhan with the ORCID 0000-0002-1606-4610 image but that ORCID is associated with " samaneh khakpouri": image

Seen via https://author-disambiguator.toolforge.org/names_oauth.php?precise=0&name=Majid+Ghayour-Mobarhan&doit=Look+for+author&limit=500&filter=wdt%3AP50+wd%3AQ86082652&filter_authors=1 : image

Daniel-Mietchen commented 1 year ago

Another example: https://api.crossref.org/v1/works/10.1021/ACSSYNBIO.6B00372 lists http://orcid.org/0000-0002-4910-2222 as being associated with Jay D. Keasling image but that ORCID is actually about Jesus F. Barajas: image

In contrast, the actual Jay D. Keasling (Q1684333) has the ORCID https://orcid.org/0000-0003-4170-6088 : image

Seen via https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=Jay%20D%20Keasling and https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q87846476 : Screenshot 2023-01-03 at 20-45-34 Author Disambiguator

Daniel-Mietchen commented 1 year ago

Here is another case, with https://api.crossref.org/v1/works/10.1371/JOURNAL.PMED.1002467


      {
        "ORCID": "http://orcid.org/0000-0002-7125-9998",
        "authenticated-orcid": true,
        "given": "Sonia",
        "family": "Lewycka",
        "sequence": "additional",
        "affiliation": []
      },

listing an author named "Sonia Lewycka" tagged with the ORCID http://orcid.org/0000-0002-7125-9998 , which is associated with someone named "Charles Mwansambo":

Screenshot 2023-02-05 at 03-04-42 Charles Mwansambo (0000-0002-7125-9998)

The actual paper does not list anyone named "Charles Mwansambo" amongst the authors: image
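
For spot checks of Crossref records like the one above, a small helper can pull out each author's name, bare ORCID iD, and `authenticated-orcid` flag from an already-parsed response. This is a sketch under assumptions: the author fields (`ORCID`, `authenticated-orcid`, `given`, `family`) are as shown in the excerpt, the list is assumed to sit under `message.author`, and the function name is hypothetical. As this case shows, a `true` flag does not by itself mean the iD belongs to the named author.

```python
from typing import Any, Dict, List


def crossref_author_orcids(work: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List authors that carry an ORCID in a parsed Crossref work response."""
    rows = []
    # Assumed wrapper: {"message": {"author": [...]}}
    for author in work.get("message", {}).get("author", []):
        orcid_url = author.get("ORCID")
        if not orcid_url:
            continue  # author has no ORCID in this record
        name = f'{author.get("given", "")} {author.get("family", "")}'.strip()
        rows.append(
            {
                "name": name,
                # Keep only the bare iD, e.g. 0000-0002-7125-9998.
                "orcid": orcid_url.rstrip("/").rsplit("/", 1)[-1],
                "authenticated": bool(author.get("authenticated-orcid", False)),
            }
        )
    return rows


# Author entry copied from the excerpt above; the wrapper is an assumption.
SAMPLE_WORK = {
    "message": {
        "author": [
            {
                "ORCID": "http://orcid.org/0000-0002-7125-9998",
                "authenticated-orcid": True,
                "given": "Sonia",
                "family": "Lewycka",
                "sequence": "additional",
                "affiliation": [],
            }
        ]
    }
}

if __name__ == "__main__":
    for row in crossref_author_orcids(SAMPLE_WORK):
        print(row)
```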

I discovered this via https://author-disambiguator.toolforge.org/names_oauth.php?limit=50&name=Sonia+Lewycka (screenshot):

In there, two candidates are listed for "Sonia Lewycka":

The latter lists two papers, one of which looks suspicious because the author "Sonia Lewycka" appears twice in https://author-disambiguator.toolforge.org/work_item_oauth.php?id=Q46301047 :

Screenshot 2023-02-05 at 02-43-11 Author Disambiguator

It turns out that the author disambiguation was done automatically (by LargeDatasetBot, which used ORCIDs for disambiguation). That workflow was based on Europe PMC data, as per the reference cited for the Sonia Lewycka disambiguation: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:29206833%20AND%20SRC:MED&resulttype=core&format=json .


            {
              "fullName": "Lewycka S",
              "firstName": "Sonia",
              "lastName": "Lewycka",
              "initials": "S",
              "authorId": {
                "type": "ORCID",
                "value": "0000-0002-5923-9468"
              },
              "authorAffiliationDetailsList": {
                "authorAffiliation": [
                  {
                    "affiliation": "Nuffield Department of Medicine, Centre for Tropical Medicine, University of Oxford, Oxford, United Kingdom."
                  }
                ]
              }
            },
            {
              "fullName": "Lewycka S",
              "firstName": "Sonia",
              "lastName": "Lewycka",
              "initials": "S",
              "authorId": {
                "type": "ORCID",
                "value": "0000-0002-7125-9998"
              },
              "authorAffiliationDetailsList": {
                "authorAffiliation": [
                  {
                    "affiliation": "Nuffield Department of Medicine, Centre for Tropical Medicine, University of Oxford, Oxford, United Kingdom."
                  }
                ]
              }
            },

Note that the ORCIDs are different: the upper one is correct, the lower one is not. So it seems that Europe PMC did try to sanitize the data but did not succeed in this case.
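
A record like this one, where the same name appears twice with two different ORCIDs, can be flagged mechanically before a bot uses it for disambiguation. The following is a rough sketch, assuming the author entries have the shape shown in the excerpt; the function name is hypothetical, and grouping by `fullName` is a simplification that would also flag genuinely distinct authors who share a name.

```python
from collections import defaultdict
from typing import Any, Dict, List


def names_with_conflicting_orcids(authors: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each author name that carries more than one distinct ORCID to those ORCIDs."""
    orcids_by_name = defaultdict(set)
    for author in authors:
        author_id = author.get("authorId") or {}
        if author_id.get("type") == "ORCID" and author_id.get("value"):
            orcids_by_name[author.get("fullName", "")].add(author_id["value"])
    return {
        name: sorted(orcids)
        for name, orcids in orcids_by_name.items()
        if len(orcids) > 1
    }


# The two entries from the excerpt above, reduced to the fields used here.
AUTHORS = [
    {"fullName": "Lewycka S", "authorId": {"type": "ORCID", "value": "0000-0002-5923-9468"}},
    {"fullName": "Lewycka S", "authorId": {"type": "ORCID", "value": "0000-0002-7125-9998"}},
]

if __name__ == "__main__":
    print(names_with_conflicting_orcids(AUTHORS))
```

A flagged name is only a prompt for manual review; which of the ORCIDs is the right one still has to be checked against the ORCID profiles.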

For the sake of completeness, let's also check the data at the version of record:

Screenshot 2023-02-05 at 03-10-25 Effects of women’s groups practising participatory learning and action on preventive and care-seeking behaviours to reduce neonatal mortality A meta-analysis of cluster-randomised trials

So here, the wrong ORCID is present again, and only that.

Daniel-Mietchen commented 1 year ago

Another one: https://doi.org/10.1590/0037-8682-0069-2018 ( https://www.scielo.br/j/rsbmt/a/t9mXhGbTLkYyYgHWHzb7K6m/?lang=en ) has ORCID for Tamires Vital https://orcid.org/0000-0003-3512-7696 associated with Mariana Hecht, as per https://api.crossref.org/v1/works/10.1590/0037-8682-0069-2018 image and https://orcid.org/0000-0003-3512-7696 image

Daniel-Mietchen commented 1 year ago

Another one: https://api.crossref.org/v1/works/10.1080/23294515.2019.1593257 image associates author Kevin P. Weinfurt with the ORCID https://orcid.org/0000-0002-0046-4353 image which is for Flavia Kiweewa Matovu.

Daniel-Mietchen commented 1 year ago

For an example where such author misidentification would disrupt workflows, see https://www.wikidata.org/wiki/User:Orcbot .

On that bot's talk page, there is an example for which it is stated that the ORCID in question has been "locked".

Daniel-Mietchen commented 1 year ago

Another example case:

https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=H.%20Joosten image

https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q93057113 image

https://author-disambiguator.toolforge.org/work_item_oauth.php?id=Q93057117 image

https://doi.org/10.1111/plb.13092 image

https://orcid.org/0000-0001-6336-327X Screenshot 2023-05-07 at 06-17-40 Matthias Krebs (0000-0001-6336-327X)

https://api.crossref.org/v1/works/10.1111/plb.13092 image

Daniel-Mietchen commented 1 year ago

The most recent examples all had "authenticated-orcid" set to false, but mishaps happen with true ones too, e.g. as per the entry of Jan 3, 2023.

Daniel-Mietchen commented 1 year ago

Another one, again with "authenticated-orcid" set to false:

https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=Gerardo%20Moreno image

https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q57912512 image

https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q57912512&filter=p%3AP50+%5Bps%3AP50+wd%3AQ57912512%3B+pq%3AP1932+%27Gerardo+Moreno%27%5D image

https://api.crossref.org/v1/works/10.1016/J.ECOSER.2019.100896 image

https://orcid.org/0000-0001-9459-1128 image

Daniel-Mietchen commented 11 months ago

Here is a case where the preprint has the wrong ORCID for one of the authors (Jonathan M. Levine), presumably because of an error in the manuscript submission process to bioRxiv.

Screenshot from 2023-07-23 17-58-19

The ORCID given there (see bottom left in the above screenshot) is https://orcid.org/0000-0003-1510-6062, which has the following content:

Screenshot 2023-07-23 at 18-00-48 Jonathan Levine (0000-0003-1510-6062)

The information has thus propagated, via Crossref, to the ORCID profile of the person associated with the ORCID ID stated in bioRxiv.

The correct ORCID for the author of that manuscript would have been https://orcid.org/0000-0003-2857-7904, which does not list that preprint (nor the journal version) and currently looks as follows: Screenshot 2023-07-23 at 18-05-05 Jonathan Levine (0000-0003-2857-7904)

Interestingly, the error did not propagate to the journal version of the manuscript, presumably because the journal does not require authors to submit their ORCIDs. Screenshot from 2023-07-23 18-08-55

In any case, the authorship assignment has been corrected at Wikidata: https://www.wikidata.org/w/index.php?title=Q56979617&diff=1939235583&oldid=1856036298&diffmode=source.

Daniel-Mietchen commented 11 months ago

Adding screenshots of the Wikidata curation pages that triggered the discovery of this incongruency and the Wikidata edit to fix it: Screenshot 2023-07-23 at 18-17-49 Author Disambiguator

and

Screenshot 2023-07-23 at 18-14-22 Author Disambiguator

Daniel-Mietchen commented 10 months ago

Another curious example: https://author-disambiguator.toolforge.org/names_oauth.php?doit=Look+for+author&name=Senjie%20Lin

image

That comment, "researcher 0000-0002-0937-6069 Feng Yang?", indicates doubt as to whether the ORCID https://orcid.org/0000-0002-0937-6069 for someone named Feng Yang should be used on an item for someone named Senjie Lin.

The ORCID profile for that Feng Yang is suspicious in that all the papers are listed with an author string Senjie Lin but only one with Feng Yang:
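
The suspicion described here can be expressed as a simple ratio: what share of the works on a profile actually name the profile owner among their authors? The sketch below is a toy heuristic, assuming the works have already been fetched and reduced to lists of author strings. The function names and the sample data (including the number of works) are hypothetical, and exact matching after case and whitespace normalization will miss initials or name variants.

```python
from typing import List


def normalize(name: str) -> str:
    """Lower-case a name and collapse whitespace for a crude comparison."""
    return " ".join(name.casefold().split())


def share_of_works_naming(profile_name: str, works: List[List[str]]) -> float:
    """Return the fraction of works whose author strings include profile_name."""
    if not works:
        return 0.0
    target = normalize(profile_name)
    hits = sum(1 for authors in works if target in {normalize(a) for a in authors})
    return hits / len(works)


# Hypothetical stand-in for a profile's works: every work names Senjie Lin,
# only one names Feng Yang (the real profile's work count is not reproduced here).
WORKS = [
    ["Senjie Lin"],
    ["Senjie Lin"],
    ["Senjie Lin", "Feng Yang"],
    ["Senjie Lin"],
]

if __name__ == "__main__":
    print(share_of_works_naming("Feng Yang", WORKS))
    print(share_of_works_naming("Senjie Lin", WORKS))
```

A low share for the profile owner alongside a high share for another name is a hint that works were attached via a wrongly assigned ORCID, not proof of it.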

Screenshot 2023-09-17 at 22-40-05 Feng Yang (0000-0002-0937-6069)

At Crossref, that paper only lists one ORCID, and that is the correct one https://orcid.org/0000-0001-8831-6111 for Senjie Lin.

The mixup is in the paper https://api.crossref.org/v1/works/10.1111/jpy.13031 where Senjie Lin is associated with the ORCID https://orcid.org/0000-0002-0937-6069 of Feng Yang :

image

Daniel-Mietchen commented 10 months ago

I have not fixed that Senjie Lin / Feng Yang mixup in Wikidata yet, as I will try to do this via nanopublications.

Daniel-Mietchen commented 10 months ago

Did not see a quick way to do this via nanopublications, so only fixed Senjie Lin / Feng Yang on Wikidata. This turned out to be a bit more complicated, since there was another item for Senjie Lin, which appeared to be about the same person, so I merged the two entries.

As a result, we now have Q95840069 for Feng Yang, as per https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q95840069 image

and Q36678842 for Senjie Lin, as per https://scholia.toolforge.org/author/Q36678842 Screenshot 2023-09-18 at 01-15-02 Senjie Lin - Scholia

Daniel-Mietchen commented 10 months ago

Next one: mixup between Armando Pacheco and Washington Ipenza, as per https://author-disambiguator.toolforge.org/names_oauth.php?name=Armando%20J.%20Cabrera%20Pacheco Screenshot 2023-09-18 at 01-28-09 Author Disambiguator and https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q5666452 Screenshot 2023-09-18 at 01-29-54 Author Disambiguator

I could not find the source of the problem, but I did find the edits that brought it to Wikidata:

Fixed it with these two edits:

Melissa37 commented 8 months ago

The statements on Q47638090 point to the Europe PMC API call https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:28914166%20AND%20SRC:MED&resulttype=core&format=json, which has the above ORCID (of Emilia Laing) associated with Kent Pinkerton (in position 8) and no ORCID for the second author (Emilia Laing)

A similar situation was in Q89716066, whose Europe PMC entry also attaches Emilia Laing's ORCID to Kent Pinkerton's author position: https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:31645209%20AND%20SRC:MED&resulttype=core&format=json and no ORCID to Emilia Laing's position.

Europe PMC pulls in ORCIDs from PubMed, which are provided by the publisher. This traces back to the publisher site.

Daniel-Mietchen commented 6 months ago

Another example:

Screenshot from 2024-01-16 04-10-19

Screenshot from 2024-01-16 04-10-49

This is reflected in https://api.crossref.org/v1/works/https://doi.org/10.2337/dc20-0028 :

Screenshot from 2024-01-16 04-13-29

Seen via https://wikidata-game.toolforge.org/distributed/#game=88 for https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q90942893 and https://author-disambiguator.toolforge.org/author_item_oauth.php?id=Q97527382 :

image

Fixed on Wikidata

Daniel-Mietchen commented 4 months ago

Another example, which I became aware of via this message:

The same error appears in Europe PMC, from where it had found its way into Wikidata.

I made seven edits to fix this on Wikidata:

image