adsabs / adsabs-dev-api

Developer API service description and example client code

Inconsistent length of the orcid arrays and author list #56

Open cortesec opened 4 years ago

cortesec commented 4 years ago

We found several records where the length of the author list differs from that of the ORCID arrays (_pub, _user, _other).

In some cases the ORCID arrays also differ in length from one another, with the _pub array longer than the others.

We are still investigating the issue and should be able to provide a longer list of bibcodes to verify in the coming days.

A first example is 2014ASPC..486..203M
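
For anyone who wants to reproduce the check, here is a minimal sketch in Python. It assumes the standard search endpoint (`/v1/search/query`) with a bearer token; the helper names `orcid_length_mismatches` and `fetch_doc` are made up for illustration and are not part of any ADS client library. The comparison itself is a pure function, so it can also be run on records obtained some other way.

```python
import json
import urllib.parse
import urllib.request

ORCID_FIELDS = ("orcid_pub", "orcid_user", "orcid_other")
API_URL = "https://api.adsabs.harvard.edu/v1/search/query"


def orcid_length_mismatches(doc):
    """Return {field: length} for every ORCID array present in `doc`
    whose length differs from that of the author list.

    Fields missing from the record are skipped, not reported.
    """
    n_authors = len(doc.get("author", []))
    return {
        field: len(doc[field])
        for field in ORCID_FIELDS
        if field in doc and len(doc[field]) != n_authors
    }


def fetch_doc(bibcode, token):
    """Fetch the author and ORCID fields for one bibcode.

    Hypothetical helper: needs network access and a valid API token.
    Returns None when the bibcode matches no record.
    """
    params = urllib.parse.urlencode({
        "q": f'bibcode:"{bibcode}"',
        "fl": "bibcode,author," + ",".join(ORCID_FIELDS),
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        docs = json.load(response)["response"]["docs"]
    return docs[0] if docs else None
```

With a token in hand, `orcid_length_mismatches(fetch_doc("2014ASPC..486..203M", token))` should list the affected arrays for the example above. An empty dict means every ORCID array that was returned matches the author list in length.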

Best regards

Claudio

csgrant00 commented 4 years ago

Hi Claudio,

Looks like a bug! It occurs when a record has editors in addition to authors (so the whole conference series that 2014ASPC..486..203M is a part of is affected). As far as I can tell, it should only be orcid_pub and not orcid_other/orcid_user that's affected, as the _pub field is created separately from the _other/_user fields. But let me know if you see examples where the _other/_user fields are also a different length than the author field, as that points to a different sort of problem. Thanks for pointing this issue out! We'll get it fixed shortly.

Best, Kelly

--

Dr. Kelly Lockhart Back-End Developer, NASA Astrophysics Data System Harvard-Smithsonian Center for Astrophysics 60 Garden Street, Cambridge, MA 02138

svank commented 4 years ago

Hi,

I'm coming across some publications with similar issues.

2012P&SS...66...43M has "The Osiris Team" listed as an author, and it looks like ADS is appending (what I assume to be) the membership of the Osiris Team to the author list. The orcid_user field has entries for the nine "real" authors (i.e. excluding "The Osiris Team" and its membership), while orcid_pub has entries for every author in the author field (including "The Osiris Team" and its membership).

2012P&SS...66...64B is another, similar Osiris paper, but both orcid_pub and orcid_user have entries for every entry in the author list, while orcid_other only has entries for the "real" authors.

1981SSRv...30..623V has a single author listed, but has two entries (both '-') for orcid_pub.

Also, for 2012P&SS...66...43M, the API doesn't include "orcid_other" in the response when I request it, and for 1981SSRv...30..623V neither "orcid_other" nor "orcid_user" is included when requested. I'm assuming these fields are excluded from the response when every entry in the list is "-". Is that correct? On the other hand, 2012P&SS...66...64B does return orcid_pub even though every entry is "-".
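
In the meantime, client code can guard against the missing fields with something like the sketch below. It rests on my unconfirmed assumption that an absent field means every entry would have been "-"; the helper name `padded_orcid_fields` is made up for illustration.

```python
def padded_orcid_fields(doc, fields=("orcid_pub", "orcid_user", "orcid_other")):
    """Return a dict holding one list per ORCID field.

    A field absent from the API response is replaced by a list of "-"
    placeholders, one per author. ASSUMPTION: absent means "all entries
    are '-'", which has not been confirmed. Fields that are present are
    copied unchanged, even if their length is wrong.
    """
    n_authors = len(doc.get("author", []))
    return {
        field: list(doc[field]) if field in doc else ["-"] * n_authors
        for field in fields
    }
```

Note that this only fills in fields that are missing entirely. It does not repair arrays that are present but longer or shorter than the author list, so those still need a separate length check.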

Thanks, Sam