dataesr / openalex-affiliations

Exhibits cases where OpenAlex affiliation country detection could be improved
MIT License

openalex-affiliation-country

Exhibits cases where OpenAlex affiliation detection could be improved

In OpenAlex, an alignment pipeline links raw harvested affiliations (presumably obtained largely through web crawling) to standardised affiliations, each described by a standard display_name and, where available, an id, ror and country_code.

Each issue listed in this repo corresponds to ONE raw affiliation string whose ROR match is inaccurate or missing.

October 2023 feedback

In mismatch_country_asof_20231017.jsonl, we list raw_affiliation_string values that we found to cause a country mismatch (and therefore also a ROR mismatch). There are few of them (around 50, plus variants), but they affect many publications. We also provide some 'contaminated' DOIs, all published since 2013. A few examples:

Some of the mismatches we detected seem explainable (like the bug for ORCiD), but others seem very strange, such as Dnipro State Medical University being matched with a Valeo ROR.
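To explore the JSONL file described above, a minimal Python sketch could look like the following. It only assumes that the file holds one JSON object per line; the key name `raw_affiliation_string` used as a default is a guess based on the description above, and the helper names `load_jsonl` and `count_by_field` are hypothetical, so check the actual keys in the file before relying on it.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records


def count_by_field(records, field="raw_affiliation_string"):
    """Count records per string value of `field`.

    The default key name is an assumption; records that lack the key,
    or hold a non-string value there, are skipped.
    """
    return Counter(
        record[field]
        for record in records
        if isinstance(record.get(field), str)
    )
```

For instance, `count_by_field(load_jsonl("mismatch_country_asof_20231017.jsonl")).most_common(10)` would list the ten most frequent strings, provided the assumed key is present.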

First feedback (now deprecated)

This repo lists a sample of cases where the affiliation country recorded in OpenAlex appears to be incorrect. We used our own affiliation-matching tool to detect these cases, and this automatic tool is itself not perfect. Nevertheless, we believe that the vast majority of the cases raised here are of interest. We present this data with the following fields

We have separated the data into two files:

e.g. "KU, Leuven, Leuven, Belgium" from https://openalex.org/W3085273257 has no country_code in OpenAlex (should be 'BE').

e.g. "ANDRA, Ci2A, Soulaines-Dhuys, France" from https://openalex.org/W2802150657 is matched by OpenAlex to "Australian National Drag Racing Association", country_code 'AU', whereas it should be matched to country 'FR'.
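Checks like the two above can be sketched in Python as follows. This is an illustration, not the matching tool we used: the function name `find_country_mismatches` is hypothetical, and the record layout (an `authorships` list whose entries carry `raw_affiliation_strings`, or the older `raw_affiliation_string`, plus an `institutions` list with `country_code`) is an assumption about the OpenAlex work object that may differ between API versions.

```python
def find_country_mismatches(work, expected):
    """Compare matched country codes in one work record with expected ones.

    `work` is a dict shaped like an OpenAlex work (assumed layout, see above).
    `expected` maps a raw affiliation string to the country code it should get.
    Returns a list of (raw_string, expected_code, found_codes) tuples.
    """
    mismatches = []
    for authorship in work.get("authorships", []):
        raw_strings = list(authorship.get("raw_affiliation_strings") or [])
        legacy = authorship.get("raw_affiliation_string")  # older field name
        if legacy and legacy not in raw_strings:
            raw_strings.append(legacy)

        # Simplification: every institution of the authorship is treated as a
        # candidate match for every raw string of that authorship.
        found = {
            inst.get("country_code")
            for inst in authorship.get("institutions") or []
        }
        found.discard(None)

        for raw in raw_strings:
            wanted = expected.get(raw)
            if wanted is not None and wanted not in found:
                mismatches.append((raw, wanted, sorted(found)))
    return mismatches
```

Called with `expected = {"KU, Leuven, Leuven, Belgium": "BE"}` on a record where that string has no matched institution, the function reports the string with an empty list of found codes. A work record can presumably be fetched as JSON from the OpenAlex API (for example https://api.openalex.org/works/W3085273257) before being passed in.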

Mailing list

https://groups.google.com/g/openalex-users/c/QKMM1rxjk9Y