codeforkjeff / conciliator

OpenRefine reconciliation services for VIAF, ORCID, and Open Library + framework for creating more.
GNU General Public License v3.0

accuracy issue for orcid reconciliation #8

Open msaby opened 6 years ago

msaby commented 6 years ago

Hi,

Sometimes the results of the ORCID reconciliation service are perfect; sometimes they seem broken. See two examples, "Igor Ozerov" and "Li Xi":

-> For Igor Ozerov, the correct match should be the 1st answer, because this name is unique in the ORCID database.
-> For Li Xi, we should get the list of all people named Li Xi, and not "Li-Li Xi" or "Li Bo Xi".

Do you think it could be improved?

image

codeforkjeff commented 6 years ago

Hi Mathieu,

conciliator does keyword searches against the entire ORCID bio; that's why results (especially scores) are sometimes imperfect.

I added a "smartnames" mode you can try, which splits up 2-part names and searches on the family-name and given-names fields. It should help with the cases you mentioned, but probably not in every case (if it can't find any results, it falls back to a keyword search). Add a new service in OpenRefine with this URL:

http://refine.codefork.com/reconcile/orcid/smartnames

Give it a try and see what you think.
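
For readers following along, the "smartnames" behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not conciliator's actual code: the function names `build_field_query` and `smart_search` are hypothetical, the `search` callable stands in for whatever performs the real ORCID lookup, and the sketch assumes the first token is the given name and the second is the family name.

```python
from typing import Callable, List, Optional


def build_field_query(name: str) -> Optional[str]:
    """Return a field-restricted query for a 2-part name, or None otherwise."""
    parts = name.split()
    if len(parts) != 2:
        # Only 2-part names are split; anything else goes to keyword search.
        return None
    given, family = parts
    # Assumption: token order is "given family". The real service may differ.
    return f'given-names:"{given}" AND family-name:"{family}"'


def smart_search(name: str, search: Callable[[str], List[str]]) -> List[str]:
    """Try the field-restricted query first, then fall back to keywords.

    `search` is a hypothetical callable that takes a query string and
    returns a list of candidate matches.
    """
    field_query = build_field_query(name)
    if field_query is not None:
        results = search(field_query)
        if results:
            return results
    # Fallback: plain keyword search on the whole name.
    return search(name)


# Tiny in-memory stand-in for the search backend, for demonstration only.
_FAKE_INDEX = {
    'given-names:"Igor" AND family-name:"Ozerov"': ["Igor Ozerov"],
    "Li Bo Xi": ["Li Bo Xi"],
}


def fake_search(query: str) -> List[str]:
    return _FAKE_INDEX.get(query, [])
```

With this stand-in, `smart_search("Igor Ozerov", fake_search)` goes through the field-restricted query, while a 3-part name such as "Li Bo Xi" skips straight to the keyword search. Restricting the search to name fields is what keeps matches elsewhere in the bio from outranking an exact name match.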