Open-Earth-Foundation / OpenClimate-Schema

Schema for OpenClimate database
Apache License 2.0

How do we determine the preferred name of an actor? #49

Open yeozy95 opened 1 year ago

yeozy95 commented 1 year ago

DDL has previously just taken the first instance of an actor's name in our database as the "preferred" version. Do we have stricter logic for the OpenClimate Schema?

evanp commented 1 year ago

It's a matter of choice, and primarily for UI.

I think the preferred name would be the one we would show as default for a language, and the "name" column in Actor is the fallback in case there are no matches.
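
As a rough illustration of that lookup-with-fallback idea, here is a minimal Python sketch. The `ActorName` record, its `preferred` flag, and the `display_name` helper are hypothetical names used only for this example; they are not taken from the actual schema.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ActorName:
    # Hypothetical record shape; the real table layout may differ.
    actor_id: str
    name: str
    language: str
    preferred: bool = False


def display_name(
    actor_id: str,
    default_name: str,
    names: Iterable[ActorName],
    language: str,
) -> str:
    """Return the preferred name for the language, else the Actor fallback."""
    for n in names:
        if n.actor_id == actor_id and n.language == language and n.preferred:
            return n.name
    # No preferred match for this language: use the Actor "name" column.
    return default_name


names = [
    ActorName("DE", "Germany", "en", preferred=True),
    ActorName("DE", "Federal Republic of Germany", "en"),
    ActorName("DE", "Deutschland", "de", preferred=True),
]
print(display_name("DE", "Germany", names, "de"))
print(display_name("DE", "Germany", names, "fr"))
```

The second call has no French name to match, so it returns the fallback passed in from the Actor row.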

Preference might include these factors, in no particular order:

I think for 99% of actors, we'll have only one or two names in any particular language. In general, I'd probably defer to Wikipedia or Wikidata for names, since they have a lot of editors and contributors who put in a lot of time discussing the best name to use!

evanp commented 1 year ago

Maybe we should change this to a Q score: a value from 0 to 1, where 1.0 is the best name to use and 0 is the worst. If an Actor has lots of names, use the one with the best score.
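
A minimal sketch of how selection by score could work, assuming each name carries a float score between 0 and 1. The `ScoredName` type, the `q` field, and the `best_name` helper are hypothetical and only illustrate the idea; the scores in the sample data are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ScoredName:
    # Hypothetical record shape: q is the proposed 0-to-1 quality score.
    name: str
    language: str
    q: float


def best_name(names: Iterable[ScoredName], language: str, fallback: str) -> str:
    """Pick the highest-scoring name in the language, else the fallback."""
    candidates = [n for n in names if n.language == language]
    if not candidates:
        return fallback
    return max(candidates, key=lambda n: n.q).name


names = [
    ScoredName("United States of America", "en", 0.9),
    ScoredName("USA", "en", 0.6),
    ScoredName("États-Unis", "fr", 1.0),
]
print(best_name(names, "en", "United States"))
print(best_name(names, "es", "United States"))
```

One thing this leaves open is how ties are broken: `max` returns the first of several equal-scoring names, so the result would depend on row order unless a secondary rule is added.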