marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

Sources for names and locations? #46

Closed (kevinrobinson closed this 4 years ago)

kevinrobinson commented 4 years ago

hello! 👋

I didn't see in the paper or the repo the sources for the lists of names, locations, etc. (e.g., names.json or lexicons/basic.json). Could you share how you put these together?

Thanks for publishing your work in the open! 👍

kevinrobinson commented 4 years ago

ah sorry for the noise! I see:

Person names and location (country, city) names are multilingual, depending on the editor language. We got the data from wikidata, so there is a bias towards names on wikipedia.

in https://github.com/marcotcr/checklist#lexicons-somewhat-multilingual :)