uogbuji / amara3-names

Name comparison in python
MIT License

Attempt nickname support #1

Open uogbuji opened 5 years ago

uogbuji commented 5 years ago

Nickname support dropped off since the latest refactoring.

>>> s1 = 'Edward S. Boyden'
>>> s2 = 'Edward Boyden'
>>> s3 = 'Ed Boyden'
>>> from amara3.names.model import human_name
>>> from amara3.names import compare, config
>>> compare.ratio(s1, s2)
0.9285714285714285
>>> compare.ratio(s1, s3)
0.5
>>> compare.ratio(s2, s3)
0.5

uogbuji commented 4 years ago

Actually, I checked more closely and there never was proper nickname support (the behavior was inherited from whoswho). All it had was that if one included a nickname in quotes, it would be factored into comparisons. Renaming ticket.
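
For reference, a rough sketch of what quoted-nickname handling can look like. This is not the whoswho or amara3-names code; the `split_nickname` helper and the `QUOTED_NICKNAME` pattern are hypothetical names made up for illustration, and only straight double quotes are handled so that apostrophes in surnames are left alone.

```python
import re

# Hypothetical pattern: a nickname wrapped in straight double quotes,
# e.g. 'Edward "Ed" Boyden'. Single quotes are deliberately ignored so
# that names such as O'Brien are not misread.
QUOTED_NICKNAME = re.compile(r'"([^"]+)"')


def split_nickname(name):
    """Return (name_without_nickname, nickname_or_None)."""
    match = QUOTED_NICKNAME.search(name)
    if match is None:
        return name, None
    nickname = match.group(1).strip()
    # Drop the first quoted part, then collapse the leftover whitespace
    remainder = QUOTED_NICKNAME.sub(' ', name, count=1)
    remainder = ' '.join(remainder.split())
    return remainder, nickname
```

With something like this, 'Edward "Ed" Boyden' splits into the plain name and the nickname, which could then each be compared separately. A name with no quoted part comes back unchanged, which is why 'Ed Boyden' gets no special treatment under that scheme.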

Adding this will probably be a matter of using some sort of nickname database. There are a few out there, including Deron Meranda's list derived from US Census data, which contains weightings. There's also carltonnorthern's CSV, which has no weightings, and neither does Brian Lalonde's. Wikidata could also be used as a source.

Philippe Rémy's very large name dataset at least makes an attempt at being international.
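
A minimal sketch of the database approach, to make the idea concrete. Everything here is an assumption: the row layout (formal name first, nicknames after), the sample rows (hand-written, not taken from any of the datasets above), and the helper names `load_nicknames` and `same_given_name`. Weightings are ignored entirely.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in "formal name, nickname, nickname, ..." form.
# Hand-written for illustration; the real datasets may use other layouts.
SAMPLE_CSV = """\
edward,ed,eddie,ted
william,bill,will,billy
"""


def load_nicknames(fileobj):
    """Map each name (formal or nickname) to the formal names it may stand for."""
    table = defaultdict(set)
    for row in csv.reader(fileobj):
        names = [n.strip().lower() for n in row if n.strip()]
        if not names:
            continue  # skip blank lines
        formal = names[0]
        for name in names:
            table[name].add(formal)
    return table


def same_given_name(a, b, table):
    """True if a and b are equal or share at least one formal name."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    # .get() avoids adding unknown names to the defaultdict
    return bool(table.get(a, set()) & table.get(b, set()))


table = load_nicknames(io.StringIO(SAMPLE_CSV))
```

Here `same_given_name('Ed', 'Edward', table)` is true while `same_given_name('Ed', 'Bill', table)` is false. One way to wire this in would be to check the given names this way before scoring the rest of the name, so that 'Ed Boyden' and 'Edward Boyden' are not penalized for the given-name difference. A weighted source would let the match contribute a score instead of a yes/no answer.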

TIL: hypocorism is a more formal term for a diminutive form of a name.