berkmancenter / namae

Namae (名前) parses personal names and splits them into their component parts.
Prefixed and Compound Word Surnames #3

Closed PatrickTulskie closed 9 years ago

PatrickTulskie commented 10 years ago

Namae doesn't seem to work correctly with prefixed and compound word family names. For example:

>> Namae::Name.parse("Justin Du Bois")
 => #<Name family="Bois" given="Justin Du"> 

More about prefixed and compound family names here: http://www.barbarahenritze.com/index.php/genealogical-research/genealogy-articles/32-prefixes-suffixes-hyphenates-compound-words-and-titles

I'll try to come up with a fix at some point but I figured I'd open up an issue to get the ball rolling.

inukshuk commented 10 years ago

These are really hard to detect. If the word is not capitalized, it is detected as a particle; but when it is capitalized, I haven't come up with a suitable (deterministic) detection strategy. Currently, compound family names only work when the name is given in sort order, with a comma:

> Namae.parse 'Justin du Bois'
=> [#<Name family="Bois" given="Justin" particle="du">]
> Namae.parse 'Du Bois, Justin'
=> [#<Name family="Du Bois" given="Justin">]
> Namae.parse 'du Bois, Justin'
=> [#<Name family="Bois" given="Justin" particle="du">]

PatrickTulskie commented 9 years ago

Oh awesome. This is in the latest version of the gem?

inukshuk commented 9 years ago

No, sorry. The examples above should all work, but as I said, compound surnames only really work in 'sort order' (i.e., with a comma).
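
Since sort order is the form that works, one possible workaround is to rewrite display-order names into sort order before parsing. The following is only a rough sketch, not part of Namae: `SURNAME_PREFIXES` and `to_sort_order` are hypothetical names, the prefix list is an illustrative assumption, and the approach is a heuristic rather than the deterministic detection discussed above.

```ruby
# Hypothetical, incomplete list of capitalized surname prefixes.
# This is an assumption for illustration; Namae does not provide it.
SURNAME_PREFIXES = %w[Da De Del Della Den Der Di Du La Le Van Von].freeze

# Rewrites "Given Prefix Family" as "Prefix Family, Given" so that the
# family name is presented in sort order. Names that already contain a
# comma, or that contain no known prefix, are returned unchanged.
def to_sort_order(name, prefixes = SURNAME_PREFIXES)
  return name if name.include?(',')

  words = name.split
  # Find the first known prefix that is neither the first nor the last word.
  index = (1...words.length - 1).find { |i| prefixes.include?(words[i]) }
  return name if index.nil?

  family = words[index..-1].join(' ')
  given = words[0...index].join(' ')
  "#{family}, #{given}"
end

puts to_sort_order('Justin Du Bois')
puts to_sort_order('Justin du Bois')
```

The rewritten string can then be passed to `Namae.parse`. The match is case-sensitive on purpose, so lowercase particles such as `du` are left for the parser to handle. A middle name that happens to match a prefix would be pulled into the family name, so the list needs to be tuned to the data at hand.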