berkmancenter / namae

Namae (名前) parses personal names and splits them into their component parts.

Parsing Exchange-formatted names? #11

Open retorquere opened 9 years ago

retorquere commented 9 years ago

Exchange uses "Lastname Firstname prefixes", which is probably the most braindead format they could think of, but is there a way to hint Namae to parse these?

inukshuk commented 9 years ago

I'm afraid not. I'm guessing there is no comma after the last name? If you know for a fact that there are no multi-word last names in your data set, you could perhaps insert a comma after the first word. If the prefixes are lowercase, Namae should then pick them up (interpreting them as dropping particles).
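
A rough sketch of that comma-insertion idea, in case it helps. The helper name `exchange_to_sort_order` is made up for illustration and is not part of Namae; it assumes every last name is a single word, so anything like "van der Berg Jan" would be split in the wrong place.

```ruby
# Hypothetical helper (not part of Namae): turns "Lastname Firstname prefixes"
# into "Lastname, Firstname prefixes" by inserting a comma after the first word.
# Assumption: last names are always a single word.
def exchange_to_sort_order(name)
  # Split into the first word and everything after it.
  last, rest = name.strip.split(/\s+/, 2)
  # A single-word (or empty) input has nothing to separate; return it unchanged.
  return last.to_s if rest.nil?
  "#{last}, #{rest}"
end

puts exchange_to_sort_order('Beethoven Ludwig van')
# prints "Beethoven, Ludwig van"

# The result could then be handed to the parser (needs the namae gem):
#   require 'namae'
#   Namae.parse(exchange_to_sort_order('Beethoven Ludwig van'))
```

Whether the trailing lowercase words end up as particles depends on Namae's own parsing, so it is worth checking the parsed result on a sample of your real data before running it over everything.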