yhydhx / python-nameparser

Automatically exported from code.google.com/p/python-nameparser

Parser does not understand Last First Middle Initial Format #34

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Initialize HumanName with the name "Decker Brad H"
2. Print the parsed result

What is the expected output? What do you see instead?
Expected: First: Brad, Middle: H, Last: Decker
Received: First: Decker, Middle: Brad, Last: H

What version of the product are you using? On what operating system?
Ubuntu 12.04, Python 3.3.5, nameparser module up to date

Please provide any additional information below.

Original issue reported on code.google.com by brad.dec...@lyntonweb.com on 2 Apr 2014 at 4:55

GoogleCodeExporter commented 9 years ago
Supporting last-name-first order without a comma to indicate it is beyond the 
scope of this simple name parser. There is no programmatic way to know that 
Brad is a first name without a dictionary of all possible first names. 

It is, in fact, confusing for humans too when either name could be a first or 
a last name. For example, last week I met with Ashley John, and I kept thinking 
her name was John Ashley. If you wrote her name in last-first order, you would 
be sure it was.

Original comment by dere...@gmail.com on 2 Apr 2014 at 8:34

GoogleCodeExporter commented 9 years ago
Absolutely agree. It's a horrible API that I'm using to access this. It also 
returns both real human names and business names in the same field, "Owner 
name"... 

Anyway, as a workaround that works well enough for my client, I use 
", ".join(namestring.split(" ")). The first and last names come out about 90% 
accurate as long as it is a human name (and within the scope of this tool). 
Thanks for looking at this. 
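
For reference, the one-liner above puts a comma after every token, so "Decker Brad H" becomes "Decker, Brad, H". A minimal sketch of a variant that inserts a comma only after the first token follows; the result should match the comma-separated last-name-first format the parser accepts. The function name is hypothetical, and the sketch assumes the last name is always a single token, which is not true for multi-word surnames.

```python
def add_comma_after_last_name(namestring):
    """Turn 'Last First Middle' into 'Last, First Middle'.

    Assumption: the first whitespace-separated token is the whole last
    name. Multi-word surnames such as 'Van Buren' will be split wrongly.
    """
    parts = namestring.split()
    if len(parts) < 2 or "," in namestring:
        # Nothing to reorder, or the string already carries a comma.
        return namestring.strip()
    return parts[0] + ", " + " ".join(parts[1:])


print(add_comma_after_last_name("Decker Brad H"))  # Decker, Brad H
```

The returned string can then be handed to the parser in place of the raw value from the upstream API.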

Original comment by brad.dec...@lyntonweb.com on 2 Apr 2014 at 8:37

GoogleCodeExporter commented 9 years ago
Sounds fun :)

Seems like doing that preprocessing on the string before you hand it off to the 
parser might be the best solution if you have a somewhat reliable format that 
you're translating from. 

After I wrote my reply I started thinking about possibly using the fact that 
the single initial is the last piece as an indicator to parse with a different 
format, but my brain kinda melts when I think about it. I think that would 
probably screw up some of the other formats that the parser supports, and it 
would take a few hours to try it to figure out that it doesn't work.
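
A rough sketch of that trailing-initial idea, written as a preprocessing step outside the parser rather than as a change to it. Both function names are hypothetical, and the rule (no comma, exactly three tokens, only the last one a single letter) is an assumption made for illustration, not something the parser implements.

```python
import re

# A lone letter, optionally followed by a period: "H" or "H."
INITIAL = re.compile(r"^[A-Za-z]\.?$")


def looks_like_last_first_initial(namestring):
    """Guess whether a string is in 'Last First M' order.

    Heuristic only: true when there is no comma, there are exactly three
    tokens, and only the final token is a single initial.
    """
    if "," in namestring:
        return False
    parts = namestring.split()
    if len(parts) != 3:
        return False
    return bool(INITIAL.match(parts[2])) and not any(
        INITIAL.match(p) for p in parts[:2]
    )


def reorder_if_last_first_initial(namestring):
    """Return 'First M Last' when the heuristic fires, else the input."""
    if not looks_like_last_first_initial(namestring):
        return namestring
    last, first, initial = namestring.split()
    return f"{first} {initial} {last}"


print(reorder_if_last_first_initial("Decker Brad H"))  # Brad H Decker
print(reorder_if_last_first_initial("Brad H Decker"))  # Brad H Decker
```

The conflict with other formats shows up quickly: a name such as "John Smith V", where the trailing letter is a generational suffix, would be reordered incorrectly, so a check like this is only safe on data known to arrive in last-first order.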

FYI, I pushed out a new release candidate last night that includes some 
nickname parsing. It probably won't help with your "owner name" problem, 
though. But if you could get the owner names into parentheses or quotes, the 
parser would treat them as a nickname. Maybe you could run a regex over the 
string to look for potential owner names before you send it to the parser.
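
One way to read the regex suggestion, given that the field mixes people and businesses, is a pre-filter that sets aside likely business names before anything reaches the parser. The sketch below assumes that reading. The marker list and function name are hypothetical and would need tuning against real data, since a surname such as "Bank" would be a false positive.

```python
import re

# Hypothetical list of business markers; extend it for your own data.
BUSINESS_MARKERS = re.compile(
    r"\b(?:llc|inc|corp|ltd|trust|bank|company)\b",
    re.IGNORECASE,
)


def looks_like_business(owner_name):
    """Return True when the string contains a known business marker."""
    return bool(BUSINESS_MARKERS.search(owner_name))


owners = ["Decker Brad H", "Acme Holdings LLC", "Acme Inc."]
people = [o for o in owners if not looks_like_business(o)]
print(people)  # ['Decker Brad H']
```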

If you have any suggestions for ways the parser could make that process any 
easier for you, let me know.

Also, just FYI, I'm in the process of moving this repo over to GitHub. Feel 
free to post issues in either place in the meantime.

Original comment by dere...@gmail.com on 2 Apr 2014 at 8:48