dshorthouse closed 3 years ago
Another possibility:
a.gsub(/^(\S{1,}\.?){1,}\s*(?i:and|&)\s*(\S{1,}\.?){1,}\s*(.*)$/, '\1 \3|\2 \3')
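For anyone wanting to try that substitution on its own, here is a minimal sketch in plain Ruby with no DwcAgent involved. It assumes the pattern is meant for two given names that share one family name; that reading is my guess, not something stated above. `SHARED_FAMILY_NAME` and `split_shared_family_name` are hypothetical names made up for this example.

```ruby
# Pattern copied verbatim from the suggestion above.
# SHARED_FAMILY_NAME and split_shared_family_name are hypothetical names
# used only in this sketch; they are not part of DwcAgent.
SHARED_FAMILY_NAME = /^(\S{1,}\.?){1,}\s*(?i:and|&)\s*(\S{1,}\.?){1,}\s*(.*)$/

# Rewrites "<given> and <given> <family>" as "<given> <family>|<given> <family>"
# and splits on the pipe. Strings that do not match are returned unchanged,
# wrapped in a one-element array.
def split_shared_family_name(str)
  str.gsub(SHARED_FAMILY_NAME, '\1 \3|\2 \3').split("|")
end

p split_shared_family_name("John and Jane Smith")   # => ["John Smith", "Jane Smith"]
p split_shared_family_name("J. & M. Smith")         # => ["J. Smith", "M. Smith"]
p split_shared_family_name("Chaboo, Bennett, Shin") # => ["Chaboo, Bennett, Shin"]
```

One thing to watch: the whitespace around the conjunction is optional (`\s*`), so a name that merely contains the letters "and" can also match. Requiring `\s+` or word boundaries around the conjunction would probably be safer before relying on this.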
Seems fixed in the sense that it identifies three people, but it assumes they're all given names rather than family names.
2.7.2 :001 > DwcAgent.parse "Chaboo, Bennett, Shin"
=> [#<Name given="Chaboo">, #<Name given="Bennett">, #<Name given="Shin">]
2.7.2 :002 > DwcAgent.parse "Hardy, Andrews & Giuliani"
=> [#<Name given="Hardy">, #<Name given="Andrews">, #<Name given="Giuliani">]
2.7.2 :003 > DwcAgent::Version.version
=> "1.5.1.6"
Thanks for having a look at this. Indeed, what's expected here is that each parsed name then needs to be cleaned:
names = DwcAgent.parse "Chaboo, Bennett, Shin"
DwcAgent.clean names[0]
=> {:title=>nil, :appellation=>nil, :given=>nil, :particle=>nil, :family=>"Chaboo", :suffix=>nil}
Many thanks for pointing out DwcAgent.clean! I was not aware of it. Should it be included in README.md?
Good point. I'll add that.
Chaboo, Bennett, Shin
Those are all family names but the parser says:
[#<Name family="Chaboo" given="Bennett">, #<Name given="Shin">]