Ruby 3.0 gem that cleans messy values of Darwin Core terms such as recordedBy or identifiedBy before passing them to the Namae gem, a dependency that performs the parsing. It also produces a similarity score between two given names.
require "dwc_agent"
names = DwcAgent.parse '13267 (male) W.J. Cody; 13268 (female) W.E. Kemp'
=>
[#<struct Namae::Name family="Cody", given="W.J.", suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>,
#<struct Namae::Name family="Kemp", given="W.E.", suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>]
Parsing is occasionally messy, so it is advisable to run the additional clean
method on each parsed name.
require "dwc_agent"
names = DwcAgent.parse 'Chaboo, Bennett, Shin'
=>
[#<struct Namae::Name family=nil, given="Chaboo", suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>,
#<struct Namae::Name family=nil, given="Bennett", suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>,
#<struct Namae::Name family=nil, given="Shin", suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>]
DwcAgent.clean names[0]
=> #<struct Namae::Name family="Chaboo", given=nil, suffix=nil, particle=nil, dropping_particle=nil, nick=nil, appellation=nil, title=nil>
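Because clean takes one name at a time, a small helper can apply it to every parsed name in a single pass. The following is a minimal sketch and not part of the gem: clean_all and its agent: keyword are hypothetical names introduced here, and the sketch relies only on the DwcAgent.parse and DwcAgent.clean calls shown above. The agent: keyword exists only so that a stand-in object can be substituted when trying the helper out.

```ruby
# Hypothetical helper (not provided by dwc_agent): parse a raw agent string,
# then run clean on each parsed name.
#
# agent: defaults to DwcAgent, so `require "dwc_agent"` must have run before
# the helper is called without an explicit agent. Any object that responds
# to parse and clean can be passed instead.
def clean_all(raw, agent: DwcAgent)
  agent.parse(raw).map { |name| agent.clean(name) }
end

# Usage, assuming the gem is installed:
#   require "dwc_agent"
#   cleaned = clean_all('Chaboo, Bennett, Shin')
#   cleaned.each { |name| puts name.family }
```

The default for agent: is only evaluated when the helper is called without that keyword, so defining the method does not itself require the gem to be loaded.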
require "dwc_agent"
score = DwcAgent.similarity_score('John C.', 'John')
=> 1.1
Or, from the command line:
gem install dwc_agent
dwcagent "13267 (male) W.J. Cody; 13268 (female) W.E. Kemp"
=> [{"title":null,"appellation":null,"given":"W.J.","particle":null,"family":"Cody","suffix":null,"dropping_particle":null,"nick":null},{"title":null,"appellation":null,"given":"W.E.","particle":null,"family":"Kemp","suffix":null,"dropping_particle":null,"nick":null}]
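The command-line output is a JSON array, so it can be read back with Ruby's standard json library. The sketch below is illustrative only: it assumes the executable prints the array itself (without the leading => used in this README), it uses a string copied from the example above rather than invoking dwcagent, and display_name is a hypothetical helper.

```ruby
require "json"

# JSON copied from the dwcagent example above. In a real script this string
# would come from the command's standard output (an assumption of this sketch).
output = '[{"title":null,"appellation":null,"given":"W.J.","particle":null,' \
         '"family":"Cody","suffix":null,"dropping_particle":null,"nick":null},' \
         '{"title":null,"appellation":null,"given":"W.E.","particle":null,' \
         '"family":"Kemp","suffix":null,"dropping_particle":null,"nick":null}]'

# Hypothetical helper: join the given and family parts, skipping nil values.
def display_name(name)
  [name["given"], name["family"]].compact.join(" ")
end

names = JSON.parse(output)                      # array of hashes with string keys
labels = names.map { |name| display_name(name) }

puts labels
# W.J. Cody
# W.E. Kemp
```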
gem install dwc_agent
dwcagent-similarity "John C." "John"
=> 1.1
dwc_agent is released under the MIT license.
Bug reports can be filed at https://github.com/bionomia/dwc_agent/issues.
Authors: David P. Shorthouse
Copyright (c) 2024