berkmancenter / namae

Namae (名前) parses personal names and splits them into their component parts.

How do I configure the parser at run time? #26

Closed by shawnpyle 6 years ago

shawnpyle commented 6 years ago

This is probably a question more than it is an issue. Once I figure this out, I can create a PR to update the documentation.

I'm unable to find any example of how to configure the Namae parser at run time. I've tried:

Namae.parse('Bob Bailey iv') #=> [#<struct Namae::Name family="iv", given="Bob Bailey", suffix=nil, ...
Namae::Parser.new.parse('Bob Bailey iv') #=> [#<struct Namae::Name family="iv", given="Bob Bailey", suffix=nil, ...
# make the suffix case insensitive
Namae.configure { |config| config[:suffix] = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,})(\.|\b)/i }
Namae.parse('Bob Bailey iv') #=> no change
Namae::Parser.new.parse('Bob Bailey iv') #=> [#<struct Namae::Name family="Bailey", given="Bob", suffix="iv", ...

Shouldn't Namae.parse be using the new configuration? It seems like Namae.options and Thread.current[:namae].options do not get updated.

However, Namae::Parser.defaults does get updated. So, assigning Thread.current[:namae] = Namae::Parser.new after the configure call then makes Namae.parse work with the new configuration. Is this assignment required?

Thanks for any help.

shawnpyle commented 6 years ago

Okay, sometimes writing it down helps me think through the problem. I missed this in the README (make sure to change the configuration before using the parser). Calling Namae.configure before the first call to Namae.parse works as expected:

Namae.configure { |config| config[:suffix] = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,})(\.|\b)/i }
Namae.parse('Bob Bailey iv') #=> [#<struct Namae::Name family="Bailey", given="Bob", suffix="iv", ...

I'll see if I can update the class so that it picks up configuration changes immediately.

inukshuk commented 6 years ago

You need to differentiate between an instance's options and the default configuration. If you change an instance's options, only that instance is affected. If you change the defaults, every newly created instance is affected, but instances that already exist are not.

The thread-local instances are really only there for convenience. If you want to use different parser instances with different options, it's best to create them yourself (you can just do Namae::Parser.new(your_options_here)). If you want to use the thread-local instances but adjust their configuration, make sure to adjust the defaults before you start using the instances. If that's not possible, create your own instances. Note that the thread-local instances are not created until you first access them.
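
To make the distinction concrete, here is a minimal, self-contained sketch. It is a toy model and not Namae's source: ToyParser, its :suffix option, and suffix? are hypothetical names. It only assumes the behaviour described above, namely that an instance copies the defaults when it is created and that the per-thread instance is created on first access.

```ruby
# Toy model of "class-level defaults vs. per-instance options".
# ToyParser is hypothetical; it is NOT Namae::Parser.
class ToyParser
  # Class-level defaults shared by all future instances.
  @defaults = { suffix: /\b(Jr|Sr|[IVX]{2,})\b/ }

  class << self
    attr_reader :defaults

    # One parser per thread, created lazily on first access.
    def instance
      Thread.current[:toy_parser] ||= new
    end
  end

  attr_reader :options

  # Each instance takes a snapshot of the defaults at creation time,
  # then applies any explicit overrides.
  def initialize(overrides = {})
    @options = self.class.defaults.merge(overrides)
  end

  def suffix?(word)
    options[:suffix].match?(word)
  end
end

early = ToyParser.instance # created with the original defaults

# Change the defaults afterwards: make the suffix pattern case insensitive.
ToyParser.defaults[:suffix] = /\b(Jr|Sr|[IVX]{2,})\b/i

puts early.suffix?('iv')                              # prints false (old snapshot)
puts ToyParser.new.suffix?('iv')                      # prints true  (new defaults)
puts ToyParser.new(suffix: /\bEsq\b/).suffix?('Esq')  # prints true  (own options)
```

In this model the first line of output is false because the thread-local instance already existed when the defaults changed, which mirrors the "no change" result in the original question. Changing the defaults first, or passing options to a new instance, avoids the problem.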

shawnpyle commented 6 years ago

This is no longer an issue.