derek73 / python-nameparser

A simple Python module for parsing human names into their individual components
http://nameparser.readthedocs.org/en/latest/

Handling " - " separators in suffix acronyms #156

Open shartzog opened 2 weeks ago

shartzog commented 2 weeks ago

I've come across some source data that uses " - " rather than ", " as a delimiter for healthcare providers with multiple credentials. Under these conditions, nameparser breaks down pretty badly. Adding "-" to CONSTANTS.suffix_acronyms corrects the bad behavior for the most part, although I'd still prefer to have the individual suffixes identified as shown in the final "MD, DO, DDS" example so that my name formatting remains consistent in both cases. Any thoughts on how this might be solved holistically?

Example:

>>> import nameparser
>>> from nameparser.config import CONSTANTS
>>> nameparser.HumanName("Steven Hardman, RN - CRNA")
<HumanName : [
        title: ''
        first: 'RN'
        middle: '-'
        last: 'Steven Hardman'
        suffix: 'CRNA'
        nickname: ''
]>
>>> nameparser.HumanName("Steven Hardman, MD - DO - DDS")
<HumanName : [
        title: 'MD'
        first: '-'
        middle: 'DO -'
        last: 'Steven Hardman'
        suffix: 'DDS'
        nickname: ''
]>
>>> _ = CONSTANTS.suffix_acronyms.add("-")
>>> nameparser.HumanName("Steven Hardman, MD - DO - DDS")
<HumanName : [
        title: ''
        first: 'Steven'
        middle: ''
        last: 'Hardman'
        suffix: 'MD - DO - DDS'
        nickname: ''
]>
>>> nameparser.HumanName("Steven Hardman, MD - DO - DDS").suffix_list
['MD - DO - DDS']
>>> nameparser.HumanName("Steven Hardman, MD, DO, DDS").suffix_list
['MD', 'DO', 'DDS']

shartzog commented 2 weeks ago

(obviously, a simple replace of " - " with ", " prior to calling HumanName in the first place will also solve my issue, so this one may be more academic than practical 😆)
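
A minimal sketch of that pre-processing workaround, for anyone who lands here with the same data. The helper name `normalize_credential_separators` is hypothetical and not part of nameparser. It assumes the input is in "Name, CREDENTIALS" form as in the examples above, and only rewrites separators after the first comma so the name portion is left alone. It would not be appropriate for "Last, First" input, where the text after the comma is a given name.

```python
import re

# Hypothetical helper, not part of nameparser.
# Matches a hyphen surrounded by whitespace on both sides, e.g. " - ".
_DASH_SEPARATOR = re.compile(r"\s+-\s+")


def normalize_credential_separators(raw: str) -> str:
    """Rewrite ' - ' as ', ' in the text after the first comma.

    Assumes "Name, CREDENTIALS" input; strings without a comma are
    returned unchanged.
    """
    name, comma, credentials = raw.partition(",")
    if not comma:
        return raw
    return name + comma + _DASH_SEPARATOR.sub(", ", credentials)


cleaned = normalize_credential_separators("Steven Hardman, MD - DO - DDS")
print(cleaned)  # Steven Hardman, MD, DO, DDS

# The cleaned string can then be passed to the parser as usual:
#   from nameparser import HumanName
#   HumanName(cleaned).suffix_list
```

The cleaned string matches the comma-delimited input from the last example in the issue, which the parser already splits into individual suffixes. Restricting the replacement to the text after the first comma is a design choice to avoid touching a spaced hyphen inside the name itself.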