titipata / affiliation_parser

Simple Python parser for MEDLINE and PubMed OA affiliation strings

odd and incorrect substitutions in clean_text() function #10

Open simonatdrg opened 6 years ago

simonatdrg commented 6 years ago

Why are these lines present?

    affil_text = re.sub('2 ', ' ', affil_text)
    affil_text = re.sub('2. ', ' ', affil_text)

They produce incorrect zip codes for an affiliation string such as 'Department of Audiology, Speech-Language Pathology & Deaf Studies, Towson University, Towson, MD 21252, USA. ', because the substitutions alter the zip code itself.
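A minimal reproduction, applying only the two substitutions quoted above to the example string (the rest of clean_text() is not shown in this thread, so it is left out here). In the second pattern the unescaped '.' matches any character, so '2, ' at the end of the zip code matches as well:

```python
import re

affil_text = ('Department of Audiology, Speech-Language Pathology & Deaf Studies, '
              'Towson University, Towson, MD 21252, USA. ')

# First substitution: the literal "2 " does not occur in this string,
# so the text is unchanged after this step.
step1 = re.sub('2 ', ' ', affil_text)

# Second substitution: "." is a regex wildcard, so "2. " also matches "2, ".
# The trailing "2," of the zip code is replaced by a single space.
step2 = re.sub('2. ', ' ', step1)

print(repr(step2))
```

The zip code loses its last digit and the comma after it, so any later zip code extraction sees a four-digit number.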

titipata commented 6 years ago

Oh! Thanks for pointing that out, @simonatdrg. I was concerned about affiliation strings that start with a number, such as '2. Department of ...'. I'll make changes on that!
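One possible direction, sketched here and not taken from the repository: anchor the pattern to the start of the string and escape the period, so only a leading enumerator like '2. ' or '2 ' is removed. The helper name strip_leading_enumerator and the one-to-two-digit limit are assumptions for illustration; an affiliation that genuinely begins with a short number would still be affected.

```python
import re

# Hypothetical helper, not part of affiliation_parser.
# Matches optional leading whitespace, one or two digits, an optional
# literal period, and at least one whitespace character, only at the
# very start of the string.
LEADING_ENUMERATOR = re.compile(r'^\s*\d{1,2}\.?\s+')


def strip_leading_enumerator(affil_text):
    """Remove a leading list marker such as '2. ' or '2 ' from an affiliation."""
    return LEADING_ENUMERATOR.sub('', affil_text)


numbered = '2. Department of Audiology, Towson University, Towson, MD 21252, USA.'
plain = 'Department of Audiology, Towson University, Towson, MD 21252, USA.'

print(strip_leading_enumerator(numbered))
print(strip_leading_enumerator(plain))
```

Because the match is anchored with '^', digits elsewhere in the string, including the zip code, are never touched.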