vladimarius / pyap

Python address detector and parser
MIT License

Make space after 'Floor N|Nth Floor' optional #27

Open thedansimonson opened 2 years ago

thedansimonson commented 2 years ago

Overview

The "floor" regular expression in all three regex data modules required a space after the floor component of an address string. This means that an address like:

pyap.parse("1234 Book St., 5th Floor, Simbagrad, NM 99999", country="US")

would fail to parse; however, the following would succeed:

pyap.parse("1234 Book St., 5th Floor , Simbagrad, NM 99999", country="US")

I suspect the trailing space was added by analogy with the occupancy regular expression, which has trailing spaces after Suite/Apartment/Room. The trailing space there is actually mandatory, since it separates the occupancy type from the number.

In the case of floor, the number is part of the floor expression itself, so a trailing space is optional and arguably superfluous. Still, a space between the floor number and a following comma may crop up from time to time, and it shouldn't prevent a string from being recognized as an address.
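
To illustrate the difference, here is a minimal sketch using simplified stand-in patterns. These are not pyap's actual floor regexes, which are more elaborate; the names `FLOOR_MANDATORY_SPACE`, `FLOOR_OPTIONAL_SPACE`, and `floor_followed_by_comma` are hypothetical and exist only for this example.

```python
import re

# Hypothetical, simplified stand-ins for the "floor" pattern.
# The only difference between them is the trailing space: required vs. optional.
FLOOR_CORE = r"(?:\d+(?:st|nd|rd|th)\ Floor|Floor\ \d+)"
FLOOR_MANDATORY_SPACE = re.compile(FLOOR_CORE + r"\ ", re.IGNORECASE)
FLOOR_OPTIONAL_SPACE = re.compile(FLOOR_CORE + r"\ ?", re.IGNORECASE)


def floor_followed_by_comma(pattern, text):
    """Return True if the floor pattern matches and a comma comes right after the match."""
    m = pattern.search(text)
    return bool(m) and text[m.end():m.end() + 1] == ","


no_space = "1234 Book St., 5th Floor, Simbagrad, NM 99999"
with_space = "1234 Book St., 5th Floor , Simbagrad, NM 99999"

# The mandatory-space pattern only handles the address with "Floor ,".
print(floor_followed_by_comma(FLOOR_MANDATORY_SPACE, no_space))    # False
print(floor_followed_by_comma(FLOOR_MANDATORY_SPACE, with_space))  # True

# The optional-space pattern handles both forms.
print(floor_followed_by_comma(FLOOR_OPTIONAL_SPACE, no_space))     # True
print(floor_followed_by_comma(FLOOR_OPTIONAL_SPACE, with_space))   # True
```

Making the space optional rather than dropping it keeps addresses with a stray space before the comma parseable, which matches the intent described above.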

Changes