openvenues / libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
MIT License

P.R. China is not parsed as a country #253

Open aiacovella opened 6 years ago

aiacovella commented 6 years ago

Given the following query string:

MAOMIAO DEPT OF PROMOTION CENTER FOR INTERNATIONAL COMMERCE & BUSINESS ROOM 305 ZODIAC EXTRAS COURT 36 BAOSHAN JIUCUN, BAOSHAN DISTRICT 201900 SHANGHAI P.R. CHINA

the following was parsed:

[
  { "label": "house", "value": "maomiao dept of promotion center for international commerce & business room 305 zodiac extras court" },
  { "label": "house_number", "value": "36" },
  { "label": "suburb", "value": "baoshan" },
  { "label": "city", "value": "jiucun" },
  { "label": "state_district", "value": "baoshan district" },
  { "label": "postcode", "value": "201900" },
  { "label": "city", "value": "shanghai" },
  { "label": "state", "value": "p.r." },
  { "label": "country", "value": "china" }
]

Note that "p.r." was parsed as a state rather than as part of the country.
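Until the parser recognizes this variant, one possible stopgap is to post-process the parsed components on the caller's side. The sketch below is only an illustration of that idea: the `merge_country_prefix` helper and the `COUNTRY_PREFIXES` set are hypothetical names, not part of libpostal, and the input is a shortened copy of the output above.

```python
# Prefixes that may get split off as a separate "state" component.
# This set is an illustrative assumption, not something libpostal defines.
COUNTRY_PREFIXES = {"p.r.", "p.r"}


def merge_country_prefix(components):
    """Fold a prefix such as "p.r." into the country component that follows it."""
    merged = []
    for comp in components:
        prev = merged[-1] if merged else None
        if (
            comp["label"] == "country"
            and prev is not None
            and prev["label"] == "state"
            and prev["value"] in COUNTRY_PREFIXES
        ):
            # Drop the stray "state" entry and prepend its text to the country.
            merged.pop()
            comp = {
                "label": "country",
                "value": prev["value"] + " " + comp["value"],
            }
        merged.append(comp)
    return merged


# Tail of the parser output reported in this issue.
parsed = [
    {"label": "postcode", "value": "201900"},
    {"label": "city", "value": "shanghai"},
    {"label": "state", "value": "p.r."},
    {"label": "country", "value": "china"},
]

for comp in merge_country_prefix(parsed):
    print(comp["label"], "=", comp["value"])
```

The merge only fires when the suspicious "state" value sits directly before a "country" component, so genuine state names are left alone.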

albarrentine commented 6 years ago

Generally, when libpostal gets a toponym wrong, the first place to look is OSM, which doesn't appear to have that name variant. Feel free to add it under alt_name:en, and it will get picked up by libpostal on the next import.
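To make the tagging suggestion concrete, here is a rough sketch of how English name variants could be gathered from an OSM element's tags. It is not libpostal's import code. The `english_name_variants` helper is hypothetical, the set of keys it reads (other than `alt_name:en`, which is named above) is an assumption, and the tag values are made up for illustration rather than copied from live OSM data. It relies on the OSM convention of separating multiple values with semicolons.

```python
# Keys to scan for English names. Only alt_name:en is mentioned in this
# thread; the others are assumed here for illustration.
ENGLISH_NAME_KEYS = ("name:en", "alt_name:en", "official_name:en")


def english_name_variants(tags):
    """Collect distinct English name variants from a dict of OSM tags."""
    variants = []
    for key in ENGLISH_NAME_KEYS:
        # OSM stores multiple values in one tag separated by semicolons.
        for value in tags.get(key, "").split(";"):
            value = value.strip()
            if value and value not in variants:
                variants.append(value)
    return variants


# Hypothetical tags for the country relation after adding the variant.
china_tags = {
    "name:en": "China",
    "alt_name:en": "P.R. China;PR China",
    "official_name:en": "People's Republic of China",
}

for name in english_name_variants(china_tags):
    print(name)
```

Without the `alt_name:en` entry, "P.R. China" never appears among the variants, which matches the diagnosis that the name was simply missing from the source data.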

aiacovella commented 6 years ago

I've added the alternate names.

albarrentine commented 6 years ago

Awesome, thanks!