openaddresses / machine

Scripts for running OpenAddresses on a complete data set and publishing the results.
http://results.openaddresses.io/
ISC License

Street names capitalized incorrectly #663

Closed: vesameskanen closed this issue 7 years ago

vesameskanen commented 7 years ago

This is really a minor issue, but we noticed that all components of street names get capitalized during dataset processing. For example:

Bertel Jungin tie 1, Helsinki -> Bertel Jungin Tie 1, Helsinki.

In the example above, 'tie' means 'road'. The local convention is to capitalize only proper nouns; common nouns are written with a lowercase initial.
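
To make the difference concrete, here is a minimal Python sketch. It assumes the capitalization comes from something like title-casing every word; we have not confirmed which processing step actually does it. The `STREET_TYPES` set and both function names are hypothetical, made up for illustration only.

```python
# Hypothetical, non-exhaustive set of Finnish street-type common nouns.
STREET_TYPES = {"tie", "katu", "kuja", "polku"}


def naive_capitalize(street: str) -> str:
    """Capitalize every word, which matches the behaviour we are seeing."""
    return street.title()


def capitalize_keeping_types(street: str) -> str:
    """Capitalize each word except known street-type common nouns."""
    words = street.split()
    return " ".join(
        w.lower() if w.lower() in STREET_TYPES else w.capitalize()
        for w in words
    )


print(naive_capitalize("Bertel Jungin tie"))          # Bertel Jungin Tie
print(capitalize_keeping_types("BERTEL JUNGIN TIE"))  # Bertel Jungin tie
```

If the source data already has the correct case, the simplest option is probably to leave the street name untouched rather than re-capitalize it.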

migurski commented 7 years ago

Thanks @vesameskanen! I just grabbed the newest copy of sources/fi/uusimaa-fi and found these rows:

LON,LAT,NUMBER,STREET,UNIT,CITY,DISTRICT,REGION,POSTCODE,ID,HASH
25.0004067,60.1841129,5,Bertel Jungin tie,,,,,00570,100182933H-1,d290b3253538469a
25.0026579,60.1844612,2,Bertel Jungin tie,,,,,00570,100182940R-1,954774f2536a87f9
25.0006726,60.1838924,6,Bertel Jungin tie,,,,,00570,1001829867-1,44a900a60638452c

It looks like we’re not capitalizing “tie”. Where are you seeing incorrectly capitalized data?
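
For anyone who wants to repeat the check on a larger extract, a rough sketch follows. It uses the three rows above as sample data; the function name is hypothetical and the heuristic (flag multi-word street names whose last word starts with a capital letter) is only an assumption about what "incorrectly capitalized" would look like, not something the machine code does.

```python
import csv
import io

# The three sample rows quoted above.
SAMPLE = """LON,LAT,NUMBER,STREET,UNIT,CITY,DISTRICT,REGION,POSTCODE,ID,HASH
25.0004067,60.1841129,5,Bertel Jungin tie,,,,,00570,100182933H-1,d290b3253538469a
25.0026579,60.1844612,2,Bertel Jungin tie,,,,,00570,100182940R-1,954774f2536a87f9
25.0006726,60.1838924,6,Bertel Jungin tie,,,,,00570,1001829867-1,44a900a60638452c
"""


def streets_with_capitalized_last_word(csv_text: str) -> list[str]:
    """Return multi-word STREET values whose final word starts uppercase."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        words = row["STREET"].split()
        if len(words) > 1 and words[-1][0].isupper():
            flagged.append(row["STREET"])
    return flagged


print(streets_with_capitalized_last_word(SAMPLE))  # []
```

An empty list here is consistent with the rows above: “tie” is lowercase in the processed output.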

vesameskanen commented 7 years ago

Hi, thanks for the quick help, and sorry for submitting the issue in the wrong place. The error seems to be in the Pelias OpenAddresses importer, which we use in our geocoding solution.

migurski commented 7 years ago

That makes sense! http://pelias.io is a good place to talk with the Pelias team.