stuartemiddleton / geoparsepy

geoparsepy is a Python geoparsing library that extracts and disambiguates locations from text. It uses a local OpenStreetMap database, which allows very high and unlimited geoparsing throughput, unlike approaches that use a third-party geocoding service (e.g. Google Geocoding API). This repository holds Python examples for using the PyPI library.

Broken encoding for cyrillic rows #10

Closed RyabykinIlya closed 4 months ago

RyabykinIlya commented 8 months ago

You write: "Download pre-processed UTF-8 encoded SQL table dumps from OSM image dated dec 2019". But the database that generated the dumps used the English_United States.1252 encoding. Because of that, we have invalid data in the SQL files :(

[Image: Screen Shot 2023-12-30 at 1 25 51 PM]
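To illustrate why a 1252-encoded database is a problem for Cyrillic names, here is a minimal Python sketch. It only shows that Windows-1252 has no Cyrillic characters, so any conversion through it is lossy. The helper name `survives_cp1252` and the sample place names are made up for illustration, and this is not a claim about how the dumps were actually produced.

```python
def survives_cp1252(text: str) -> bool:
    """Return True if every character of `text` exists in Windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample names: Latin, Latin with an umlaut, and Cyrillic.
samples = ["London", "München", "Москва"]

for name in samples:
    # errors="replace" substitutes "?" for anything cp1252 cannot represent.
    lossy = name.encode("cp1252", errors="replace").decode("cp1252")
    print(f"{name!r}: representable={survives_cp1252(name)}, after round trip={lossy!r}")
```

Western European names pass through unchanged, while the Cyrillic name comes back as question marks. Once that has happened the original text cannot be recovered from the dump.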

There are some rows without localization:

[Image: Screen Shot 2023-12-30 at 1 45 32 PM]

Do I need to run osm2pgsql myself, or is there some other solution?

stuartemiddleton commented 4 months ago

Yes, the lib expects the database to be UTF-8 encoded SQL. osm2pgsql is needed to create your own SQL dumps. See the readme section: Databases needed for preprocessing focus areas (optional)
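Since the library expects UTF-8, one quick check before importing a dump is to scan it for lines that do not decode as UTF-8. The sketch below is only an assumption about how such a check could look. The function name `find_non_utf8_lines` and the file name in the usage comment are hypothetical, and it is not part of geoparsepy.

```python
def find_non_utf8_lines(path, limit=20):
    """Return up to `limit` 1-based line numbers that are not valid UTF-8."""
    bad = []
    # Read raw bytes so that decoding errors are ours to catch per line.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(number)
                if len(bad) >= limit:
                    break
    return bad


# Usage (hypothetical file name):
# print(find_non_utf8_lines("my_table_dump.sql"))
```

This only catches byte sequences that are invalid UTF-8. Rows where characters were already replaced (for example by question marks) are valid UTF-8 and would not be reported, so a clean result does not prove the data is intact.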