rinigus / geocoder-nlp

Geocoder library based on libpostal normalization of libosmscout generated database
MIT License

geocoder: clean database by removing multiple objects with the same name #8

Closed rinigus closed 7 years ago

rinigus commented 7 years ago

Check that we don't have the same way shown multiple times for an admin region. Select one and remove the others.
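A minimal sketch of that cleanup, using Python's standard `sqlite3` module. The table `ways` and its columns `id`, `name`, and `region_id` are hypothetical names chosen for illustration and are not the actual geocoder-nlp schema. The idea is to keep the row with the lowest `id` for each (name, region) pair and delete the rest.

```python
import sqlite3


def remove_duplicate_ways(conn):
    """Keep one row per (name, region_id) pair and delete the others.

    Assumes a hypothetical table ways(id, name, region_id).
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM ways
        WHERE id NOT IN (
            SELECT MIN(id) FROM ways GROUP BY name, region_id
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Small in-memory example with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ways (id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER)"
)
conn.executemany(
    "INSERT INTO ways (name, region_id) VALUES (?, ?)",
    [
        ("Main Street", 1),
        ("Main Street", 1),  # duplicate within region 1
        ("Main Street", 2),  # same name, different region: kept
        ("Oak Lane", 1),
        ("Main Street", 1),  # another duplicate within region 1
    ],
)

removed = remove_duplicate_ways(conn)
remaining = conn.execute(
    "SELECT name, region_id FROM ways ORDER BY id"
).fetchall()
print(removed, remaining)
```

Keeping the lowest `id` is an arbitrary choice; any rule that selects exactly one representative per group would serve the same purpose.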

rinigus commented 7 years ago

Duplicated ways seem to be rather minor, maybe a few percent of the data. Many of the normalized names are the same. However, reducing them to a de-duplicated list with an additional many-to-many relationship table leads to a similarly sized SQLite database as soon as indexes are introduced.
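For reference, the de-duplicated layout described here could look roughly like the sketch below, again with Python's `sqlite3`. All table and column names (`flat_names`, `names`, `object_names`) are assumptions made for this example, not the project's real schema. Each distinct normalized name is stored once, and a link table maps objects to names.

```python
import sqlite3


def normalize_names(conn):
    """Split a hypothetical flat_names(object_id, name) table into a
    unique names table plus a many-to-many link table with indexes."""
    conn.executescript(
        """
        CREATE TABLE names (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE object_names (
            object_id INTEGER NOT NULL,
            name_id   INTEGER NOT NULL
        );
        INSERT INTO names (name)
            SELECT DISTINCT name FROM flat_names;
        INSERT INTO object_names (object_id, name_id)
            SELECT DISTINCT f.object_id, n.id
            FROM flat_names f JOIN names n ON n.name = f.name;
        CREATE INDEX idx_object_names_name ON object_names (name_id);
        CREATE INDEX idx_object_names_object ON object_names (object_id);
        """
    )


def objects_for_name(conn, name):
    """Return the sorted ids of all objects linked to a normalized name."""
    rows = conn.execute(
        """
        SELECT o.object_id
        FROM names n JOIN object_names o ON o.name_id = n.id
        WHERE n.name = ?
        ORDER BY o.object_id
        """,
        (name,),
    )
    return [row[0] for row in rows]


# Small in-memory example with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flat_names (object_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO flat_names (object_id, name) VALUES (?, ?)",
    [(1, "main street"), (2, "main street"), (3, "oak lane"), (3, "main street")],
)

normalize_names(conn)
name_count = conn.execute("SELECT COUNT(*) FROM names").fetchone()[0]
main_street_objects = objects_for_name(conn, "main street")
print(name_count, main_street_objects)
```

In this layout every object-to-name link still costs a row in `object_names` plus an entry in each index, and the `UNIQUE` constraint on `names.name` adds an index of its own. That overhead is a plausible reason the size gain disappears, though it would need to be measured on real data, for example by comparing `PRAGMA page_count` between the two layouts.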