osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

Improve handling of Russian house numbers #3171

Open n-timofeev opened 1 year ago

n-timofeev commented 1 year ago
The Russian community has an agreement on the format of house numbers, but it is slightly different from how addresses are usually written. Example: "5 литера Б" is "5 литБ" in OSM. This can be fixed by client-side preprocessing, but I think it should be done by the search server. Most common values (dots are optional):

| Full | Abbreviations | OSM | Example |
| --- | --- | --- | --- |
| дом | д. | - | 42 |
| корпус | корп. / кор. / к. | к | 42 к1 |
| литера | литер / лит. | лит | 42 литА |
| строение | стр. / с. | с | 42 с1 |
| сооружение | соор. | соор | 42 соор1 |
| флигель | флиг. / фл. | фл | 42 фл1 |
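
For illustration, the client-side preprocessing mentioned above could look roughly like the sketch below. It is not Nominatim code: the function name `normalize_housenumber` and the matching rules are assumptions, and it expects to receive only the house-number part of an address, not a full query. A designator is rewritten only when it is followed by a digit or by a single letter, so that ordinary words starting with the same letters are left alone.

```python
import re

# Full words and abbreviations from the table above, mapped to the OSM form.
# An empty string means the designator is dropped entirely (дом).
DESIGNATORS = {
    "дом": "", "д": "",
    "корпус": "к", "корп": "к", "кор": "к", "к": "к",
    "литера": "лит", "литер": "лит", "лит": "лит",
    "строение": "с", "стр": "с", "с": "с",
    "сооружение": "соор", "соор": "соор",
    "флигель": "фл", "флиг": "фл", "фл": "фл",
}

# Longest alternatives first so that "корпус" wins over "кор" and "к".
_ALTERNATION = "|".join(sorted(map(re.escape, DESIGNATORS), key=len, reverse=True))

_PATTERN = re.compile(
    r"(?<![^\W\d_])"                # not preceded by a letter
    rf"({_ALTERNATION})"            # the designator itself
    r"\.?\s*"                       # optional dot and whitespace
    r"(?=\d|[^\W\d_](?![^\W\d_]))",  # value: a digit, or one single letter
    re.IGNORECASE,
)


def normalize_housenumber(housenumber: str) -> str:
    """Rewrite designators to the OSM form (hypothetical helper)."""
    def replace(match: re.Match) -> str:
        # Leading space separates the part from the number before it;
        # the value that follows is attached without a space.
        return " " + DESIGNATORS[match.group(1).lower()]

    rewritten = _PATTERN.sub(replace, housenumber)
    return " ".join(rewritten.split())  # collapse repeated whitespace


print(normalize_housenumber("5 литера Б"))
print(normalize_housenumber("42 корп. 1 стр. 2"))
```

Input such as "42 литера" with no value after the designator is ambiguous and is not handled specially by this sketch.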

In addition, the search will fail if the input string contains additional data such as a room number. Perhaps we should create a list of tokens that can be deleted?
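
A list of deletable tokens could be applied along the lines of the sketch below. The token list (`DELETABLE`) and the function name `strip_extra_tokens` are assumptions made for illustration; the issue does not say which tokens should be on the list. A token is removed only together with a numeric value that follows it, which keeps ordinary words that merely start with the same letters intact.

```python
import re

# Hypothetical list of designators for data below house level
# (apartment, office, premises, room). Not taken from the issue.
DELETABLE = ["квартира", "кв", "офис", "оф", "помещение", "пом", "комната", "комн"]

# Longest alternatives first so that "офис" wins over "оф".
_ALT = "|".join(sorted(map(re.escape, DELETABLE), key=len, reverse=True))

_DELETE_RE = re.compile(
    r"(?<![^\W\d_])"         # not preceded by a letter
    rf"(?:{_ALT})"           # deletable designator
    r"\.?\s*"                # optional dot and whitespace
    r"\d+[^\W\d_]?"          # value: digits plus an optional letter
    r"(?![^\W_])",           # value must end here
    re.IGNORECASE,
)


def strip_extra_tokens(query: str) -> str:
    """Remove deletable tokens and tidy up leftover separators."""
    cleaned = _DELETE_RE.sub(" ", query)
    cleaned = re.sub(r"\s+", " ", cleaned)            # collapse whitespace
    cleaned = re.sub(r"\s*,(?:\s*,)*", ",", cleaned)  # merge runs of commas
    return cleaned.strip(" ,")


print(strip_extra_tokens("Невский проспект 42, кв. 15"))
```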

lonvia commented 11 months ago

Also блок -> бл (see https://github.com/osm-search/Nominatim/issues/3049#issuecomment-1763498869)