osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

First result doesn't match string name #1623

Closed vinxavier closed 4 years ago

vinxavier commented 4 years ago

I am searching for a street called "Francisco Lisboa". The search returns several results, but the first one doesn't match my string; only the second does.

It might be nice to compute a score for how similar the strings are and show the highest-scoring results first.

image

As the image shows, the first result is the street "Antonio Francisco Lisboa", but the street I want is "Francisco Lisboa".

lonvia commented 4 years ago

Actually, neither of these results matches your request completely because both have a 'Rua' in front. Nominatim doesn't really know that 'Rua' is unimportant for you, while 'Antonio' isn't. This kind of distinction requires quite a bit of real-world knowledge and knowledge about how addresses and street names are formulated in the different parts of the world.

vinxavier commented 4 years ago

Kind of, but if you look at the strings, the street names differ quite a bit. There are algorithms that can calculate how similar two strings are, like the Jaccard index. "Antonio" is a huge difference in the query result. "rua antonio ..." is clearly farther from my search than just "rua ..." at the front.

https://github.com/ecto/jaccard/blob/master/jaccard.js
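
For illustration, here is a minimal Python sketch of the kind of scoring suggested above. It assumes a Jaccard index computed over lowercase, whitespace-separated tokens; the `jaccard` helper and the candidate list are hypothetical and are not part of Nominatim or of the linked library.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard index over lowercase, whitespace-separated tokens (an assumption)."""
    set_a = set(a.lower().split())
    set_b = set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical inputs taken from the names mentioned in this issue.
query = "Francisco Lisboa"
candidates = ["Rua Antonio Francisco Lisboa", "Rua Francisco Lisboa"]

# Sort so that the most similar name comes first.
ranked = sorted(candidates, key=lambda name: jaccard(query, name), reverse=True)

for name in ranked:
    print(f"{jaccard(query, name):.2f}  {name}")
```

Under these assumptions, "Rua Francisco Lisboa" shares 2 of 3 distinct tokens with the query and "Rua Antonio Francisco Lisboa" shares 2 of 4, so the shorter name ranks first.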

mtmail commented 4 years ago

Autovía del Nordeste differs from AV Nordeste by eight characters, yet it is just a different spelling of the same street. Carrer d'Alfons XII differs from Carrer d'Alfons XIII by one character, yet they are different streets (both exist in Barcelona).

Applying a generic algorithm like Levenshtein distance, LCS, or Jaccard to compare user-supplied input (the query string) against results (OSM data) isn't enough.
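
The point can be reproduced with a short Python sketch. It assumes a plain, case-sensitive Levenshtein distance; the `levenshtein` helper is written here for illustration and is not how Nominatim ranks results.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Same street, different spelling.
same_street = levenshtein("Autovía del Nordeste", "AV Nordeste")
# Different streets, nearly identical names.
different_streets = levenshtein("Carrer d'Alfons XII", "Carrer d'Alfons XIII")

print(same_street, different_streets)
```

The distance between the two spellings of the same street comes out larger than the distance between the two different streets, so a ranking based on edit distance alone would get both cases wrong.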

lonvia commented 4 years ago

Closing in favour of the longer discussion in #679.