pelias / parser

natural language classification engine for geocoding
https://parser.demo.geocode.earth
MIT License

Missing address results in autocomplete api #155

Closed pcampospaulo closed 2 years ago

pcampospaulo commented 2 years ago

Describe the bug

The autocomplete API is not returning the expected address result.

Steps to Reproduce

Autocomplete request:

https://pelias.github.io/compare/#/v1/autocomplete?layers=address%2Cstreet&focus.point.lat=41.201522&focus.point.lon=-8.6124324&text=rua+godinho+de+faria+1200

Returns:

0) Rua Godinho de Faria, São Mamede de Infesta, PO, Portugal
1) Rua Godinho de Faria 255, São Mamede de Infesta, PO, Portugal
2) Rua Godinho de Faria (Antiga EN 14) 451, São Mamede de Infesta, PO, Portugal

But search returns the correct result (https://pelias.github.io/compare/#/v1/search?layers=address%2Cstreet&focus.point.lat=41.201522&focus.point.lon=-8.6124324&text=rua+godinho+de+faria+1200)

0) Rua Godinho de Faria 1200, São Mamede de Infesta, PO, Portugal
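The two requests above differ only in the endpoint path. As a rough sketch, both query strings can be built from the same parameters so the responses can be compared side by side. The host `https://pelias.example.com` is a placeholder and not a real server; the parameter names and values are copied from the links in this report.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): substitute your own Pelias API server.
BASE_URL = "https://pelias.example.com"

# Query parameters copied from the links in this report.
PARAMS = {
    "layers": "address,street",
    "focus.point.lat": "41.201522",
    "focus.point.lon": "-8.6124324",
    "text": "rua godinho de faria 1200",
}


def build_url(endpoint: str, params: dict = PARAMS) -> str:
    """Return the request URL for an endpoint such as 'autocomplete' or 'search'."""
    return f"{BASE_URL}/v1/{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    for endpoint in ("autocomplete", "search"):
        print(build_url(endpoint))
```

Fetching both URLs and diffing the first few results is usually enough to confirm whether the two endpoints disagree for a given input.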

Expected behavior

Using the same street with a different house number works as expected: https://pelias.github.io/compare/#/v1/autocomplete?layers=address%2Cstreet&focus.point.lat=41.201522&focus.point.lon=-8.6124324&text=rua+godinho+de+faria+255

Result: 0) Rua Godinho de Faria 255, São Mamede de Infesta, PO, Portugal

Additional information

I also noticed this behaviour with addresses from my CSV imports: some addresses work very well and others do not work at all.

missinglink commented 2 years ago

Thanks for the bug report. I've updated your links to point to the compare app and removed your credentials.

It looks like the issue is with the difference in solutions produced by the different parsers for search and autocomplete.

Screenshot 2021-11-12 at 16 41 08
Screenshot 2021-11-12 at 16 41 18

missinglink commented 2 years ago

Screenshot 2021-11-12 at 16 44 49

missinglink commented 2 years ago

From my experience it's very uncommon for users to search for only a street with a postcode.

I think a nice solution here would be to de-prioritise {street, postcode} and prioritise {street, housenumber}. A {street, housenumber} solution is much more common and makes more sense in the absence of additional information.

This might require a new 'solver' similar to HouseNumberPositionPenalty to deal with re-scoring the 1.0 scored postcode in these situations.
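To make the re-scoring idea concrete, here is a minimal sketch in Python. It is not the pelias/parser API (which is written in JavaScript): the `Solution` class, the penalty value, and every score except the 1.0 postcode score mentioned above are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    """A candidate parse: maps each label to (matched text, classifier score)."""
    labels: dict
    score: float = 0.0


# Hypothetical penalty applied when a parse contains only street + postcode.
STREET_POSTCODE_PENALTY = 0.2


def base_score(solution: Solution) -> float:
    """Average of the per-label classifier scores."""
    scores = [score for _, score in solution.labels.values()]
    return sum(scores) / len(scores)


def rescore(solutions: list) -> list:
    """Penalise {street, postcode} parses, then sort best-first."""
    for s in solutions:
        s.score = base_score(s)
        if set(s.labels) == {"street", "postcode"}:
            s.score -= STREET_POSTCODE_PENALTY
    return sorted(solutions, key=lambda s: s.score, reverse=True)


# Illustrative scores only; the 1.0 postcode score is the one from the discussion.
candidates = [
    Solution({"street": ("rua godinho de faria", 0.8), "postcode": ("1200", 1.0)}),
    Solution({"street": ("rua godinho de faria", 0.8), "housenumber": ("1200", 0.9)}),
]

best = rescore(candidates)[0]
print(sorted(best.labels))  # ['housenumber', 'street']
```

The penalty only fires when the parse contains nothing but a street and a postcode, so inputs that also carry a locality or house number would keep their original ranking.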

pcampospaulo commented 2 years ago

I think there may be another problem. Using the {housenumber, street} order, the results are the same, even though 1200 is now parsed as a housenumber, as it is in search:

https://pelias.github.io/compare/#/v1/autocomplete?layers=address%2Cstreet&focus.point.lat=41.201522&focus.point.lon=-8.6124324&text=1200+rua+godinho+de+faria

missinglink commented 2 years ago

Thanks for the bug report @pcampospaulo. The fix has been merged, but it will also require merging https://github.com/pelias/api/pull/1581 before it is available.

missinglink commented 2 years ago

Unfortunately, the new parser commit alone will not fix this issue in pelias/api.

The house number 1200 is not available on Rua Godinho de Faria from any of our data providers. The /v1/search endpoint uses the interpolation index to estimate the missing position, but this functionality is not currently enabled for /v1/autocomplete.
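For readers unfamiliar with address interpolation, the idea can be sketched as a straight-line estimate between two known house numbers on the same street. This is a simplification and an assumption on my part: the real interpolation service is more involved, and the house numbers and coordinates in the usage note are made up for illustration.

```python
def interpolate(number, known):
    """Estimate (lat, lon) for `number` from known (housenumber, lat, lon) points.

    Returns None when the number lies outside the range of known house numbers,
    since this sketch does not extrapolate.
    """
    points = sorted(known)
    for (n0, lat0, lon0), (n1, lat1, lon1) in zip(points, points[1:]):
        if n0 <= number <= n1:
            if n1 == n0:
                # Duplicate house number: nothing to interpolate between.
                return (lat0, lon0)
            # Fraction of the way from the lower to the upper known number.
            t = (number - n0) / (n1 - n0)
            return (lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0))
    return None
```

For example, with hypothetical points `[(100, 41.2000, -8.6100), (200, 41.2010, -8.6120)]`, calling `interpolate(150, ...)` returns a position halfway between the two, while `interpolate(1200, ...)` returns `None` because the number is out of range.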

In order to resolve the issue you can either add the missing house number to OpenStreetMap (which can be time consuming when adding them one-by-one) or find a new official source of addresses and submit it to the OpenAddresses project.

pcampospaulo commented 2 years ago

Thanks for the fast response and the parser fix. The house number 1200 on Rua Godinho de Faria is already available in OpenAddresses, with the street name "R Godinho De Faria".

To fix issues like this I tried to import CSV data, but autocomplete was not working:

id,source,layer,name,housenumber,street,postcode,lat,lon
1,test,address,"Rua Conselheiro Costa Aroso, 539","539","Rua Conselheiro Costa Aroso","4470-590",41.242271,-8.638140
2,test,address,"Rua do Jardim, 8","8","Rua do Jardim","4405-827",41.110641,-8.616672
3,test,address,"Rua da Lionesa, 31","31","Rua da Lionesa","",41.210575,-8.624525
4,test,address,"Rua Godinho de Faria, 1200","1200","Rua Godinho de Faria","",41.201519,-8.612433

With this example, autocomplete works for "Rua Conselheiro Costa Aroso 539" but not for the other addresses. Looking at the responses, all of the input texts are parsed correctly. I don't know why this happens, or whether I'm doing something wrong.

missinglink commented 2 years ago

Agh, I see. The contracted form R instead of Rua has been working on the dev server since the latest code change:

Screenshot 2021-11-19 at 14 57 39

The reason this is happening is that we don't have a synonym between the expanded and contracted forms of that street prefix (i.e. R = Rua), as we do for some other street prefixes, for instance the Spanish-language ones.

The reason for this is that single-letter synonyms are a huge performance killer. If someone in the USA typed 'R', Pelias would match on every Rua street around the world, which would be very slow.
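A toy model of the problem, assuming a tiny in-memory street list rather than the real index: once R is stored as a synonym of Rua, a query for the single token "r" matches every Rua street. The street list and helper names are hypothetical and only illustrate the fan-out.

```python
# Hypothetical street index (a real index would hold millions of streets).
STREETS = [
    "Rua Godinho de Faria",
    "Rua do Jardim",
    "Rua da Lionesa",
    "Rose Street",
    "Main Street",
]

# The contraction under discussion, mapping contracted form to expanded form.
SYNONYMS = {"r": "rua"}


def index_tokens(name, synonyms):
    """Tokens stored for a street name, including any synonym contractions."""
    out = set()
    for tok in name.lower().split():
        out.add(tok)
        # Store the contracted form alongside the expanded one.
        out.update(short for short, full in synonyms.items() if full == tok)
    return out


def match(query_token, synonyms):
    """Streets whose stored tokens contain the query token exactly."""
    return [s for s in STREETS if query_token in index_tokens(s, synonyms)]


print(len(match("r", {})))        # 0
print(len(match("r", SYNONYMS)))  # 3
```

With the synonym enabled, the number of matches for "r" grows with the number of Rua streets in the index, which is the cost described above.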

Most authoritative address providers give the street suffix in its expanded form. I wonder whether we can update the OpenAddresses import script to use a different column, or whether the authority only provides contracted forms of the word?