pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

Results not ordered by confidence / distance in reverseGeocode #1500

Open mihaicosareanu-bolt opened 3 years ago

mihaicosareanu-bolt commented 3 years ago

Describe the bug

I have found a case where the first reverseGeocode result in the list is farther away and has a lower confidence value than the second one.

Steps to Reproduce

https://pelias.github.io/compare/#/v1/reverse?boundary.circle.radius=0.1&size=5&lang=en&point.lat=58.380367&point.lon=26.706314

Expected behavior

I would expect the second result to come before the first one.

Pastebin/Screenshots

Result:

{
    "geocoding": {
        "version": "0.2",
        "attribution": "https://geocode.earth/guidelines",
        "query": {
            "size": 5,
            "private": false,
            "point.lat": 58.380367,
            "point.lon": 26.706314,
            "boundary.circle.radius": 0.1,
            "boundary.circle.lat": 58.380367,
            "boundary.circle.lon": 26.706314,
            "lang": {
                "name": "English",
                "iso6391": "en",
                "iso6393": "eng",
                "via": "querystring",
                "defaulted": false
            },
            "querySize": 10
        },
        "engine": {
            "name": "Pelias",
            "author": "Mapzen",
            "version": "1.0"
        },
        "timestamp": 1603884225559
    },
    "type": "FeatureCollection",
    "features": [{
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    26.706578,
                    58.380287
                ]
            },
            "properties": {
                "id": "way/30223056",
                "gid": "openstreetmap:address:way/30223056",
                "layer": "address",
                "source": "openstreetmap",
                "source_id": "way/30223056",
                "name": "3 Taara pst",
                "housenumber": "3",
                "street": "Taara pst",
                "postalcode": "51005",
                "confidence": 0.8,
                "distance": 0.018,
                "accuracy": "point",
                "country": "Estonia",
                "country_gid": "whosonfirst:country:85633135",
                "country_a": "EST",
                "region": "Tartu",
                "region_gid": "whosonfirst:region:85682995",
                "region_a": "TA",
                "county": "Tartu",
                "county_gid": "whosonfirst:county:1713305675",
                "localadmin": "Tartu",
                "localadmin_gid": "whosonfirst:localadmin:1713314027",
                "locality": "Tartu",
                "locality_gid": "whosonfirst:locality:101748151",
                "continent": "Europe",
                "continent_gid": "whosonfirst:continent:102191581",
                "label": "3 Taara pst, Tartu, Estonia",
                "index": 0
            }
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    26.706223,
                    58.380417
                ]
            },
            "properties": {
                "id": "ee/countrywide:8e62ef656c312bf8",
                "gid": "openaddresses:address:ee/countrywide:8e62ef656c312bf8",
                "layer": "address",
                "source": "openaddresses",
                "source_id": "ee/countrywide:8e62ef656c312bf8",
                "name": "5 Taara Puiestee",
                "housenumber": "5",
                "street": "Taara Puiestee",
                "confidence": 0.9,
                "distance": 0.008,
                "accuracy": "point",
                "country": "Estonia",
                "country_gid": "whosonfirst:country:85633135",
                "country_a": "EST",
                "region": "Tartu",
                "region_gid": "whosonfirst:region:85682995",
                "region_a": "TA",
                "county": "Tartu",
                "county_gid": "whosonfirst:county:1713305675",
                "localadmin": "Tartu",
                "localadmin_gid": "whosonfirst:localadmin:1713314027",
                "locality": "Tartu",
                "locality_gid": "whosonfirst:locality:101748151",
                "continent": "Europe",
                "continent_gid": "whosonfirst:continent:102191581",
                "label": "5 Taara Puiestee, Tartu, Estonia",
                "index": 1
            }
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    26.706215,
                    58.380424
                ]
            },
            "properties": {
                "id": "way/30231577",
                "gid": "openstreetmap:address:way/30231577",
                "layer": "address",
                "source": "openstreetmap",
                "source_id": "way/30231577",
                "name": "5 Taara pst",
                "housenumber": "5",
                "street": "Taara pst",
                "postalcode": "51005",
                "confidence": 0.9,
                "distance": 0.009,
                "accuracy": "point",
                "country": "Estonia",
                "country_gid": "whosonfirst:country:85633135",
                "country_a": "EST",
                "region": "Tartu",
                "region_gid": "whosonfirst:region:85682995",
                "region_a": "TA",
                "county": "Tartu",
                "county_gid": "whosonfirst:county:1713305675",
                "localadmin": "Tartu",
                "localadmin_gid": "whosonfirst:localadmin:1713314027",
                "locality": "Tartu",
                "locality_gid": "whosonfirst:locality:101748151",
                "continent": "Europe",
                "continent_gid": "whosonfirst:continent:102191581",
                "label": "5 Taara pst, Tartu, Estonia",
                "index": 2
            }
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    26.706606,
                    58.380273
                ]
            },
            "properties": {
                "id": "ee/countrywide:2c8f15eedfb55b02",
                "gid": "openaddresses:address:ee/countrywide:2c8f15eedfb55b02",
                "layer": "address",
                "source": "openaddresses",
                "source_id": "ee/countrywide:2c8f15eedfb55b02",
                "name": "3 Taara Puiestee",
                "housenumber": "3",
                "street": "Taara Puiestee",
                "confidence": 0.8,
                "distance": 0.02,
                "accuracy": "point",
                "country": "Estonia",
                "country_gid": "whosonfirst:country:85633135",
                "country_a": "EST",
                "region": "Tartu",
                "region_gid": "whosonfirst:region:85682995",
                "region_a": "TA",
                "county": "Tartu",
                "county_gid": "whosonfirst:county:1713305675",
                "localadmin": "Tartu",
                "localadmin_gid": "whosonfirst:localadmin:1713314027",
                "locality": "Tartu",
                "locality_gid": "whosonfirst:locality:101748151",
                "continent": "Europe",
                "continent_gid": "whosonfirst:continent:102191581",
                "label": "3 Taara Puiestee, Tartu, Estonia",
                "index": 3
            }
        }
    ],
    "bbox": [
        26.706215,
        58.380273,
        26.706606,
        58.380424
    ]
}

orangejulius commented 3 years ago

Hi @mihaicosareanu-bolt,

Thanks for the interesting report. According to the Elasticsearch response debug output, it looks like the distance values calculated by Elasticsearch and in the API are different. The results are sorted correctly based on the distances calculated by Elasticsearch (which is where all the sorting is done).
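To make the mismatch concrete, here is a minimal sketch that recomputes the distance from the query point to each of the four returned features and then sorts by it. It assumes a plain haversine formula on a sphere of radius 6371 km, which is an illustrative choice and not necessarily what either the API or Elasticsearch does. The coordinates and names are copied from the response in the report; `haversine_km` is a hypothetical helper.

```python
import math

# Mean spherical Earth radius. This is an assumption for illustration,
# not a constant taken from Pelias or Elasticsearch.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Query point and (name, lon, lat) of each feature, in the order returned.
QUERY_LAT, QUERY_LON = 58.380367, 26.706314
FEATURES = [
    ("3 Taara pst", 26.706578, 58.380287),
    ("5 Taara Puiestee", 26.706223, 58.380417),
    ("5 Taara pst", 26.706215, 58.380424),
    ("3 Taara Puiestee", 26.706606, 58.380273),
]

# Distances rounded to three decimals, like the "distance" property in the response.
distances = [
    round(haversine_km(QUERY_LAT, QUERY_LON, lat, lon), 3)
    for _, lon, lat in FEATURES
]

# Indices of the returned features, reordered by recomputed distance.
by_distance = sorted(range(len(FEATURES)), key=lambda i: distances[i])

for i in by_distance:
    print(f"returned index {i}: {FEATURES[i][0]} at {distances[i]} km")
```

Under these assumptions the recomputed values match the `distance` properties in the response, and sorting by them puts the feature returned at index 0 in third place.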

I believe we calculate the distance separately in the API because Elasticsearch uses a less precise calculation of distance, but we probably haven't changed this code since the Elasticsearch 2.x or even 1.x days, so it's possible a different approach makes more sense now.
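One way a distance calculation can be less precise is by treating a small patch of the Earth as flat instead of using a great-circle formula. The sketch below compares the two for the first two results in the report. Which formula Elasticsearch actually applies to the Pelias reverse query is not established in this thread, so both functions, their names, and the 6371 km radius are assumptions for illustration only.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean spherical radius, in metres


def arc_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def plane_m(lat1, lon1, lat2, lon2):
    """Flat-plane (equirectangular) approximation in metres."""
    # Scale the longitude difference by the cosine of the mid latitude.
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


# (lat, lon) pairs copied from the report.
query = (58.380367, 26.706314)
first_result = (58.380287, 26.706578)   # "3 Taara pst", returned at index 0
second_result = (58.380417, 26.706223)  # "5 Taara Puiestee", returned at index 1

for label, point in (("first", first_result), ("second", second_result)):
    arc = arc_m(*query, *point)
    plane = plane_m(*query, *point)
    print(f"{label}: arc={arc:.3f} m, plane={plane:.3f} m, diff={abs(arc - plane):.6f} m")
```

Under these assumptions the two formulas agree to within a centimetre for the points in the report, and both place the second result closer than the first.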

We currently use the geolib getDistance method. I notice there's also a getPreciseDistance method, and both methods take an accuracy parameter that we don't currently set.

It looks like the difference in the distance values is only a couple of meters, so this probably isn't super high priority, but it would definitely be nice if the reverse endpoint returned results in true distance order. If you happen to investigate whether we could use the geolib library better, or whether Elasticsearch distance values are now higher quality, let us know! Changes here should be relatively easy to test and merge.
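In the meantime, a caller that needs strict distance order could re-sort the response on the client. This is a minimal sketch, not part of the Pelias API: `sort_features_by_distance` is a hypothetical helper, and the sample response is trimmed to the `name` and `distance` properties from the report.

```python
def sort_features_by_distance(feature_collection):
    """Return a shallow copy of a GeoJSON FeatureCollection with its
    features ordered by the API-reported properties.distance."""
    features = sorted(
        feature_collection["features"],
        # Features without a distance sort last. sorted() is stable,
        # so ties keep the order the server returned.
        key=lambda f: f["properties"].get("distance", float("inf")),
    )
    return {**feature_collection, "features": features}


# Trimmed copy of the response from the report, in the order returned.
response = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "3 Taara pst", "distance": 0.018}},
        {"type": "Feature", "properties": {"name": "5 Taara Puiestee", "distance": 0.008}},
        {"type": "Feature", "properties": {"name": "5 Taara pst", "distance": 0.009}},
        {"type": "Feature", "properties": {"name": "3 Taara Puiestee", "distance": 0.02}},
    ],
}

reordered = sort_features_by_distance(response)
names = [f["properties"]["name"] for f in reordered["features"]]
print(names)
```

Note that this leaves any `index` property on the features untouched, so it would still reflect the original server order.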