pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

address_search_using_ids: point-based geometry issue #1406

Open missinglink opened 4 years ago

missinglink commented 4 years ago

This issue came through via email, so I thought I'd write it up. I think the best course of action is simply to wait until the spatial service matures, at which point this issue will likely be resolved by buffering point-based geometries.

The example given in the email was 1 Tara Rd, Ennismore, ON.

Looking at the WOF data, I can see that Ennismore exists, but only as a point-based geometry.

This leads to unexpected behaviour in Pelias. Placeholder returns one match for 'Ennismore, ON', which in turn adds the following clause to the address_search_using_ids query:

{
  "minimum_should_match": 1,
  "should": [
    {
      "terms": {
        "parent.locality_id": [
          "1125901201"
        ]
      }
    }
  ]
}

However, while Placeholder supports point-based geometries, PIP does not! As a result, the clause above will never match anything, because the ID 1125901201 could never be returned by PIP during indexing.

We could fix this fairly easily now by using the bbox property to determine if a Placeholder result is a point or not, but this would require us to choose between two strategies:

I think that neither of these strategies is ideal. It would be much better if we were able to buffer point-based geometries inside the PIP service like this, albeit with a wider radius to solve this specific issue.
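As a rough illustration of the two ideas above (spotting a point via its bbox, then buffering it), here is a minimal Python sketch. It is not Pelias code: the function names are hypothetical, the bbox is assumed to arrive as a "min_lon,min_lat,max_lon,max_lat" string, the coordinates are made up, and the buffer uses a flat approximation of about 111.32 km per degree of latitude.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # rough average, good enough for a sketch


def is_point_bbox(bbox, eps=1e-9):
    """Return True when the bbox has (near) zero width and height."""
    # Assumption: bbox is a "min_lon,min_lat,max_lon,max_lat" string.
    min_lon, min_lat, max_lon, max_lat = (float(v) for v in bbox.split(","))
    return abs(max_lon - min_lon) <= eps and abs(max_lat - min_lat) <= eps


def buffer_point(lon, lat, radius_km):
    """Expand a point into a (min_lon, min_lat, max_lon, max_lat) box."""
    dlat = radius_km / KM_PER_DEGREE_LAT
    # Degrees of longitude shrink with latitude; clamp the cosine so the
    # division stays finite near the poles.
    cos_lat = max(math.cos(math.radians(lat)), 1e-6)
    dlon = radius_km / (KM_PER_DEGREE_LAT * cos_lat)
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


# Made-up coordinates, for illustration only.
if is_point_bbox("-78.4,44.4,-78.4,44.4"):
    box = buffer_point(-78.4, 44.4, radius_km=10)
```

The radius is the real design question here: too small and nearby addresses are still missed, too large and the buffered area starts to overlap neighbouring localities.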

The new Spatial service will allow us to return nearest matches as well as a list of potential aliases for admin areas. We can also buffer point geometries so that this is no longer an issue.

NickStallman commented 4 years ago

It's probably difficult or not possible, but you'd almost want to use the point-based geometries as the focus point for the search.

I do this in a layer above Pelias for this issue: https://github.com/pelias/pelias/issues/775

For example, on a search for "1 Tara Rd, Ennismore, ON", if it falls back to "Ennismore, ON", then I do a second query for "1 Tara Rd, ON" focussed around "Ennismore, ON".

For certain problems this works very nicely and improves the number of exact matches in the case of poor WOF data.
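A minimal Python sketch of that two-pass approach, under stated assumptions: `geocode` is a hypothetical callable standing in for whatever client wraps the search endpoint, results are assumed to be dicts with `layer`, `lon` and `lat` keys, and the `focus` argument is assumed to map onto the `focus.point.lat` / `focus.point.lon` search parameters.

```python
def search_with_refocus(geocode, address, locality, region):
    """Retry an address search without the locality, focussed on it instead.

    `geocode(text, focus=None)` is a hypothetical client function that
    returns a list of result dicts, best match first.
    """
    first = geocode(f"{address}, {locality}, {region}")

    # An address-layer hit means the first pass worked; nothing to do.
    if first and first[0]["layer"] == "address":
        return first

    # No result at all means there is nothing to focus on.
    if not first:
        return first

    # The search fell back to a coarser result (e.g. the locality).
    # Drop the locality from the text and use the fallback's
    # coordinates as the focus point for a second pass.
    focus = (first[0]["lon"], first[0]["lat"])
    second = geocode(f"{address}, {region}", focus=focus)

    # Keep the fallback if the second pass found nothing better.
    return second or first
```

Called as `search_with_refocus(geocode, "1 Tara Rd", "Ennismore", "ON")`, this issues the second query only when the first pass did not return an address.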

missinglink commented 4 years ago

Yeah, the whole findByIDs thing should really have been findWithinBoundingBoxes; it would allow more leniency.
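To make that concrete, here is a hedged Python sketch of a clause builder that swaps the parent ID terms for bounding boxes. It uses Elasticsearch's `geo_bounding_box` query; the function name, the `center_point` field name and the example box are assumptions for illustration, not the actual Pelias implementation.

```python
def bounding_box_clause(bboxes, field="center_point"):
    """Build a bool clause matching documents inside any of the boxes.

    Each bbox is a (min_lon, min_lat, max_lon, max_lat) tuple. The
    `field` default is an assumed geo_point field name.
    """
    should = []
    for min_lon, min_lat, max_lon, max_lat in bboxes:
        should.append({
            "geo_bounding_box": {
                field: {
                    # top_left is the north-west corner of the box
                    "top_left": {"lat": max_lat, "lon": min_lon},
                    # bottom_right is the south-east corner of the box
                    "bottom_right": {"lat": min_lat, "lon": max_lon},
                }
            }
        })
    # Same shape as the ID-based clause: any one box is enough to match.
    return {"minimum_should_match": 1, "should": should}


# Made-up box around a point-based locality, for illustration only.
clause = bounding_box_clause([(-78.5, 44.3, -78.3, 44.5)])
```

Because the match is spatial rather than ID-based, it does not depend on which parent IDs were assigned at index time, which is where the extra leniency would come from.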