pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

structured: postalcode param should match multiple fields #1548

Closed missinglink closed 6 months ago

missinglink commented 3 years ago

When querying /v1/search/structured?postalcode=xxx only the parent.postalcode field is targeted. We should additionally check the address_parts.zip field.

This is a particular problem for documents imported via the csv-importer as the field is an "address property", not one derived from a point-in-polygon operation.
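
The intent can be sketched as an Elasticsearch `bool` query with one `should` clause per field. This is only an illustration, not the query Pelias actually generates: the field names `parent.postalcode` and `address_parts.zip` come from this issue, while the function name `build_postalcode_query` and the use of `match_phrase` are assumptions.

```python
import json


def build_postalcode_query(postalcode: str) -> dict:
    """Build an Elasticsearch query body that matches a postcode in either field.

    Hypothetical helper: the field names come from the issue text, and the
    query shape is an assumption, not Pelias's real query template.
    """
    fields = ["parent.postalcode", "address_parts.zip"]
    return {
        "query": {
            "bool": {
                # One clause per field; a document qualifies if any clause matches.
                "should": [{"match_phrase": {field: postalcode}} for field in fields],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_postalcode_query("1170-257"), indent=2))
```

With `minimum_should_match` set to 1, a record imported with only an address-level zip would still match even when no point-in-polygon postalcode parent was assigned.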

missinglink commented 3 years ago

Looking at the code, it seems we already have something to handle this, but it is only applied to addresses, not to records that lack a street and housenumber.

cozydvlpr commented 6 months ago

Hi @missinglink, is there any progress on this issue? In my use case I need to find addresses using the postal code on the openaddresses source and on custom sources. If it's not hard to implement and I get some guidance, I'm also available to learn more about Pelias and fix this issue.

Disclaimer: I'm a developer with a bit of experience with Elasticsearch, but not much with Node.js.

missinglink commented 6 months ago

@cozydvlpr could you please provide an example query so I can check this is still an issue?

cozydvlpr commented 6 months ago

Sure. My Pelias instance includes only Portugal (PT/PRT) data:

curl -v -k '<pelias>:4000/v1/search/structured?postalcode=1170-257&sources=oa' | jq

Result

{
  "geocoding": {
    "version": "0.2",
    "attribution": "http://xxxxxx:4000/attribution",
    "query": {
      "parsed_text": {
        "postalcode": "1170-257"
      },
      "size": 10,
      "sources": [
        "openaddresses"
      ],
      "private": false,
      "lang": {
        "name": "English",
        "iso6391": "en",
        "iso6393": "eng",
        "via": "default",
        "defaulted": true
      },
      "querySize": 20
    },
    "engine": {
      "name": "Pelias",
      "author": "Mapzen",
      "version": "1.0"
    },
    "timestamp": 1713272664168
  },
  "type": "FeatureCollection",
  "features": []
}

But if I query Elasticsearch directly with:

{
  "track_total_hits": false,
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    }
  ],
  "fields": [
    {
      "field": "*",
      "include_unmapped": "true"
    }
  ],
  "size": 500,
  "version": true,
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "runtime_mappings": {},
  "_source": false,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_phrase": {
            "address_parts.zip": "1170-257"
          }
        }
      ],
      "should": [],
      "must_not": [
        {
          "match_phrase": {
            "source": "openstreetmap"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "@kibana-highlighted-field@"
    ],
    "post_tags": [
      "@/kibana-highlighted-field@"
    ],
    "fields": {
      "*": {}
    },
    "fragment_size": 2147483647
  }
}

I can find, for example, this document

{
  "_index": "pelias",
  "_type": "_doc",
  "_id": "openaddresses:address:pt/countrywide:718d026f6672e41b",
  "_version": 4,
  "_score": 1,
  "_source": {
    "center_point": {
      "lon": -9.1258,
      "lat": 38.730071
    },
    "parent": {
      "country": [
        "Portugal"
      ],
      "country_id": [
        "85633735"
      ],
      "country_a": [
        "PRT"
      ],
      "country_source": [
        null
      ],
      "region": [
        "Lisboa"
      ],
      "region_id": [
        "85687367"
      ],
      "region_a": [
        "LI"
      ],
      "region_source": [
        null
      ],
      "locality": [
        "Lisboa"
      ],
      "locality_id": [
        "101752087"
      ],
      "locality_a": [
        null
      ],
      "locality_source": [
        null
      ],
      "localadmin": [
        "Alvalade"
      ],
      "localadmin_id": [
        "1511679077"
      ],
      "localadmin_a": [
        null
      ],
      "localadmin_source": [
        null
      ],
      "neighbourhood": [
        "Madre de Deus"
      ],
      "neighbourhood_id": [
        "85798871"
      ],
      "neighbourhood_a": [
        null
      ],
      "neighbourhood_source": [
        null
      ]
    },
    "name": {
      "default": "3 Av Mouzinho Albuquerque"
    },
    "address_parts": {
      "number": "3",
      "street": "Av Mouzinho Albuquerque",
      "zip": "1170-257"
    },
    "source": "openaddresses",
    "source_id": "pt/countrywide:718d026f6672e41b",
    "layer": "address"
  },
  "fields": {
    "parent.region_id": [
      "85687367"
    ],
    "parent.country.ngram": [
      "Portugal"
    ],
    "parent.region_a.ngram": [
      "LI"
    ],
    "parent.country_a": [
      "PRT"
    ],
    "parent.localadmin_id": [
      "1511679077"
    ],
    "source": [
      "openaddresses"
    ],
    "parent.neighbourhood": [
      "Madre de Deus"
    ],
    "layer": [
      "address"
    ],
    "name.default": [
      "3 Av Mouzinho Albuquerque"
    ],
    "parent.neighbourhood_id": [
      "85798871"
    ],
    "address_parts.number": [
      "3"
    ],
    "parent.region.ngram": [
      "Lisboa"
    ],
    "parent.country_a.ngram": [
      "PRT"
    ],
    "parent.neighbourhood.ngram": [
      "Madre de Deus"
    ],
    "parent.locality_id": [
      "101752087"
    ],
    "parent.country_id": [
      "85633735"
    ],
    "parent.localadmin.ngram": [
      "Alvalade"
    ],
    "address_parts.street": [
      "Av Mouzinho Albuquerque"
    ],
    "center_point": [
      {
        "coordinates": [
          -9.1258,
          38.730071
        ],
        "type": "Point"
      }
    ],
    "parent.country": [
      "Portugal"
    ],
    "parent.localadmin": [
      "Alvalade"
    ],
    "parent.locality.ngram": [
      "Lisboa"
    ],
    "parent.region": [
      "Lisboa"
    ],
    "parent.region_a": [
      "LI"
    ],
    "source_id": [
      "pt/countrywide:718d026f6672e41b"
    ],
    "address_parts.zip": [
      "1170-257"
    ],
    "parent.locality": [
      "Lisboa"
    ]
  }
}

missinglink commented 6 months ago

Try this docker image and let me know how you get on:

missinglink commented 6 months ago

[Image: Screenshot 2024-04-16 at 16 42 19]

cozydvlpr commented 6 months ago

Thanks for the quick change. I pulled the new image but it did not work. You can see my case here: https://pelias.github.io/compare/#/v1/search/structured?layers=postalcode&postalcode=1170-257&country=Portugal&debug=1

~Could the dash in the postal code be the reason why I don't get a result?~

missinglink commented 6 months ago

Looking at the code, a query that specifies only the postalcode will only return results from the postalcode layer. In your case, you seem to be looking for an address.

I didn't design this API and I don't recall why it was done that way. Presumably it was to avoid the huge result sets that would otherwise occur if we returned every address in a large postcode area.

cozydvlpr commented 6 months ago

Thanks for your explanation, I understand. Do you have any suggestions to resolve my specific use case?

missinglink commented 6 months ago

If you'd like to find all the OSM elements tagged with a specific postcode then Overpass Turbo is a good option.

cozydvlpr commented 6 months ago

Actually, I only need to find one address with the given postal code. In a second call, I'll use its coordinates to restrict the search area.
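
The second call described here could look roughly like the sketch below. It assumes the coordinates have already been obtained some other way, for example from the `center_point` of the Elasticsearch hit shown earlier in the thread. It uses the `boundary.circle.*` filters of the Pelias `/v1/search` endpoint. The function name, the base URL, the search text, and the 1 km radius are all illustrative assumptions.

```python
from urllib.parse import urlencode


def build_restricted_search_url(base_url: str, text: str,
                                center_point: dict, radius_km: float = 1.0) -> str:
    """Build a /v1/search URL limited to a circle around a known point.

    Hypothetical helper: center_point is a dict with "lat" and "lon" keys,
    shaped like the center_point field of a Pelias document.
    """
    params = {
        "text": text,
        "boundary.circle.lat": center_point["lat"],
        "boundary.circle.lon": center_point["lon"],
        # Radius in kilometres; 1.0 is an arbitrary default for illustration.
        "boundary.circle.radius": radius_km,
    }
    return f"{base_url}/v1/search?{urlencode(params)}"


if __name__ == "__main__":
    # Coordinates copied from the example document in this thread.
    point = {"lon": -9.1258, "lat": 38.730071}
    print(build_restricted_search_url("http://localhost:4000",
                                      "Av Mouzinho Albuquerque", point))
```

This covers only the second step. As discussed above, the first step (finding any one address by postal code alone) is not something the structured endpoint currently returns.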