pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

Regressions between addressit parser and pelias parser #1448

Open rcarroll0452 opened 4 years ago

rcarroll0452 commented 4 years ago

Hi,

The pelias parser seems to do a good job, but I've noticed a few regressions between the addressit and pelias parser versions of the API.

Here are a few I've found:

Query: D-Block mini Park, Dhaka District, Dhaka, Bangladesh
addressit version: D-Block mini Park, Dhaka, Bangladesh
pelias parser version: Mini Park, NS, Canada

Query: kim jong yu paradise, Pyongyang City, P´yongyang-si, North Korea
addressit version: kim jong yu paradise, Pyongyang, North Korea
pelias parser version: Yongyang Road, China

Query: Afognak Native Corp, Anchorage, Municipality of Anchorage, Alaska, United States
addressit version: Afognak Native Corp, Anchorage, AK, USA
pelias parser version: municipality of curepipe, Mauritius

Query: City Auto, Dili, East Timor
addressit version: City Auto, Dili, East Timor
pelias parser version: Auto City Speedway, Vienna, MI, USA

On debugging further, I found that parts of the input string that the pelias parser does not classify are left out of the queries sent to Elasticsearch, which produces results that are not even in the same country.

missinglink commented 4 years ago

I had a look at the Alaskan example you posted, but I wasn't able to reproduce the curepipe result you mentioned.

https://pelias.github.io/compare/#/v1/autocomplete?text=Afognak+Native+Corp%2C+Anchorage%2C+Municipality+of+Anchorage%2C+Alaska%2C+United+States&debug=0

Please provide more information that we can use to reproduce your findings. Links from the compare app, like the one above, are ideal.
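For anyone who wants to produce such links in bulk, here is a minimal sketch that builds compare-app URLs for a given query. It assumes the compare app accepts other endpoints (such as search) in the same path and query-string format as the autocomplete link above, and that `debug=1` turns debug output on. The `compare_link` helper is hypothetical, written only for illustration.

```python
from urllib.parse import urlencode

# Base of the compare app, copied from the link posted above.
COMPARE_BASE = "https://pelias.github.io/compare/#"


def compare_link(endpoint, text, debug=False):
    """Build a compare-app link for an endpoint such as 'search' or 'autocomplete'.

    Hypothetical helper: the URL layout is inferred from the autocomplete
    link in this thread, not from any documented compare-app API.
    """
    # urlencode uses quote_plus by default: spaces become '+', commas '%2C'.
    query = urlencode({"text": text, "debug": int(debug)})
    return f"{COMPARE_BASE}/v1/{endpoint}?{query}"


query_text = (
    "Afognak Native Corp, Anchorage, Municipality of Anchorage, "
    "Alaska, United States"
)

# Print one link per endpoint so the two can be compared side by side.
for endpoint in ("search", "autocomplete"):
    print(compare_link(endpoint, query_text, debug=True))
```

With `debug=False` and the `autocomplete` endpoint, this reproduces the link above character for character, which is a quick check that the encoding matches.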

rcarroll0452 commented 4 years ago

Thanks for taking a look at this. I was using the Search API.

For the Alaskan example, here is the link, where the first result is curepipe.

If you look at the debug data, the input string Afognak Native Corp is not even present in the Elasticsearch query. Maybe that is because the pelias parser does not classify it at all?

As you can see from this pelias parser URL, the Municipality of Anchorage element is classified as a street, and Afognak Native Corp is not classified at all.
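To make the suspected failure mode concrete, here is a toy sketch. It is not the Pelias parser or the Pelias query builder. The two labels are hand-written to mirror what is reported here (Afognak Native Corp unclassified, Municipality of Anchorage labelled as a street), and `build_query_text` is a hypothetical helper.

```python
# Toy illustration only: NOT Pelias code. The labels are hand-written
# to mirror the classifications reported in this thread.


def build_query_text(tokens, drop_unclassified):
    """Join the token texts that would be passed on to the search backend.

    tokens is a list of (text, label) pairs; a label of None means the
    parser assigned no classification to that part of the input.
    """
    kept = [
        text
        for text, label in tokens
        if label is not None or not drop_unclassified
    ]
    return ", ".join(kept)


tokens = [
    ("Afognak Native Corp", None),            # reported as not classified
    ("Municipality of Anchorage", "street"),  # reported as classified as street
]

# Dropping unclassified parts loses the venue name entirely.
print(build_query_text(tokens, drop_unclassified=True))

# Keeping them preserves the part of the input that identifies the place.
print(build_query_text(tokens, drop_unclassified=False))
```

If the real query is built along these lines, the venue name never reaches Elasticsearch, and a street-like match on "Municipality of ..." anywhere in the world can rank first.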

Here I've used only Afognak Native Corp as the input query, and the expected result is still in 10th position, even though it is a perfect match and the other 9 results are not.