komoot / photon

an open source geocoder for openstreetmap data
Apache License 2.0

Bias on name field match #695

Open xeruf opened 1 year ago

xeruf commented 1 year ago

First of all, thanks for the great tool, this is saving me today!

In an example query I encountered, there is some gibberish before "Hamburg". Regardless of the language parameter, Photon ranks "Novo Hamburgo" higher, ignoring the exact match on the word "Hamburg". Could the ranking be nudged slightly, or do you think this result is somehow appropriate? Related to #193.

❯ curl 'https://photon.komoot.io/api/?q=Luftfrachthof-Geb%20Hamburg&limit=1' | jq '.features[0].properties'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   435    0   435    0     0   1329      0 --:--:-- --:--:-- --:--:--  1334
{
  "osm_type": "R",
  "osm_id": 242616,
  "extent": [
    -51.1787892,
    -29.6378807,
    -50.9341246,
    -29.8302389
  ],
  "country": "Brasil",
  "osm_key": "place",
  "countrycode": "BR",
  "osm_value": "city",
  "name": "Novo Hamburgo",
  "county": "Região Geográfica Intermediária de Porto Alegre",
  "state": "Rio Grande do Sul",
  "type": "city"
}

❯ curl 'https://photon.komoot.io/api/?q=Luftfrachthof-Geb%20Hamburg&lang=de&limit=2' | jq '.features | map(.properties)'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   782    0   782    0     0   2821      0 --:--:-- --:--:-- --:--:--  2823
[
  {
    "osm_type": "R",
    "osm_id": 242616,
    "extent": [
      -51.1787892,
      -29.6378807,
      -50.9341246,
      -29.8302389
    ],
    "country": "Brasilien",
    "osm_key": "place",
    "countrycode": "BR",
    "osm_value": "city",
    "name": "Novo Hamburgo",
    "county": "Região Geográfica Intermediária de Porto Alegre",
    "state": "Rio Grande do Sul",
    "type": "city"
  },
  {
    "osm_type": "N",
    "osm_id": 1680910488,
    "country": "Deutschland",
    "osm_key": "railway",
    "city": "Hamburg",
    "street": "Nordsteg",
    "countrycode": "DE",
    "district": "St. Georg",
    "osm_value": "station",
    "postcode": "20099",
    "name": "Hamburg Hauptbahnhof",
    "type": "house"
  }
]

❯ curl 'https://photon.komoot.io/api/?q=Hamburg&lang=de&limit=1' | jq '.features[0].properties'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   332    0   332    0     0    277      0 --:--:--  0:00:01 --:--:--   277
{
  "osm_type": "R",
  "osm_id": 62782,
  "extent": [
    8.1044993,
    54.02765,
    10.3252805,
    53.3951118
  ],
  "country": "Deutschland",
  "osm_key": "place",
  "countrycode": "DE",
  "osm_value": "city",
  "name": "Hamburg",
  "type": "city"
}

xeruf commented 1 year ago

Maybe a flag indicating whether the query is for geocoding (treat the query as complete) or for autocompletion would be useful.
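
A client can approximate "treat the query as complete" on its own side by requesting a few results and preferring those whose name contains a query word as a whole word. The following is only a sketch of that idea, not something Photon does internally: the function names `tokens` and `rerank_by_whole_word` are hypothetical, and the only assumption about the response is the `features[].properties.name` layout visible in the output above.

```python
import re


def tokens(text):
    """Return the set of lower-cased word tokens in text."""
    return set(re.findall(r"\w+", text.lower()))


def rerank_by_whole_word(query, features):
    """Hypothetical client-side re-ranking, not Photon's own scoring.

    Features whose name shares more whole words with the query come first.
    sorted() is stable, so ties keep the order returned by the server.
    """
    query_words = tokens(query)

    def score(feature):
        name = feature.get("properties", {}).get("name", "")
        return len(query_words & tokens(name))

    return sorted(features, key=score, reverse=True)


# Names taken from the lang=de response shown earlier in this thread.
features = [
    {"properties": {"name": "Novo Hamburgo"}},
    {"properties": {"name": "Hamburg Hauptbahnhof"}},
]
best = rerank_by_whole_word("Luftfrachthof-Geb Hamburg", features)[0]
print(best["properties"]["name"])
```

With the two results from the thread, "Hamburgo" is not counted as a match for "Hamburg", so "Hamburg Hauptbahnhof" moves ahead of "Novo Hamburgo". This only reorders what the server already returned; it cannot surface the city of Hamburg if it is not among the fetched results.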

lonvia commented 1 year ago

I can't reproduce that result. It might have been just a transient data error. For me, the city of Hamburg is always the first result.

xeruf commented 1 year ago

I just ran the exact command above again and still receive Novo Hamburgo.

lonvia commented 5 months ago

This is once more a problem with lang=default: https://photon.komoot.io/api/?q=Luftfrachthof-Geb%20Hamburg&limit=1&lang=default

lonvia commented 5 months ago

I've pushed an experimental version of Photon to https://photon.komoot.io which slightly reduces the boost for default. Let's see whether it causes other complaints about results becoming worse.

As a general rule, sending a lang= parameter with your queries is probably always a good idea when you run them from scripts.
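
For scripts, that advice could look roughly like the sketch below, which always sets `lang` explicitly when building the request URL. The endpoint and the `q`, `lang` and `limit` parameters are the ones used in the curl commands above; the helper names `build_photon_url` and `top_result_name`, the default `lang="de"` and the 10-second timeout are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PHOTON_API = "https://photon.komoot.io/api/"


def build_photon_url(query, lang="de", limit=1):
    """Build a query URL that always carries an explicit lang parameter."""
    params = {"q": query, "lang": lang, "limit": limit}
    # urlencode escapes the query text (spaces become '+').
    return PHOTON_API + "?" + urlencode(params)


def top_result_name(query, lang="de"):
    """Fetch the first feature and return its name, or None if there is none."""
    with urlopen(build_photon_url(query, lang), timeout=10) as response:
        features = json.load(response)["features"]
    return features[0]["properties"].get("name") if features else None


print(build_photon_url("Luftfrachthof-Geb Hamburg"))
```

Calling `top_result_name("Luftfrachthof-Geb Hamburg")` performs the actual request; what it returns depends on the data and version the server is running at that moment.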