geocoders / geocoder-tester

Run search queries against a geocoder that supports geocodejson spec.

Test for duplicates #44

Open nlehuby opened 6 years ago

nlehuby commented 6 years ago

When you use your geocoder for autocomplete search, you don't want the results to include duplicates: they confuse the user, who has no way to choose between them.

How can we use geocoder-tester to test that we don't have duplicate results?

nlehuby commented 6 years ago

Here is a proposal about this:

We are not really satisfied with this solution, because

Any better ideas on how to handle this?

antoine-de commented 6 years ago

I created a PR on our fork to handle this: https://github.com/QwantResearch/geocoder-tester/pull/26

The idea is to add an option to check for duplicates: --check-dupplicates=10

This runs geocoder-tester as usual and, for each query, after the tests on the expected fields, also checks that no two objects in the response are duplicates.

We consider two results duplicates when the user can't tell them apart, so we implemented something quite specific to how Qwant displays the autocomplete response:

For the moment this mechanism is largely hardcoded in get_label_for_dupplicates, and we need to see how to make it more generic. But since it's an opt-in CLI parameter, maybe we can first add it to the main geocoder-tester repository and make it more generic if the need arises.

This will add more test errors, and the reports are formatted like this:

Duplicates found in the response
# Search was: indre
## Entry ('Reuilly (Indre) (Reuilly)', 'poi', 'Sentier des Tournelles (Reuilly)') has been found for:
           label           |         id          | type | osm_id | housenumber | street | postcode |  city   | country |        lat        |        lon         |               addr               | poi_types 
———————————————————————————|—————————————————————|——————|————————|—————————————|————————|——————————|—————————|—————————|———————————————————|————————————————————|——————————————————————————————————|———————————
 Reuilly (Indre) (Reuilly) | osm:node:1854248363 | poi  |   _    |      _      |   _    |  36260   | Reuilly |    _    | 47.08530172468403 | 2.0474608578328177 | Sentier des Tournelles (Reuilly) |  railway  
 Reuilly (Indre) (Reuilly) | osm:node:4498318505 | poi  |   _    |      _      |   _    |  36260   | Reuilly |    _    | 47.08529686318019 | 2.047508718499927  | Sentier des Tournelles (Reuilly) |  railway  

## Entry ('Indre Oslofjord (Oslo)', 'poi', 'Tøyengata (Oslo)') has been found for:
         label          |        id         | type | osm_id | housenumber | street | postcode | city | country |        lat        |        lon         |       addr       | poi_types 
————————————————————————|———————————————————|——————|————————|—————————————|————————|——————————|——————|—————————|———————————————————|————————————————————|——————————————————|———————————
 Indre Oslofjord (Oslo) | osm:way:233882196 | poi  |   _    |      _      |   _    |    _     | Oslo |    _    | 59.91907628783925 | 10.771447863393677 | Tøyengata (Oslo) |  garden   
 Indre Oslofjord (Oslo) | osm:way:233882197 | poi  |   _    |      _      |   _    |    _     | Oslo |    _    | 59.91908412491954 | 10.771563565673642 | Tøyengata (Oslo) |  garden   
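
For readers who want a feel for the check, here is a minimal sketch of how the grouping in the report above could work. It is not the code from the PR. The key (label, type, addr) is inferred from the "Entry (...)" tuples in the report, the function names duplicate_key and find_duplicates are hypothetical, and the feature layout (fields under properties, optionally nested under geocoding) is an assumption about the geocoder's response.

```python
from collections import defaultdict


def duplicate_key(feature):
    """Hypothetical key: the fields a user would see for one result.

    Assumes the fields live under "properties", optionally nested
    under "geocoding"; adapt this to your geocoder's actual response.
    """
    props = feature.get("properties", {})
    props = props.get("geocoding", props)
    return (props.get("label"), props.get("type"), props.get("addr"))


def find_duplicates(features, limit=None):
    """Group the first `limit` features by key and keep groups of 2 or more."""
    groups = defaultdict(list)
    for feature in features[:limit]:
        groups[duplicate_key(feature)].append(feature)
    return {key: group for key, group in groups.items() if len(group) > 1}


def make_feature(id_, label, type_, addr):
    """Build a small sample feature for the demonstration below."""
    return {"properties": {"geocoding": {
        "id": id_, "label": label, "type": type_, "addr": addr}}}


# Sample data taken from the first entry of the report, plus one unique result.
sample = [
    make_feature("osm:node:1854248363", "Reuilly (Indre) (Reuilly)", "poi",
                 "Sentier des Tournelles (Reuilly)"),
    make_feature("osm:node:4498318505", "Reuilly (Indre) (Reuilly)", "poi",
                 "Sentier des Tournelles (Reuilly)"),
    make_feature("osm:way:233882196", "Indre Oslofjord (Oslo)", "poi",
                 "Tøyengata (Oslo)"),
]

duplicates = find_duplicates(sample, limit=10)
for key, group in duplicates.items():
    print(f"## Entry {key} has been found for {len(group)} results")
```

Keeping the key in its own function is what would make the check generic: a different front end could plug in a different notion of "indistinguishable" without touching the grouping logic.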

Would anyone be interested in this? Should I also open a PR on the central repository?