Add an option to check for duplicates: --check-duplicates=10
This runs geocoder-tester as usual and, for each query, after the tests on the expected fields, checks that none of the first n objects in the response are duplicates (n is the value passed to the option).
If the option is not given, everything runs as usual.
Two results count as duplicates when the user cannot tell them apart, so we implemented something quite specific to how Qwant displays the autocomplete response:
for a POI, we compare the object's label + its address
for the other objects, only the label
For the moment this mechanism is hardcoded in get_label_for_duplicates. I'm completely open to suggestions if you see a more generic way to do this.
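To make the rule above concrete, here is a minimal sketch of how the check could work. It is not the code from this PR: only the name get_label_for_duplicates comes from the PR, while its signature, the helper find_duplicates, and the flat dict layout of a result (keys label, type, addr, mirroring the columns of the error log) are assumptions made for illustration.

```python
from collections import defaultdict


def get_label_for_duplicates(result):
    """Build the key a user would see for a result (hypothetical signature).

    Assumes `result` is a flat dict with 'label', 'type' and 'addr' keys.
    """
    if result.get("type") == "poi":
        # A POI is identified by its label plus its address.
        return (result.get("label"), "poi", result.get("addr"))
    # Any other object is identified by its label only.
    return (result.get("label"),)


def find_duplicates(results, limit=None):
    """Group the first `limit` results by key; keep groups with 2+ members.

    `limit=None` stands for "option not given": nothing is checked.
    """
    if limit is None:
        return {}
    groups = defaultdict(list)
    for result in results[:limit]:
        groups[get_label_for_duplicates(result)].append(result)
    return {key: items for key, items in groups.items() if len(items) > 1}


# Sample data modelled on the error log in this description.
results = [
    {"label": "Reuilly (Indre) (Reuilly)", "type": "poi",
     "id": "osm:node:1854248363", "addr": "Sentier des Tournelles (Reuilly)"},
    {"label": "Reuilly (Indre) (Reuilly)", "type": "poi",
     "id": "osm:node:4498318505", "addr": "Sentier des Tournelles (Reuilly)"},
    {"label": "Indre", "type": "city", "id": "hypothetical:1", "addr": None},
]
duplicates = find_duplicates(results, limit=10)
```

With this sample, only the two Reuilly POIs end up in the same group, since they share both label and address; the city has a group of its own and is not reported.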
The error log is formatted as follows:
Duplicates found in the response
# Search was: indre
## Entry ('Reuilly (Indre) (Reuilly)', 'poi', 'Sentier des Tournelles (Reuilly)') has been found for:
label | id | type | osm_id | housenumber | street | postcode | city | country | lat | lon | addr | poi_types
———————————————————————————|—————————————————————|——————|————————|—————————————|————————|——————————|—————————|—————————|———————————————————|————————————————————|——————————————————————————————————|———————————
Reuilly (Indre) (Reuilly) | osm:node:1854248363 | poi | _ | _ | _ | 36260 | Reuilly | _ | 47.08530172468403 | 2.0474608578328177 | Sentier des Tournelles (Reuilly) | railway
Reuilly (Indre) (Reuilly) | osm:node:4498318505 | poi | _ | _ | _ | 36260 | Reuilly | _ | 47.08529686318019 | 2.047508718499927 | Sentier des Tournelles (Reuilly) | railway
## Entry ('Indre Oslofjord (Oslo)', 'poi', 'Tøyengata (Oslo)') has been found for:
label | id | type | osm_id | housenumber | street | postcode | city | country | lat | lon | addr | poi_types
————————————————————————|———————————————————|——————|————————|—————————————|————————|——————————|——————|—————————|———————————————————|————————————————————|——————————————————|———————————
Indre Oslofjord (Oslo) | osm:way:233882196 | poi | _ | _ | _ | _ | Oslo | _ | 59.91907628783925 | 10.771447863393677 | Tøyengata (Oslo) | garden
Indre Oslofjord (Oslo) | osm:way:233882197 | poi | _ | _ | _ | _ | Oslo | _ | 59.91908412491954 | 10.771563565673642 | Tøyengata (Oslo) | garden
This PR aims to close #44 (and follows up on https://github.com/QwantResearch/geocoder-tester/pull/26)