Geoplateforme / geoplateforme.github.io

Home site of the Géoplateforme documentation
https://geoplateforme.github.io/

Inconsistent search results #41

Open pYassine opened 4 months ago

pYassine commented 4 months ago

Type of distribution service concerned

Geocoding service

Access concerned

Open

Description of the error

Hello,

I recently started using the geocoding API. Several users have reported that some results do not come up.

When one types "20 boulevard Mat", no relevant result comes up. The users are looking for the address "20 boulevard matabiau".

However, typing "20 boulevard mata" does return it correctly.

[Screenshot, 2024-03-07 at 12:51:32]

jdesboeufs commented 4 months ago

Hello @pYassine, and thank you for reporting this.

I can see that the same behavior also occurs on the API Adresse: https://api-adresse.data.gouv.fr/search?q=20%20boulevard%20Mat
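To compare the two queries side by side, the request URLs can be built as in the minimal sketch below. It only assumes the `/search` endpoint and the `q` parameter visible in the link above; the helper name `search_url` is made up for illustration, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the link in the comment above.
BASE_URL = "https://api-adresse.data.gouv.fr/search"


def search_url(query: str) -> str:
    """Build a search URL for a free-text query (hypothetical helper).

    quote_via=quote encodes spaces as %20, matching the link above.
    """
    return f"{BASE_URL}?{urlencode({'q': query}, quote_via=quote)}"


# The query reported as failing and the one reported as working.
failing = search_url("20 boulevard Mat")
working = search_url("20 boulevard mata")

print(failing)
print(working)
```

Opening both URLs in a browser or HTTP client should make it possible to check whether the difference reported in this issue is still present.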

This is related to the autocomplete mechanism in the addok engine. It appears that the term mat is reduced to ma at the tokenization stage, which leaves it too short to be used for autocomplete, since autocomplete only kicks in at 3 characters.

I will open a ticket on the engine in question.
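The interaction described above can be illustrated with a toy model. This is not addok's code: `toy_tokenize` is a hypothetical normalizer that merely imitates the reductions seen in this thread (mat becomes ma, boulevard becomes boulevar), the sample index is invented from names appearing in the thread, and only the 3-character threshold comes from the comment itself.

```python
# Threshold stated in the comment above: autocomplete starts at 3 characters.
MIN_AUTOCOMPLETE_LENGTH = 3

# Invented sample index, using words that appear in this thread.
INDEX = {"matabiau", "matalon", "matha", "mas"}


def toy_tokenize(word: str) -> str:
    """Hypothetical normalizer: lowercase, then drop one trailing 'd' or 't'.

    It only mimics the reductions observed in this issue and is an
    assumption, not the engine's actual processing chain.
    """
    token = word.lower()
    if len(token) > 2 and token[-1] in "dt":
        token = token[:-1]
    return token


def autocomplete(token: str, index: set[str]) -> list[str]:
    """Return indexed tokens starting with `token`, or nothing if it is too short."""
    if len(token) < MIN_AUTOCOMPLETE_LENGTH:
        return []
    return sorted(t for t in index if t.startswith(token))


short = toy_tokenize("Mat")    # reduced to a 2-character token
longer = toy_tokenize("mata")  # left unchanged, 4 characters

print(short, autocomplete(short, INDEX))
print(longer, autocomplete(longer, INDEX))
```

In this toy model the shorter input never reaches the prefix lookup, while the longer one expands to both matabiau and matalon, which is the same asymmetry the two queries show.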


For tracking purposes:

> EXPLAIN 20 boulevard Mat
[123.9] Taken tokens: [<Token ma>]
[123.9] Common tokens: [<Token boulevar>]
[123.9] Housenumbers token: [<Token ving>]
[123.9] Not found tokens: []
[123.9] Filters: []
[124.0] ** ONLY_COMMONS_BUT_GEOHASH_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** NO_TOKENS_BUT_HOUSENUMBERS_AND_GEOHASH **
[124.0] ** NO_AVAILABLE_TOKENS_ABORT **
[124.0] ** ONLY_COMMONS **
[124.0] ** NO_MEANINGFUL_BUT_COMMON_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** ONLY_COMMONS_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** BUCKET_WITH_MEANINGFUL **
[124.0] New bucket with keys ['w|ma', 'w|boulevar'] and limit 10
[124.7] 4 ids in bucket so far
[124.7] ** REDUCE_WITH_OTHER_COMMONS **
[124.7] ** ENSURE_GEOHASH_RESULTS_ARE_INCLUDED_IF_CENTER_IS_GIVEN **
[124.7] ** AUTOCOMPLETE_MEANINGFUL_COLLECTOR **
[124.8] Autocompleting ma
[124.8] No candidates. Aborting.
[124.8] ** FUZZY_COLLECTOR **
[124.8] Checking cream.
[124.8] Computing results
[124.9] Done getting results data
[126.4] Done computing results
[126.4] Checking cream.
[126.4] Computing results
[126.4] Done computing results
[126.4] Fuzzy on. Trying with [<Token ma>, <Token boulevar>].
[126.4] Going fuzzy with boulevar and ['w|ma']
[129.2] Going fuzzy with ma and ['w|boulevar']
[129.6] Found fuzzy candidates ['la', 'pa', 'me', 'mz', 'mai', 'mak', 'man', 'mar', 'mas', 'mat', 'mau', 'max']
[129.6] Adding to bucket with keys ['w|boulevar', 'w|la']
[133.0] 100 ids in bucket so far
[133.0] ** EXTEND_RESULTS_EXTRAPOLING_RELATIONS **
[133.0] ** EXTEND_RESULTS_REDUCING_TOKENS **
[133.0] Computing results
[133.0] Done getting results data
[162.6] Done computing results
20 Boulevard Foch 62120 Aire-sur-la-Lys (97jWY | importance: 0.0633/0.1, str_distance: 0.5684/1.0)
20 Boulevard Doret 97400 Saint-Denis (ormyL | importance: 0.0674/0.1, str_distance: 0.54/1.0)
20 Boulevard Brune 19100 Brive-la-Gaillarde (ZYqov | importance: 0.0667/0.1, str_distance: 0.54/1.0)
20 Boulevard sully 85000 La Roche-sur-Yon (nWBAP | importance: 0.0611/0.1, str_distance: 0.54/1.0)
20 Boulevard National 92250 La Garenne-Colombes (4j3rg | importance: 0.065/0.1, str_distance: 0.5318/1.0)
20 Boulevard  Latouche 72200 La Flèche (kNNZK | importance: 0.0629/0.1, str_distance: 0.5318/1.0)
20 Boulevard Carnot 78200 Mantes-la-Jolie (zwPDq | importance: 0.0678/0.1, str_distance: 0.5143/1.0)
Boulevard du Mas 83700 Saint-Raphaël (jWBwl | importance: 0.0601/0.1, str_distance: 0.5211/1.0)
20 Boulevard edison 85000 La Roche-sur-Yon (jwR5v | importance: 0.0601/0.1, str_distance: 0.5143/1.0)
20 Boulevard Mestadier 23300 La Souterraine (J8v72 | importance: 0.0541/0.1, str_distance: 0.5087/1.0)
162.8 ms — 1 run(s) — 10 results
> EXPLAIN 20 boulevard Mata
[1254.] Taken tokens: [<Token mata>]
[1254.] Common tokens: [<Token boulevar>]
[1254.] Housenumbers token: [<Token ving>]
[1254.] Not found tokens: []
[1254.] Filters: []
[1254.] ** ONLY_COMMONS_BUT_GEOHASH_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** NO_TOKENS_BUT_HOUSENUMBERS_AND_GEOHASH **
[1254.] ** NO_AVAILABLE_TOKENS_ABORT **
[1254.] ** ONLY_COMMONS **
[1254.] ** NO_MEANINGFUL_BUT_COMMON_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** ONLY_COMMONS_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** BUCKET_WITH_MEANINGFUL **
[1254.] New bucket with keys ['w|mata', 'w|boulevar'] and limit 10
[1254.] 2 ids in bucket so far
[1254.] ** REDUCE_WITH_OTHER_COMMONS **
[1254.] ** ENSURE_GEOHASH_RESULTS_ARE_INCLUDED_IF_CENTER_IS_GIVEN **
[1254.] ** AUTOCOMPLETE_MEANINGFUL_COLLECTOR **
[1254.] Autocompleting mata
[1255.] Ordering candidates by frequency
[1255.] Found tokens to autocomplete [b'w|matalon, w|matabiau', …]
[1255.] Trying to extend bucket. Autocomplete w|matalon
[1255.] Adding to bucket with keys ['w|boulevar', 'w|matalon']
[1259.] 3 ids in bucket so far
[1259.] Trying to extend bucket. Autocomplete w|matabiau
[1259.] Adding to bucket with keys ['w|boulevar', 'w|matabiau']
[1259.] 4 ids in bucket so far
[1259.] ** FUZZY_COLLECTOR **
[1259.] Checking cream.
[1259.] Computing results
[1259.] Done getting results data
[1261.] Done computing results
[1261.] ** EXTEND_RESULTS_EXTRAPOLING_RELATIONS **
[1261.] No relation extrapolated.
[1261.] ** EXTEND_RESULTS_REDUCING_TOKENS **
[1261.] Checking cream.
[1261.] Computing results
[1261.] Done computing results
[1261.] Computing results
[1261.] Done computing results
20 Boulevard Matabiau 31000 Toulouse (AQPpp | importance: 0.0816/0.1, str_distance: 0.9/1.0)
Boulevard Bossais 17160 Matha (jZ60y | importance: 0.0454/0.1, str_distance: 0.468/1.0)
20 Boulevard de Saint Hérié 17160 Matha (r8VXK | importance: 0.0484/0.1, str_distance: 0.4091/1.0)
Boulevard rabatau daniel matalon 13010 Marseille (3lWyx | importance: 0.0681/0.1, str_distance: 0.3441/1.0)
1261.8 ms — 1 run(s) — 4 results
> TOKENIZE mat
ma