c2corg / v6_api

REST API for https://www.camptocamp.org
GNU Affero General Public License v3.0

Weird results when searching on title with elasticsearch #1697

Open desnoes opened 9 months ago

desnoes commented 9 months ago

Searching on title (home page and topoguide search) gives weird results. For instance, an exact word match is often not ranked first. This is probably due to an incorrect configuration of the Elasticsearch query in the function get_text_query_on_title() (repository: \v6_api\c2corg_api\search). The results of the search on ngrams and draw are boosted at the expense of the word match. I don't see why the search is done on ngrams and draw at all.

One solution to test is to remove the fields so that the query relies on a simple search on words. When the fuzziness parameter is set to 'auto', the search returns documents that contain terms similar to the search term (cf. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html).

I tried to install the API environment on my laptop (Windows 10) but it failed, so I could not test this myself. One could try the following modification of get_text_query_on_title(). If it works fine, a proper modification should also remove the search_lang variable (the impact on the UI code needs to be checked).

def get_text_query_on_title(search_term, search_lang=None):
    # No 'fields' argument: rely on a plain word search instead of boosted sub-fields.
    return MultiMatch(
        query=search_term,
        fuzziness='auto',
        operator='and'
    )
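To make the difference concrete, here is a minimal sketch of the multi_match request body that such a query corresponds to, written as plain dicts so it runs without the API environment. The helper names multi_match_body and auto_fuzziness_max_edits, the search term and the boosted field name title_fr.ngram^2 are made up for illustration and are not taken from the c2corg code. The second helper only mirrors the default thresholds documented for automatic fuzziness (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two).

```python
import json


def multi_match_body(search_term, fields=None):
    """Build the body of an Elasticsearch multi_match query as a plain dict.

    Hypothetical helper for illustration only, not part of c2corg_api.
    """
    multi_match = {
        "query": search_term,
        "fuzziness": "auto",
        "operator": "and",
    }
    # Only add "fields" when a list is given; without it, Elasticsearch
    # falls back to the index's default field setting.
    if fields:
        multi_match["fields"] = list(fields)
    return {"query": {"multi_match": multi_match}}


def auto_fuzziness_max_edits(term):
    """Maximum edit distance allowed for a term under the default
    automatic fuzziness thresholds (3 and 6 characters)."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2


if __name__ == "__main__":
    # Proposed variant: no explicit fields, no per-field boost.
    print(json.dumps(multi_match_body("cervin"), indent=2))
    # Variant with an explicit boosted sub-field (field name is an assumption).
    print(json.dumps(multi_match_body("cervin", ["title_fr.ngram^2"]), indent=2))
    for term in ("ab", "mont", "cervin"):
        print(term, auto_fuzziness_max_edits(term))
```

Comparing the two printed bodies shows that the only difference is the "fields" entry, which is where the boosts live. Whether dropping it actually ranks exact word matches first still has to be checked against a real index.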

loicperrin commented 4 months ago

Functional details: https://forum.camptocamp.org/t/topoguide-recherche-par-chaine-de-caracteres-dans-le-titre-plus-assez-selective/319862/69

A few leads: