etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License

The search returns not only variants of the same word, but also words starting with the requested word #300

ktyz1992 closed this issue 1 year ago

ktyz1992 commented 1 year ago

Besides results containing variants of the requested word, a full-text search also returns results containing words that merely start with the requested word.

For example, I created three entities, each containing one of three words: 'Cat', 'Cats' and 'Caterpillar'. A search for 'Cat' or 'Cats' should return only those two, but it also returns 'Caterpillar'. Using DjDT (Django Debug Toolbar) and pgAdmin, I noticed that replacing to_tsquery('$$cat$$:*') with to_tsquery('$$cat$$:') in the WHERE clause of the SQL (that is, removing the asterisk at the end) solves the problem: the search then returns only 'Cat' and 'Cats'.
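To make the difference concrete, here is a minimal sketch of how such a query string could be assembled, with and without the prefix marker. It is not the watson code: `build_tsquery` and `_NON_WORD` are hypothetical names, the sanitizing regex is a simplification, and the non-prefix variant drops the whole `:*` suffix rather than only the asterisk.

```python
import re

# Hypothetical sanitizer (an assumption, not watson's escaping rules):
# anything that is not a letter, digit or underscore becomes a separator.
_NON_WORD = re.compile(r"[^\w]+")


def build_tsquery(text, prefix=True):
    """Build a to_tsquery() argument from free text.

    With prefix=True every word gets the ':*' marker, so 'cat' also
    matches lexemes that merely start with 'cat'. With prefix=False the
    marker is omitted and only the stemmed word itself is matched.
    """
    suffix = ":*" if prefix else ""
    words = _NON_WORD.sub(" ", text).split()
    # Words are AND-ed together, each wrapped in $$ ... $$ as in the post.
    return " & ".join("$${0}$${1}".format(word, suffix) for word in words)


print(build_tsquery("cat"))                # prefix matching
print(build_tsquery("cat", prefix=False))  # stemming only
```

With prefix matching, PostgreSQL treats the term as "any lexeme beginning with cat", which is why 'Caterpillar' shows up in the results.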

I found the code that creates this string: https://github.com/etianen/django-watson/blob/14be5bc5bb598534f2a5472ae1d4672cee2fee52/watson/backends.py#L181 Maybe my problem is related to this issue: https://github.com/OpenTechStrategies/streetcrm/issues/280 , but I can't figure out what to do. I use PostgreSQL and have explicitly set WATSON_BACKEND = "watson.backends.PostgresSearchBackend", but that didn't help. Please help.

ktyz1992 commented 1 year ago

According to this issue (https://github.com/etianen/django-watson/issues/134), you need to set WATSON_BACKEND = "watson.backends.PostgresLegacySearchBackend". However, the interpreter raises django.core.exceptions.ImproperlyConfigured: Could not find a class named 'watson.backends' in 'PostgresLegacySearchBackend', and indeed I don't see that class in the file. Is such an elementary option as switching prefix matching on and off not implemented in the app?

ktyz1992 commented 1 year ago

I had to create a CustomPostgresSearchBackend class in my application just to remove one asterisk. It is not a beautiful implementation and it involves a lot of redundancy. Please describe this behaviour in the documentation to spare future users the headache.
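For anyone reading later, a rough sketch of what such an override could look like with less duplication. The base class here is a stand-in so the snippet runs on its own; in a real project you would subclass watson.backends.PostgresSearchBackend instead. The method name `escape_postgres_query` and the class name `StemOnlyPostgresBackend` are assumptions for illustration, so check them against the backends.py linked above before relying on this.

```python
class StandInPostgresBackend:
    """Stand-in for watson.backends.PostgresSearchBackend (NOT the real class)."""

    def escape_postgres_query(self, text):
        # Assumed behaviour of the parent: every word becomes a prefix term.
        return " & ".join("$${0}$$:*".format(word) for word in text.split())


class StemOnlyPostgresBackend(StandInPostgresBackend):
    """Same backend, but without prefix matching."""

    def escape_postgres_query(self, text):
        # Reuse the parent's escaping instead of copying it, then strip
        # the ':*' prefix marker that follows each quoted term.
        return super().escape_postgres_query(text).replace("$$:*", "$$")


print(StemOnlyPostgresBackend().escape_postgres_query("cat cats"))
```

The WATSON_BACKEND setting would then point at the dotted path of the subclass, for example a hypothetical "myapp.backends.StemOnlyPostgresBackend".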

etianen commented 1 year ago

You had to subclass a 147 LOC class and override a 7 LOC method to change from a prefix-matching behaviour to a word-stemming behaviour, creating a working solution within a day. I'm sorry this isn't beautiful to you, and resulted in a headache.

This is totally on me. I wrote the code 11 years ago when I was considerably less experienced, and have been maintaining it in my limited free time ever since. Be assured that when I charge for my work, I hold myself to considerably higher standards.