etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License

slash (/) not treated as word separator #307

Closed: supern8ent closed this issue 1 month ago

supern8ent commented 3 months ago

First, thanks for creating this plugin and for continuing to maintain it! It was really easy to add to my project and it does a great job.

The issue I wanted to raise is that something like "pets/companions" isn't searchable the way I would have expected: searching for "companion" won't find the record, and I think that ideally it should.

As a language-independent test, I put "aadjsk/ddjjeeef" in a text field. Searching "aadjsk" with Watson finds the record, but searching "ddjjeeef" does not. Although I haven't done the research to understand the watson_searchentry.search_tsv column, I can see "aadjsk/ddjjeeef" in that field, but neither word independently.

As a more practical test, I compared "pets/companions" with "pets companions". The former only puts "pets/companions" in search_tsv, whereas the latter puts "pet" and "companion" in as separate keys in search_tsv. So it appears that the cause is that the / character is not being treated as a word separator the way spaces and newlines are.

Sorry if I missed it, but I didn't find any option/configuration that pertains to this question. Is there a way to modify this behavior? This also got me wondering what characters besides space and newline are treated as word separators. I tried "free-loader" and that put "free", "free-load", and "loader" in search_tsv.

etianen commented 2 months ago

Which database backend are you using? The capabilities of django-watson are heavily dependent on that choice:

https://github.com/etianen/django-watson/wiki/Database-support

supern8ent commented 2 months ago

I'm using PostgreSQL 15.

etianen commented 1 month ago

You can subclass PostgresSearchBackend and override escape_postgres_query:

https://github.com/etianen/django-watson/blob/master/watson/backends.py#L178C9-L178C30

You then need to provide the path to your subclass in the WATSON_BACKEND setting.
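
For anyone landing here later, a minimal sketch of the kind of query rewriting such an override might do. It is not taken from django-watson: the function name `build_prefix_tsquery`, the set of stripped characters, and the `word:*` output format are all assumptions made for illustration. It treats "/" like whitespace and builds an AND-ed prefix query string.

```python
import re

# Characters with special meaning in tsquery syntax, replaced by spaces.
# This set is an assumption for the sketch, not copied from django-watson.
TSQUERY_SPECIAL = re.compile(r"[()|!:<>&*'$\\]")

# Extra characters to treat like whitespace; "/" is the case from this issue.
EXTRA_SEPARATORS = re.compile(r"/+")


def build_prefix_tsquery(text):
    """Turn free text into an AND-ed prefix tsquery string, splitting on "/".

    Hypothetical helper: the real escape_postgres_query may quote words
    differently, so compare against the linked source before relying on it.
    """
    text = EXTRA_SEPARATORS.sub(" ", text)   # "pets/companions" -> "pets companions"
    text = TSQUERY_SPECIAL.sub(" ", text)    # drop tsquery operators
    words = text.split()
    return " & ".join("{0}:*".format(word) for word in words)


if __name__ == "__main__":
    print(build_prefix_tsquery("pets/companions"))
```

In a project this would presumably be returned from escape_postgres_query on a subclass of PostgresSearchBackend (say a hypothetical `myapp.search.SlashAwareBackend`), with WATSON_BACKEND pointing at that class; check the method's actual signature in the linked source first. Note that this only changes how the search query is built. The thread does not confirm whether the text stored in search_tsv also needs the "/" replaced before indexing for "companion" to match.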