etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License

Some search terms seem to be ignored #262

Closed: jmb closed this issue 4 years ago

jmb commented 4 years ago

I have watson set up for my project, but sometimes it returns no results when I think it should. For example:

>>> watson.search("lie")
<QuerySet []>
>>> watson.search("det")
<QuerySet [<SearchEntry: 1:29:15 The 7 Card Lie Detector>, <SearchEntry: 1:53:25 The 7 Card Lie Detector>, <SearchEntry: 2:07:25 Question: Clarification on spelling of suits for The 7 Card Lie Detector>]>
>>>
>>> watson.search("Across")
<QuerySet []>
>>> watson.search("awesome")
<QuerySet [<SearchEntry: 0:25:25 2. Basic Coins Across - A simple but awesome traveling coins effect>]>
>>> watson.search("awesome").first().content
'2.\tBasic Coins Across - A simple but awesome traveling coins effect'
>>> watson.search("awesome").first().title
'0:25:25 2.\tBasic Coins Across - A simple but awesome traveling coins effect'
>>> watson.search("awesome").first().description
''

Can anyone shed any light on why this would happen?

jmb commented 4 years ago

I should add that I am using MySQL as my database.

etianen commented 4 years ago

No idea, sorry. If you can identify an issue, I'll take an MR.

On Wed, 27 Nov 2019 at 14:58, Jonathan Batchelor notifications@github.com wrote:

I should add that I am using MySQL as my database.

jmb commented 4 years ago

Thanks. I've found the SQL in your code and am trying it directly:

SELECT * FROM watson_searchentry WHERE MATCH (title, description, content) AGAINST ('across' IN BOOLEAN MODE);

The same issues arise, so it's obviously a MySQL issue. Weird!

jmb commented 4 years ago

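Aha, I've found the issue. For MyISAM full-text indexes the default minimum word length is 4, so a three-letter word like "lie" is never indexed, and there is a list of "stop words" that are ignored, which includes "across": https://dev.mysql.com/doc/refman/8.0/en/fulltext-stopwords.html#fulltext-stopwords-stopwords-for-myisam-search-indexes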
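
For anyone else hitting this, here is a rough Python sketch of the two filters described above, applied to the titles from the original report. It is only an approximation of what MySQL does: the tokenizer is a simple regex, `indexable_words` is a hypothetical helper name, and `SAMPLE_STOPWORDS` is a small hand-picked subset of the MyISAM stop word list, not the full list.

```python
import re

# Assumed MyISAM default minimum word length (the ft_min_word_len setting).
MIN_WORD_LEN = 4

# Illustrative subset of the MyISAM stop word list; the real list is much longer.
SAMPLE_STOPWORDS = {"about", "across", "after", "again", "the"}


def indexable_words(text, min_len=MIN_WORD_LEN, stopwords=SAMPLE_STOPWORDS):
    """Return the lowercased words that survive both filters (hypothetical helper)."""
    # Rough tokenizer: runs of letters, digits, underscores and apostrophes.
    words = re.findall(r"[a-z0-9_']+", text.lower())
    # Drop words that are too short, then words on the stop word list.
    return [w for w in words if len(w) >= min_len and w not in stopwords]


print(indexable_words("The 7 Card Lie Detector"))
# ['card', 'detector']
print(indexable_words("Basic Coins Across - A simple but awesome traveling coins effect"))
# ['basic', 'coins', 'simple', 'awesome', 'traveling', 'coins', 'effect']
```

Neither "lie" nor "across" survives, which matches the empty results above. As far as I can tell from the MySQL documentation, changing this for MyISAM means lowering `ft_min_word_len` (and, if needed, pointing `ft_stopword_file` at a different list) in the server configuration, restarting, and rebuilding the full-text index, so it is a database setting rather than something watson controls.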