etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License

watson removes apostrophes from search term #37

Closed clime closed 10 years ago

clime commented 10 years ago

It seems that single apostrophes are removed from a search term before the query is sent to the database. Because of that, names like "Montagne d'Argent" are not found when searching for "d'Argent".

etianen commented 10 years ago

Which database backend are you using? Their behaviours can vary a little.

clime commented 10 years ago

It is postgresql.

etianen commented 10 years ago

I've just updated MASTER with a fix. Please check that it works with your project. If so, I can push out a new release today.


clime commented 10 years ago

Searching with apostrophes works now. Thanks! However, it works in a slightly curious fashion. This is the content of connection.queries after performing the search (filter) query:

[{u'time': u'0.037', u'sql': u'SELECT "web_crag"."id", "web_crag"."country_id", "web_crag"."latitude", "web_crag"."longitude", "web_crag"."location_index", "web_crag"."date_created", "web_crag"."last_modified", "web_crag"."name", "web_crag"."normalized_name", "web_crag"."type", "web_crag"."description", "web_crag"."added_by_id" FROM "web_crag" , "watson_searchentry" WHERE (watson_searchentry.engine_slug = \'default\') AND (watson_searchentry.search_tsv @@ to_tsquery(\'pg_catalog.english\', \'d\'\'\'\'argent:*\')) AND (watson_searchentry.object_id_int = "web_crag"."id") AND (watson_searchentry.content_type_id = 9) LIMIT 21'}]

As you can see at to_tsquery(\'pg_catalog.english\', \'d\'\'\'\'argent:*\')), it seems that apostrophes are being double-escaped (there are four apostrophes instead of just two): once in watson and then again somewhere in Django. It doesn't matter in the end, because PostgreSQL splits words on apostrophes, so one or two apostrophes give the same result (though not the same as zero). In fact, I simply tried removing the apostrophe from escape_postgres_query_chars and it worked well. I suspect it is not really needed there.
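The doubling described above can be reproduced outside the database. This is only a minimal sketch: both function names are hypothetical, and `quote_as_sql_literal` is a rough stand-in for the quoting a database driver applies to a string parameter, not Django's or watson's actual code.

```python
def escape_in_application(term):
    # Hypothetical manual pre-escaping: double each apostrophe by hand.
    return term.replace("'", "''")


def quote_as_sql_literal(value):
    # Rough stand-in (an assumption) for driver-side quoting of a string
    # parameter: double each apostrophe, then wrap the value in quotes.
    return "'" + value.replace("'", "''") + "'"


term = "d'argent"

# Escaped by hand first, then quoted again when bound as a parameter.
double_escaped = quote_as_sql_literal(escape_in_application(term) + ":*")

# Left untouched and quoted only once when bound as a parameter.
single_escaped = quote_as_sql_literal(term + ":*")

print(double_escaped)  # 'd''''argent:*'
print(single_escaped)  # 'd''argent:*'
```

With the manual step removed, the literal carries two apostrophes, which is the usual SQL spelling of a single apostrophe inside a string.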

etianen commented 10 years ago

I've simplified the query escaper as per your suggestion, and it seems to be passing all tests on all database backends. Assuming it's still working for you, I'll get a new release out today.
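For illustration only, a simplified escaper along these lines might look like the sketch below. The character set, the constant name, and the function name are assumptions made for this example and are not taken from the watson source. Edge cases such as a term that begins with an apostrophe are not considered.

```python
import re

# Hypothetical set of characters treated as tsquery operators in this sketch.
# The apostrophe is deliberately left out, so it is passed through unchanged
# and left for the database driver to quote.
TSQUERY_SPECIAL_CHARS = re.compile(r"[&|!():<>\\]")


def build_prefix_tsquery(search_text):
    # Replace operator characters with spaces, then AND the remaining words
    # together as prefix matches.
    cleaned = TSQUERY_SPECIAL_CHARS.sub(" ", search_text)
    return " & ".join(word + ":*" for word in cleaned.split())


print(build_prefix_tsquery("Montagne d'Argent"))  # Montagne:* & d'Argent:*
```

The resulting string would then be passed to the database as an ordinary bound parameter, so any apostrophes are quoted exactly once.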


clime commented 10 years ago

Yes, that works as expected. Thank you :).

etianen commented 10 years ago

django-watson 1.1.3 is now up!
