etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License
1.2k stars 130 forks source link

Better special characters handling in query escaping #161

Closed amureki closed 8 years ago

amureki commented 8 years ago

Based on PR https://github.com/etianen/django-watson/pull/155

Just updated harmful characters in regexp with MySQL ones, updated tests.

amureki commented 8 years ago

@etianen yeah, I think you right, better to replace it with spaces. Made fix, waiting for travis checks.

etianen commented 8 years ago

Lovely stuff! Thanks to all involved!

JirkaV commented 8 years ago

Thanks all! I'm currently pretty busy at $JOB, so could not make proposed MySQL changes. Thanks @amureki for picking it up!

amureki commented 8 years ago

@JirkaV mostly all thanks goes for you, I just wanted to get this as soon as its possible and made small changes. :) btw, @etianen when do you think, you'll release a new version with all last fixes?

JirkaV commented 8 years ago

For the record, the changed version removes some safety belt I put in. If someone searches for a text starting or ending with "&" (which is an allowed char), the search will raise an exception (at least on Postgres). I specifically put in a line of code to prevent that, which was removed by @amureki

I'd prefer making MySQL-specific version of escape_query() than sacrificing safety, but don't currently have time to do that, so won't complain too much here ;-)

etianen commented 8 years ago

Hmm, won't the current code still remove the & and replace it with a space? Potentially a call to strip() once the character blacklist has been applied would fix this? On Thu, 14 Apr 2016 at 11:34, Jirka Vejrazka notifications@github.com wrote:

For the record, the changed version removes some safety belt I put in. If someone searches for a text starting or ending with "&" (which is an allowed char), the search will raise an exception (at least on Postgres). I specifically put in a line of code to prevent that, which was removed by @amureki https://github.com/amureki

I'd prefer making MySQL-specific version of escape_query() than sacrificing safety, but don't currently have time to do that, so won't complain too much here ;-)

— You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub https://github.com/etianen/django-watson/pull/161#issuecomment-209871729

etianen commented 8 years ago

Oh, and I can push a new release when I'm back in the office on Monday. On Thu, 14 Apr 2016 at 18:04, Dave Hall dave@etianen.com wrote:

Hmm, won't the current code still remove the & and replace it with a space? Potentially a call to strip() once the character blacklist has been applied would fix this? On Thu, 14 Apr 2016 at 11:34, Jirka Vejrazka notifications@github.com wrote:

For the record, the changed version removes some safety belt I put in. If someone searches for a text starting or ending with "&" (which is an allowed char), the search will raise an exception (at least on Postgres). I specifically put in a line of code to prevent that, which was removed by @amureki https://github.com/amureki

I'd prefer making MySQL-specific version of escape_query() than sacrificing safety, but don't currently have time to do that, so won't complain too much here ;-)

— You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub https://github.com/etianen/django-watson/pull/161#issuecomment-209871729

JirkaV commented 8 years ago

"Hmm, won't the current code still remove the & and replace it with a space?"

No, the "&" character is one of the "good ones" that are left untouched. So if a user provides it at the beginning or end of their search query, they get an exception. The original pull request accounted for that (and some other edge cases).

amureki commented 8 years ago

@JirkaV so, can you, please, tell once again what other edge cases were there? So if you don't have time, I can fix it back and also that one with "&" symbol.

JirkaV commented 8 years ago

Absolutely. Don't get me wrong, I'm happy that you converted my pull request to a working version. I only had a chance to fine-tune it for PostgreSQL. The tests were not broken on MySQL so I assumed it was good enough.

Anyway, if you do something like this in PostgreSQL select to_tsquery('aaa&'), you get error "no operand in tsquery". So the query that's being sent to the DB may not start or end with "&". There was a special line in my original code handling that - basically a simple strip() call.

HTH

Jirka