TryGhost / NQL

MIT License
4 stars 8 forks source link

🐛 Fixed escaping of forward slash, percentage and underscore signs #70

Closed SimonBackx closed 11 months ago

SimonBackx commented 11 months ago

fixes https://github.com/TryGhost/Product/issues/3965

Context

When we parse and execute a NQL startWith/contains/endWith filter that contains a slash in SQL, the NQL string filter will get converted to a MongoDB query object, which uses a Regex to represent startsWith/contains/endWith filters. If we execute this NQL filter in SQL, this Mongo DB query object will get converted to called knex methods. So the Regex will eventually be converted to a SQL LIKE query.

1. SQLite: when a startsWith filter string contains /

refs https://ghost.slack.com/archives/C02G9E68C/p1695812152570319?thread_ts=1695741311.759409&cid=C02G9E68C

A forward slash is a special Regex character. When converting the NQL string to the MongoDB object, it will get escaped and stored as \/ in the Regex. Currently, when we convert the Regex to the LIKE query, we don't remove this escaping.

MySQL seems to ignore this (kinda incorrect). SQLite doesn't like it, and this breaks queries on SQLite that use slashes.

The solution here is simple: remove the backslash when converting the Regex to LIKE, just like we do with all the other special Regex characters that we currently unescape correctly.

2. We don't escape % and _, which have a special meaning in LIKE queries

Image

In the picture above, the query in the end is translated to something ± LIKE '%simo____%', which has a total different meaning in SQL. (_ means match any character) It should be something along the lines of LIKE '%simo\_\_\_\_%' As you can see, the results in the background of the screenshot are not correct.

To fix this, we should escape the % and _ characters, as explained in both the MySQL and SQLite documentation pages. A logical candidate for escapes would be to use the backslash character. However, this is not possible because we use binded parameters in Knex (for security considerations we should definitely not remove that), and knex will reescape backslashes for a second time.

whereRaw('?? LIKE ?`, ['href', `LIKE '%simo\\_\\_\\_\\_%'`]); // \\ is escaped in JS so the string only contains one backslash!

Will eventually send the following query to the server (note this isn't double escaped manually or in the logs):

WHERE `href` LIKE '%simo\\_\\_\\_\\_%'

To fix this, we can use a different ESCAPE character in both MySQL and SQLite.

whereRaw('?? LIKE ? ESCAPE ?`, ['href', `LIKE '%simo*_*_*_*_%'`, '*']); 
WHERE `href` LIKE '%simo*_*_*_*_%' ESCAPE '*'

This works as expected in MySQL:

Image

And in SQLite too:

Image

Using the escape character in filters also still works if we double escape them (so * = ** in the real query):

Image

Image

Filtering on a backslash works as expected:

Image

codecov-commenter commented 11 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (ca8ce31) 79.49% compared to head (9d36547) 79.71%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #70 +/- ## ========================================== + Coverage 79.49% 79.71% +0.22% ========================================== Files 9 9 Lines 1551 1568 +17 Branches 313 316 +3 ========================================== + Hits 1233 1250 +17 Misses 313 313 Partials 5 5 ``` | [Files](https://app.codecov.io/gh/TryGhost/NQL/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TryGhost) | Coverage Δ | | |---|---|---| | [packages/mongo-knex/lib/convertor.js](https://app.codecov.io/gh/TryGhost/NQL/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=TryGhost#diff-cGFja2FnZXMvbW9uZ28ta25leC9saWIvY29udmVydG9yLmpz) | `86.82% <100.00%> (+0.38%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.