appsignal / sql_lexer

Rust library to lex and sanitize SQL
Apache License 2.0
20 stars 2 forks source link

SQL queries with '#' character not being on a comment are not properly parsed #16

Closed luismiramirez closed 3 years ago

luismiramirez commented 3 years ago

When an SQL query contains the '#' character outside of the context of a comment, it's not correctly parsed.

Example:

SELECT table1.*, NULLIF((table2.bar #>> '{att1,att2}')::float, 0) AS alias1,
NULLIF((table2.bar #>> '{att1,att2}')::float, 0) * table1.foo AS alias2 FROM "table1"
LEFT JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE time <= table1.created_at ORDER BY time DESC LIMIT 1
) WHERE (
(table1.start_date <= '2021-08-30' AND (table1.end_date >= '2021-08-30' OR table1.end_date IS NULL))) AND "table1"."foo" = 2
ORDER BY "table1"."end_date" ASC, "table1"."id" DESC LIMIT 50"

It gets parsed as:

SELECT table1.*, NULLIF((table2.bar
SELECT id FROM table2
WHERE TIME <= table1.created_at
ORDER BY TIME DESC LIMIT 1
) WHERE "table1"."type" = $1 AND ((table1.start_date <= ? AND (table1.end_date >= ? OR table2.end_date IS NULL)))
AND "table1"."foo" = $2
ORDER BY "table1"."end_date" DESC, offers.id DESC LIMIT $3 OFFSET $4

The parser interprets the '#' as the beginning of a comment and will omit it until it considers the comment has ended.

'#>>' is a valid operator for Postgres' JSON queries.