antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

SQLite grammar: top level `parse` rule accepts statements not separated by semicolons #2827

Open juretta opened 1 year ago

juretta commented 1 year ago

The SQLite language syntax defines a sql-stmt-list that consists of zero, one or more sql-stmts. Each sql-stmt should be separated by a semicolon. See https://www.sqlite.org/lang.html.

Here is a simple example with SQLite (SQLite 3.37.0):

sqlite> SELECT Title FROM Album LIMIT 2; SELECT ArtistId FROM Album LIMIT 3;
For Those About To Rock We Salute You
Balls to the Wall
1
1
2

Omitting the first semicolon (and hence not properly separating the SELECT statement) leads to an error:

sqlite> SELECT Title FROM Album LIMIT 2 SELECT ArtistId FROM Album LIMIT 3;
Error: in prepare, near "SELECT": syntax error (1)

The ANTLR grammar currently accepts the second form but IMHO shouldn't.

Root cause

The grammars-v4 grammar for the SQLite SQL dialect defines the parse rule as follows:

https://github.com/antlr/grammars-v4/blob/07314e4615982ba77864d7b8cd804c7b5d803bb0/sql/sqlite/SQLiteParser.g4#L37-L42

So parse also matches a sequence of sql_stmt_lists that each consist of only a single sql_stmt, and therefore accepts SQL statements that are not separated by semicolons.

Instead the parse rule should probably be:

parse: (sql_stmt_list)? EOF
;

This accepts at most one sql_stmt_list and hence requires sql_stmts to be separated by SCOL (semicolon).
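The difference between the two forms of parse can be illustrated with a small token-level model. This is only a sketch, not the grammar itself: it assumes sql_stmt_list has the shape `SCOL* sql_stmt (SCOL+ sql_stmt)* SCOL*` (the linked lines are not reproduced here, so treat that shape as an assumption), and it stands in `S` for a whole sql_stmt and `;` for SCOL. The names `SQL_STMT_LIST`, `CURRENT_PARSE`, `PROPOSED_PARSE` and `accepts` are made up for this illustration.

```python
import re

# Token model (assumption for illustration): "S" = one sql_stmt, ";" = SCOL.
# Assumed shape of sql_stmt_list: SCOL* sql_stmt (SCOL+ sql_stmt)* SCOL*
SQL_STMT_LIST = r";*S(?:;+S)*;*"

# parse: (sql_stmt_list)* EOF   (repetition, as described for the current grammar)
CURRENT_PARSE = re.compile(rf"(?:{SQL_STMT_LIST})*")
# parse: (sql_stmt_list)? EOF   (at most one list, as proposed)
PROPOSED_PARSE = re.compile(rf"(?:{SQL_STMT_LIST})?")


def accepts(rule, tokens):
    """Return True if the whole token string matches the rule (EOF = end of string)."""
    return rule.fullmatch(tokens) is not None


# "S;S;" models two statements separated by a semicolon,
# "SS;" models the second form, with the first semicolon omitted.
for tokens in ["S;S;", "SS;"]:
    print(tokens, accepts(CURRENT_PARSE, tokens), accepts(PROPOSED_PARSE, tokens))
```

In this model both rules accept `S;S;`, but only the repeated form accepts `SS;`, because each `S` can be consumed as its own one-statement sql_stmt_list. With the optional form the second `S` can only be reached through `SCOL+`.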

juretta commented 1 year ago

Here is a change with accompanying tests: https://github.com/antlr/grammars-v4/compare/master...juretta:grammars-v4:master#diff-8bf47bc01d73b92fd763a6f79c0a9827e7006434ce463139e2f40cecc6a2ee1a

Changing parse causes the generated parser to emit a single (optional) sql_stmt_list instead of a list of them. I'm not entirely sure about the ramifications here, but this seems to break all code that relies on the previous behaviour, so updating to the new grammar would be a breaking change.
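To make the breaking change concrete, here is a rough sketch of how calling code could tolerate both shapes. It assumes the generated context accessor `sql_stmt_list()` returns a list under the repeated rule and a single context or `None` under the optional rule, which is my understanding of ANTLR's Python target. `OldParseCtx`, `NewParseCtx` and `stmt_lists` are hypothetical stand-ins written for this example, not generated code.

```python
class OldParseCtx:
    """Stand-in for a context generated from parse: (sql_stmt_list)* EOF."""

    def __init__(self, lists):
        self._lists = list(lists)

    def sql_stmt_list(self, i=None):
        # Repeated rule: all children as a list, or the i-th child.
        return self._lists if i is None else self._lists[i]


class NewParseCtx:
    """Stand-in for a context generated from parse: (sql_stmt_list)? EOF."""

    def __init__(self, single=None):
        self._single = single

    def sql_stmt_list(self):
        # Optional rule: one child, or None when the input is empty.
        return self._single


def stmt_lists(parse_ctx):
    """Return the sql_stmt_list children as a list for either context shape."""
    result = parse_ctx.sql_stmt_list()
    if result is None:
        return []
    if isinstance(result, list):
        return result
    return [result]


print(stmt_lists(OldParseCtx(["list-a", "list-b"])))
print(stmt_lists(NewParseCtx("list-a")))
print(stmt_lists(NewParseCtx()))
```

Code that iterates over `ctx.sql_stmt_list()` or indexes into it directly would need a change along these lines, or a wrapper like the one above.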