zhouqingqing / qpmodel

A Relational Optimizer and Executor
MIT License
64 stars 18 forks source link

[bug] parser silently truncate unrecognized part #200

Closed zhouqingqing closed 3 years ago

zhouqingqing commented 3 years ago
            sql = "select* from a outer join b on(a1 <> b1) where 100 > null";

parser will actually parse "select * from a" as we need left|right|full prefix for outer.

zhouruiapple commented 3 years ago

Silently truncate unrecognized part is the default behavior of Antlr4. For example: select * hello world is unrecognized part Only "select *" is parsed and no exception is thrown. I think this is not a bug of Antlr4.

The join rule of syntax from our SQLite.g4 is:

join_clause
 : table_or_subquery ( join_operator table_or_subquery join_constraint )*
 ;

join_operator
 : K_NATURAL? ( K_LEFT K_OUTER? | K_RIGHT K_OUTER? | K_FULL K_OUTER? | K_INNER | K_CROSS )? K_JOIN
 ;

The semantics is that outer cannot appear alone and must be qualified by left, right, and full. Otherwise, all rules after the join operation cannot be matched.

What is the expected behavior for us? If we expect an exception to be thrown for an unrecognized part, we can verify the length of the parsed sql. The exception is thrown from @SQLparser.cs not from the code which is generate by Antlr4.

If my understanding is correct, I will fix this bug based on the above understanding.

zhouqingqing commented 3 years ago

What is the expected behavior for us?

We can follow PostgreSQL's behavior: it complains if anything before ';' not recognizable.

we can verify the length of the parsed sql

Will this work if I have some space after ';'?

zhouruiapple commented 3 years ago

@zhouqingqing Got it. I'll give a practical solution to resolve the problem(above just suggestion).