partiql / partiql-lang-kotlin

PartiQL libraries and tools in Kotlin.
https://partiql.org/
Apache License 2.0

[v1] bag constructor allows for whitespace between angle brackets #1498

Open alancai98 opened 4 days ago

alancai98 commented 4 days ago

While adding special operator parsing rules on the v1 branch, I noticed that the parser currently allows whitespace between the angle brackets of the bag constructor:

Welcome to the PartiQL shell!
Typing mode: LEGACY
Using version: 1.0.0-SNAPSHOT-4dd09721
PartiQL> <<1>>;
===' 
<<
  1
>>
--- 
OK!
PartiQL> < < 1 > >;
===' 
<<
  1
>>
--- 
OK!

whereas the parser on main and in all prior releases requires the angle brackets to be adjacent:

Welcome to the PartiQL shell!
Typing mode: LEGACY
Using version: 0.14.5-ab65143c
PartiQL> <<1>>;
==='
<<
  1
>>
---
OK!
PartiQL> < < 1 > >;

mismatched input '<' expecting {'ANY', 'AVG', 'BIT_LENGTH', 'CASE', 'CAST', 'CHARACTER_LENGTH', 'CHAR_LENGTH', 'COALESCE', 'COUNT', 'CREATE', 'CURRENT_DATE', 'CURRENT_USER', 'DATE', 'DELETE', 'DROP', 'EVERY', 'EXCLUDED', 'EXEC', 'EXISTS', 'EXPLAIN', 'EXTRACT', 'DATE_ADD', 'DATE_DIFF', 'FALSE', 'FROM', 'INSERT', 'LOWER', 'MAX', 'MIN', 'NOT', 'NULL', 'NULLIF', 'OCTET_LENGTH', 'OVERLAY', 'POSITION', 'REPLACE', 'SELECT', 'SET', 'SIZE', 'SOME', 'SUBSTRING', 'SUM', 'TIME', 'TIMESTAMP', 'TRIM', 'TRUE', 'UPDATE', 'UPPER', 'UPSERT', 'VALUES', 'LAG', 'LEAD', 'CAN_CAST', 'CAN_LOSSLESS_CAST', 'MISSING', 'PIVOT', 'REMOVE', 'LIST', 'SEXP', '+', '-', '@', '<<', '[', '{', '(', '?', LITERAL_STRING, LITERAL_INTEGER, LITERAL_DECIMAL, IDENTIFIER, IDENTIFIER_QUOTED, ION_CLOSURE}

It looks like https://github.com/partiql/partiql-lang-kotlin/pull/1449 inadvertently introduced this parsing change. I will need to confirm with the team whether the parser should allow this.

alancai98 commented 1 day ago

The same change in v1 also allows a comment between the bag constructor's angle brackets:

PartiQL> </* some comment */< 1 > >;
==='
<<
  1
>>
---
OK!

whereas on main and in prior versions it is a parse error:

PartiQL> </* some comment */< 1 > >;

mismatched input '<' expecting {'ANY', 'AVG', 'BIT_LENGTH', 'CASE', 'CAST', 'CHARACTER_LENGTH', 'CHAR_LENGTH', 'COALESCE', 'COUNT', 'CREATE', 'CURRENT_DATE', 'CURRENT_USER', 'DATE', 'DELETE', 'DROP', 'EVERY', 'EXCLUDED', 'EXEC', 'EXISTS', 'EXPLAIN', 'EXTRACT', 'DATE_ADD', 'DATE_DIFF', 'FALSE', 'FROM', 'INSERT', 'LOWER', 'MAX', 'MIN', 'NOT', 'NULL', 'NULLIF', 'OCTET_LENGTH', 'OVERLAY', 'POSITION', 'REPLACE', 'SELECT', 'SET', 'SIZE', 'SOME', 'SUBSTRING', 'SUM', 'TIME', 'TIMESTAMP', 'TRIM', 'TRUE', 'UPDATE', 'UPPER', 'UPSERT', 'VALUES', 'LAG', 'LEAD', 'CAN_CAST', 'CAN_LOSSLESS_CAST', 'MISSING', 'PIVOT', 'REMOVE', 'LIST', 'SEXP', '+', '-', '@', '<<', '[', '{', '(', '?', LITERAL_STRING, LITERAL_INTEGER, LITERAL_DECIMAL, IDENTIFIER, IDENTIFIER_QUOTED, ION_CLOSURE}

alancai98 commented 1 day ago

If we decide this behavior is unintended, I have a fix as part of https://github.com/partiql/partiql-lang-kotlin/issues/1478 that checks the hidden token stream for whitespace or comments.
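
For anyone following along, here is a minimal sketch of the idea of consulting hidden tokens, not the actual fix. Everything in it is hypothetical: `Tok`, `Channel`, `lex`, `hiddenToRight`, and `isContiguousPair` are illustration-only names, and the toy lexer assumes `<` and `>` are emitted as separate tokens while whitespace and `/* */` comments go to a hidden channel. In ANTLR 4 the analogous lookup is `BufferedTokenStream.getHiddenTokensToRight`; whether the real fix uses it is not stated in this thread.

```kotlin
// Hypothetical, simplified token model. These are NOT the ANTLR or PartiQL classes.
enum class Channel { DEFAULT, HIDDEN }

data class Tok(val text: String, val channel: Channel)

// Toy lexer (an assumption for illustration): '<' and '>' are separate DEFAULT tokens,
// whitespace runs and /* ... */ comments become HIDDEN tokens, anything else is one DEFAULT token.
fun lex(input: String): List<Tok> {
    val toks = mutableListOf<Tok>()
    var i = 0
    while (i < input.length) {
        val c = input[i]
        when {
            c.isWhitespace() -> {
                val start = i
                while (i < input.length && input[i].isWhitespace()) i++
                toks += Tok(input.substring(start, i), Channel.HIDDEN)
            }
            input.startsWith("/*", i) -> {
                // An unterminated comment runs to the end of the input.
                val end = input.indexOf("*/", i + 2)
                val stop = if (end < 0) input.length else end + 2
                toks += Tok(input.substring(i, stop), Channel.HIDDEN)
                i = stop
            }
            c == '<' || c == '>' -> {
                toks += Tok(c.toString(), Channel.DEFAULT)
                i++
            }
            else -> {
                val start = i
                while (i < input.length && !input[i].isWhitespace() &&
                    input[i] != '<' && input[i] != '>' && !input.startsWith("/*", i)
                ) i++
                toks += Tok(input.substring(start, i), Channel.DEFAULT)
            }
        }
    }
    return toks
}

// Hidden tokens directly to the right of tokens[index], up to the next DEFAULT-channel token.
fun hiddenToRight(tokens: List<Tok>, index: Int): List<Tok> =
    tokens.drop(index + 1).takeWhile { it.channel == Channel.HIDDEN }

// True only when tokens[index] and the next DEFAULT-channel token are both `bracket`
// and no whitespace or comment sits between them.
fun isContiguousPair(tokens: List<Tok>, index: Int, bracket: String): Boolean {
    val hidden = hiddenToRight(tokens, index)
    val next = tokens.getOrNull(index + 1 + hidden.size)
    return tokens[index].text == bracket && next?.text == bracket && hidden.isEmpty()
}
```

With this sketch, `isContiguousPair(lex("<<1>>"), 0, "<")` is true, while the same call is false for `< < 1 > >` and for `</* some comment */< 1 > >`, which matches the behavior main has today.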