gwenn / lemon-rs

LALR(1) parser generator for Rust based on Lemon + SQL parser
The Unlicense

Panic on LEFT OUTER JOIN #23

Closed. psarna closed this 8 months ago

psarna commented 1 year ago

Hi! The following statement, which is legal in SQLite:

SELECT * FROM sqlite_master a LEFT OUTER JOIN sqlite_master b;

causes the parser to panic on unreachable code:

2023-03-07T13:53:10.036847Z DEBUG scanner: scan(line: 1, column: 41)    
2023-03-07T13:53:10.036856Z DEBUG scanner: consume(1)    
2023-03-07T13:53:10.036866Z DEBUG scanner: consume(4)    
2023-03-07T13:53:10.036875Z DEBUG sqlite3Parser: Input 'JOIN' with pending reduce 32    
2023-03-07T13:53:10.036883Z DEBUG sqlite3Parser: Reduce 32 [nm ::= JOIN_KW], pop back to state 235.    
2023-03-07T13:53:10.036893Z DEBUG sqlite3Parser: ... then shift 'nm', go to state 234    
2023-03-07T13:53:10.036902Z DEBUG sqlite3Parser: Shift 'JOIN', pending reduce Some(142)    
2023-03-07T13:53:10.036929Z DEBUG sqlite3Parser: Return. Stack=[SELECT distinct selcollist FROM seltablist JOIN_KW nm JOIN]    
2023-03-07T13:53:10.036941Z DEBUG scanner: scan(line: 1, column: 46)    
2023-03-07T13:53:10.036950Z DEBUG scanner: consume(1)    
2023-03-07T13:53:10.036962Z DEBUG scanner: consume(13)    
2023-03-07T13:53:10.036971Z DEBUG sqlite3Parser: Input 'ID' with pending reduce 142    
2023-03-07T13:53:10.036980Z DEBUG sqlite3Parser: Reduce 142 [joinop ::= JOIN_KW nm JOIN], pop back to state 278.    
thread 'tokio-runtime-worker' panicked at 'internal error: entered unreachable code', /home/sarna/.cargo/registry/src/github.com-1ecc6299db9ec823/sqlite3-parser-0.6.0/src/parser/ast/mod.rs:1712:17

gwenn commented 1 year ago

I cannot reproduce with your example:

% cargo run --example sql_cmd
     Running `target/debug/examples/sql_cmd`
SELECT * FROM sqlite_master a LEFT OUTER JOIN sqlite_master b;

psarna commented 1 year ago

Hm, maybe I went too far with minimizing it. Let me double-check.

psarna commented 1 year ago

I'm confused: it fails in our sqld and points to this unreachable instruction, but it works just fine with the sql_cmd example.

psarna commented 1 year ago

@gwenn hey, I think this is an issue with the flavor of the parser that takes ownership of its data. I realized that I used that flavor in my experiments. I modified the example to pass an owned buffer by adding a to_vec():

use fallible_iterator::FallibleIterator;
use sqlite3_parser::lexer::sql::Parser;

/// Parse a string.
// RUST_LOG=sqlite3Parser=debug
fn main() {
    env_logger::init();
    let arg = "PRAGMA parser_trace=ON;";
    // Hand the parser an owned Vec<u8> (via to_vec()) rather than a borrowed slice.
    let mut parser = Parser::new(arg.as_bytes().to_vec());
    loop {
        match parser.next() {
            Ok(None) => break,
            Err(err) => {
                eprintln!("Err: {err} in {arg}");
                break;
            }
            Ok(Some(cmd)) => {
                println!("{cmd}");
            }
        }
    }
}

... and then the example returns garbage:

[sarna@sarna-pc lemon-rs]$ RUST_LOG=info cargo run --release --example sql_cmd
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/examples/sql_cmd`
PRAGMA =ON;er_trace = ON;

gwenn commented 1 year ago

Thanks for your investigation.

gwenn commented 1 year ago

The issue is probably related to this unsafe block: https://github.com/gwenn/lemon-rs/blob/master/src/lexer/scan.rs#L270-L271

gwenn commented 1 year ago

See https://github.com/gwenn/lemon-rs/pull/26/files#diff-3f786ccab1066df60ecf2d56744fa4fbc9b94efe78a12f66d0a5f986e81038baR100 (no more unsafe / transmute, but also no more streaming)
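
For readers following along, here is a minimal sketch of the general failure class, not the actual code in scan.rs. It assumes a hypothetical streaming buffer (StreamBuf, next_token, compact, and text are all made-up names) that discards consumed bytes by shifting the remainder to the front. A token position recorded before the shift reads unrelated bytes afterwards, which resembles the garbled PRAGMA output above. A reference whose lifetime was extended through transmute can go stale in a similar way, but whether this is exactly what happened in the owned-input path is an assumption.

```rust
use std::ops::Range;

/// Hypothetical streaming buffer (not the lemon-rs scanner):
/// holds the input and a cursor, and can drop consumed bytes.
struct StreamBuf {
    data: Vec<u8>,
    pos: usize,
}

impl StreamBuf {
    fn new(input: &[u8]) -> Self {
        StreamBuf { data: input.to_vec(), pos: 0 }
    }

    /// Scan the next space-delimited token and return its byte range.
    fn next_token(&mut self) -> Option<Range<usize>> {
        while self.pos < self.data.len() && self.data[self.pos] == b' ' {
            self.pos += 1;
        }
        if self.pos >= self.data.len() {
            return None;
        }
        let start = self.pos;
        while self.pos < self.data.len() && self.data[self.pos] != b' ' {
            self.pos += 1;
        }
        Some(start..self.pos)
    }

    /// Discard consumed bytes; the remaining bytes move to the front,
    /// so every previously returned range now points at different data.
    fn compact(&mut self) {
        drop(self.data.drain(..self.pos));
        self.pos = 0;
    }

    /// Read the bytes currently stored at `range`.
    fn text(&self, range: Range<usize>) -> String {
        String::from_utf8_lossy(&self.data[range]).into_owned()
    }
}

/// Returns (token read through a stale range, token copied before compaction).
fn stale_vs_copied() -> (String, String) {
    let mut buf = StreamBuf::new(b"PRAGMA parser_trace = ON");
    let first = buf.next_token().expect("input has at least one token");
    let copied = buf.text(first.clone()); // copy taken while the range is valid
    buf.compact(); // buffer contents shift; `first` is now stale
    let stale = buf.text(first);
    (stale, copied)
}
```

Calling stale_vs_copied() yields the correct token only for the copy made before compaction. Copying tokens out, or never shifting the buffer (at the cost of streaming), both avoid the stale read in this sketch.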

gwenn commented 1 year ago

@psarna, @MarinPostma could you please check / review PR #26?

gwenn commented 1 year ago

Version 0.7.0 released