weggli-rs / weggli

weggli is a fast and robust semantic search tool for C and C++ codebases. It is designed to help security researchers identify interesting functionality in large codebases.
Apache License 2.0

Failure generating query without { } #29

Closed 82marbag closed 2 years ago

82marbag commented 2 years ago

Hi. This query makes tree-sitter query generation fail: weggli '{ _ $x; for (_;_;_) $x = _; }' .

Tree sitter query generation failed: Structure
                         (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2) value: [(cast_expression value: (_)) (_)])]) )
                        ^
sexpr: ((declaration type:(_) declarator:[(identifier) (field_expression) (field_identifier) (scoped_identifier)] @0) )((for_statement "for" @1 initializer:(_) condition:(_) update:(_) [(assignment_expression left: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2 right: [(cast_expression value: (_)) (_)])
                        (init_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2 value: [(cast_expression value: (_)) (_)])
                        (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2) value: [(cast_expression value: (_)) (_)])]) )
This is a bug! Can't recover :/

With { } around the loop body, the query is generated without errors: weggli '{ _ $x; for (_;_;_) {$x = _;} }' .

This is not recognized:

void main() {
        int i;
        for ( i;i;i ) i = 0;
}

while this one is recognized:

void main() {
        int i;
        for ( i;i;i ) {i = 0;}
}
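
For reference, the two loop forms are equivalent at runtime and differ only in syntax: in the tree-sitter C grammar the braced body is wrapped in a compound_statement node, while the unbraced body is a bare expression statement. Below is a minimal, well-defined sketch of both forms that could serve as a test input once unbraced queries work. The function names are hypothetical and the loop bounds were chosen only to avoid reading an uninitialized variable; none of this comes from the weggli sources.

```c
/* Hypothetical test input: both functions behave identically at runtime;
 * only the shape of the loop body differs in the syntax tree. */

/* Unbraced body: the form the query without { } is meant to match. */
int loop_unbraced(void) {
    int i;
    for (i = 0; i < 3; i++) i = 5;
    return i;
}

/* Braced body: the form the query with { } already matches. */
int loop_braced(void) {
    int i;
    for (i = 0; i < 3; i++) { i = 5; }
    return i;
}
```

Saving this to a file and running both queries against it should report one function per query, assuming a weggli build in which the unbraced query generates successfully.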

Is the problem in the way weggli uses tree-sitter here? If the bug is in tree-sitter itself, I'll forward this issue to them.

felixwilhelm commented 2 years ago

Hi, thanks for the report! This looks like a weggli bug, I'll take a look.

sebiiV commented 2 years ago

Hey, I think I've got a similar issue.

'if (!$x) $x=_;' falls over with:

Tree sitter query generation failed: Structure
                         (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2) value: [(cast_expression value: (_)) (_)])]) (#eq? @1 @2))
                        ^
sexpr: ((if_statement "if" @0 condition:(condition_clause value:(unary_expression operator:"!" argument:[(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @1)) consequence:[(assignment_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2 right: [(cast_expression value: (_)) (_)])
                        (init_declarator declarator: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2 value: [(cast_expression value: (_)) (_)]) 
                        (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2) value: [(cast_expression value: (_)) (_)])]) (#eq? @1 @2))

but works with 'if (!$x) {$x=_;}'

I'm aiming to catch one-line if statements like the one below:

    if (!foo)
        foo = bar;
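
A self-contained version of that target pattern, next to its braced twin, might look like the sketch below. The helper names and the const char * type are assumptions made for illustration only. The unbraced function is the case 'if (!$x) $x=_;' is meant to catch; the braced one is the case 'if (!$x) {$x=_;}' already handles.

```c
/* Hypothetical helpers illustrating the "default if unset" idiom. */

/* One-line if without braces: the pattern to catch. */
const char *default_unbraced(const char *foo, const char *bar) {
    if (!foo)
        foo = bar;
    return foo;
}

/* Same logic with a braced consequence. */
const char *default_braced(const char *foo, const char *bar) {
    if (!foo) {
        foo = bar;
    }
    return foo;
}
```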

If this is a separate issue, I can make a new bug report.

felixwilhelm commented 2 years ago

Sorry for the long delay on this issue. This should be fixed with commit 01499e238c9b8e4af514dea1ff637c8e1846db18

Thanks for the detailed reports @sebiiV and @82marbag :)