SQLFTW / sqlftw

SQL lexer, parser, model and static analysis in PHP (for MySQL dialect)

reference to function expression in count #15

Closed staabm closed 1 year ago

staabm commented 1 year ago

I am running the query 'SELECT count(email) from ada' through this code:

        $queryString = 'SELECT count(email) from ada';

        $platform = Platform::get(Platform::MYSQL, '8.0'); // version defaults to x.x.99 when no patch number is given
        $session = new Session($platform);
        $parser = new Parser($session);

        // returns a Generator. will not parse anything if you don't iterate over it
        $commands = $parser->parse($queryString);
        foreach ($commands as [$command, $tokenList, $start, $end]) {
            // Parser does not throw exceptions. this allows parsing partially invalid code without failing on the first error
            if ($command instanceof SelectCommand) {
                $columns = $command->getColumns();
                foreach ($columns as $i => $column) {
                    $expression = $column->getExpression();
                    if ($expression instanceof FunctionCall && $expression->getFunction()->getName() == BuiltInFunction::COUNT) {
                        // $expression->
                    }
                }
            }
        }

When I pause in the debugger at the // $expression-> line, I see this $command:

[screenshot of the $command object in the debugger]

From the command I have at hand, I can easily find the SimpleName of the table involved, and also the name of the argument passed to count. What I am missing is a way to retrieve the raw count(email) expression, so that I can combine the parser results with the results of other tooling involved.

staabm commented 1 year ago

in https://github.com/staabm/phpstan-dba/pull/505 I was able to work around the missing raw expression

paranoiq commented 1 year ago

hi. i think i understand what you need, but it is not possible/reliable for now. all expression nodes implement SqlSerializable, but at this point the model does not support whitespace and normalizes letter case on things like keywords and built-in object names. so it is not guaranteed that the serialized version exactly matches the original - e.g. COUNT( foo ) vs count(foo)
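
to illustrate why a normalized serialization cannot be used as a key back into the original string, here is a toy sketch in plain PHP. toyNormalize() is a hypothetical stand-in and not the library's serializer; it simply assumes that whitespace is dropped and the function name is lower-cased, which may differ from what SqlSerializable actually emits.

```php
<?php
declare(strict_types=1);

// Hypothetical toy normalizer, NOT sqlftw's serializer:
// removes all whitespace and lower-cases a leading function name.
function toyNormalize(string $expr): string
{
    $noSpace = (string) preg_replace('/\s+/', '', $expr);

    return (string) preg_replace_callback(
        '/^[a-z_]+(?=\()/i',
        fn (array $m): string => strtolower($m[0]),
        $noSpace
    );
}

$a = 'COUNT( foo )';
$b = 'count(foo)';

// Both spellings collapse to the same normalized form,
// so the original spelling cannot be recovered from it.
var_dump(toyNormalize($a) === toyNormalize($b));
var_dump(toyNormalize($a) === $a);
```

two different source spellings map to one normalized string, so searching the original query for the serialized form can miss.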

i would like to implement this later, but it is quite hard to do. right now the SQL model is as minimal as can be, and a lot of scalars would have to be converted to expression nodes so they can hold offsets and/or original tokens

paranoiq commented 1 year ago

maybe i will try to provide offsets to the original SQL string as a part of SelectExpression, solely as a temporary solution to this problem with keys. that might be pretty simple
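
a minimal sketch of how such offsets could be consumed on the caller side, assuming the parser exposed a start and end byte offset for each select expression. rawExpression() is a hypothetical helper, and the offsets below are computed by hand with strpos() purely for illustration; nothing here is an existing sqlftw API.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the slice of the original SQL
// between two byte offsets (start inclusive, end exclusive).
function rawExpression(string $sql, int $start, int $end): string
{
    return substr($sql, $start, $end - $start);
}

$queryString = 'SELECT count(email) from ada';

// Assumption: in a real integration these offsets would come from the parser.
// Here they are located by hand for the example query.
$start = strpos($queryString, 'count(');
$end = strpos($queryString, ')', $start) + 1;

echo rawExpression($queryString, $start, $end), "\n";
```

because the slice is taken from the original string, whitespace and letter case are preserved exactly as written, which is what makes it usable as a key for other tooling.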