andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License

Parser does not recognize function calls #391

Open pensnarik opened 6 years ago

pensnarik commented 6 years ago

The parser does not recognize function calls. It interprets them as tokens with a None type. Query example:

SELECT * FROM schema_name.function_name(arg1, arg2);

nickolay commented 6 years ago

sqlparse does recognize function calls, but it has an unusual setup (if I understood the code correctly):

1) The regular tokens, produced by tokenization in the usual sense, are of type Token, which has a ttype (punctuation, keyword, etc.) and a value.
2) Parsing works by "grouping" the tokens into classes deriving from the TokenList type (Function, Case, etc.). These form a tree-ish structure akin to an AST that still contains all the tokens representing the original source at the leaf level.

What makes this confusing is that a TokenList happens to be a Token with ttype=None - for these 'tokens' you have to look at the object's class to determine what it is.

I do wonder if there are plans to change this or if it's worth documenting more clearly. @andialbrecht?

Additionally, in this case qualifying the function name with schema_name seems to confuse the parser:

import sqlparse; sqlparse.parse("SELECT * FROM schema_name.function_name(arg1, arg2);")[0]._pprint_tree()
...
 4 Keyword 'FROM'
 5 Whitespace ' '
 6 Identifier 'schema...'
 |  0 Name 'schema...'
 |  1 Punctuation '.'
 |  2 Function 'functi...'
 |  |  0 Name 'functi...'
 |  |  1 Parenthesis '(arg1,...'

vs

import sqlparse; sqlparse.parse("SELECT * FROM function_name(arg1, arg2);")[0]._pprint_tree()
...
 4 Keyword 'FROM'
 5 Whitespace ' '
 6 Function 'functi...'
 |  0 Identifier 'functi...'
 |  |  0 Name 'functi...'
 |  1 Parenthesis '(arg1,...'