andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.76k stars 700 forks source link

Incorrect parsing of string aggregation with group ordering ("LISTAGG(...) WITHIN GROUP(ORDER BY ... )") #700

Open turekv opened 1 year ago

turekv commented 1 year ago

Expressions containing string aggregation with group ordering are parsed incorrectly. Example:

statement = parse("SELECT LISTAGG(attr,', ') WITHIN GROUP(ORDER BY attr) as column FROM table")
print(statement[0].tokens)

results in:

[<DML 'SELECT' at 0x225D838FB80>,
 <Whitespace ' ' at 0x225D83945E0>,
 <Identifier 'LISTAG...' at 0x225D83BAF90>,    # "LISTAGG(attr,', ') WITHIN"
 <Whitespace ' ' at 0x225D8385460>,
 <Identifier 'GROUP(...' at 0x225D83BAC80>,    # "GROUP(ORDER BY attr) as column"
 <Whitespace ' ' at 0x225D83A6DC0>,
 <Keyword 'FROM' at 0x225D83A6E20>,
 <Whitespace ' ' at 0x225D83A6E80>,
 <Keyword 'table' at 0x225D83A6EE0>]

I.e., according to sqlparse a table with two columns is to be returned by this SQL query:

However, this is not the case. The query returns a single column (LISTAGG(attr,', ') WITHIN GROUP(ORDER BY attr), alias: column).