andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License

[Feature Request] Better keywords control #590

Open kavimaluskam opened 4 years ago

kavimaluskam commented 4 years ago

Problem

Common identifiers in SQL queries, such as uuid, private, and version, are recognized as keywords, which affects the tokenization result.

Proposal

It would help if the user could choose the SQL engine they are using, e.g.

tokens = sqlparse.parse(sql, engine='postgresql')[0]

so that the is_keyword function uses only common_keywords and psql_keywords during parsing.
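A minimal sketch of the idea, using only the standard library. The tables COMMON_KEYWORDS and ENGINE_KEYWORDS, their contents, and this is_keyword signature are all made up for illustration; they are not sqlparse's actual data or API, and the engine= argument shown above does not exist in sqlparse today.

```python
# Hypothetical keyword tables: names and contents are illustrative only,
# not sqlparse's real keyword data.
COMMON_KEYWORDS = {"SELECT", "FROM", "WHERE", "AS", "WITH"}
ENGINE_KEYWORDS = {
    "postgresql": {"BOX", "ILIKE", "RETURNING"},
    "mysql": {"STRAIGHT_JOIN", "REPLACE"},
}


def is_keyword(value, engine=None):
    """Return True if `value` counts as a keyword for the chosen engine."""
    word = value.upper()
    if word in COMMON_KEYWORDS:
        return True
    if engine is None:
        # No engine selected: fall back to the union of every engine's
        # keywords, which mirrors the behaviour described in this issue.
        return any(word in kws for kws in ENGINE_KEYWORDS.values())
    # Engine selected: consult only that engine's table.
    return word in ENGINE_KEYWORDS.get(engine, set())


print(is_keyword("box", engine="postgresql"))
print(is_keyword("box", engine="mysql"))
```

With a lookup like this, a word that appears only in another engine's table would be lexed as a plain name once an engine is selected, while the default behaviour stays unchanged.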

silpheel commented 3 years ago

Similarly, identifiers like box, which are not reserved in MySQL but are in, e.g., PostgreSQL, get capitalized, so I'd also like some way to specify the engine if possible. In the meantime, backticks can be used to work around this.

hychen20 commented 3 years ago

Similar issue here: final is recognized as a keyword, but it shouldn't be, given that the query is written for MySQL. For example:

import sqlparse
sql = """
with final as (
    select * from test
) select * from final;
"""
stmt = sqlparse.parse(sql)[0]
print(stmt.tokens)

Output:

[<Newline ' ' at 0x7FB98003BAC0>, <CTE 'with' at 0x7FB98019EDC0>, <Whitespace ' ' at 0x7FB98019EE20>, <Keyword 'final' at 0x7FB98019EE80>, <Whitespace ' ' at 0x7FB98019EEE0>, <Keyword 'as' at 0x7FB98019EF40>, <Whitespace ' ' at 0x7FB98019EFA0>, <Parenthesis '( ...' at 0x7FB980160F90>, <Whitespace ' ' at 0x7FB9801A65E0>, <DML 'select' at 0x7FB9801A6640>, <Whitespace ' ' at 0x7FB9801A66A0>, <Wildcard '*' at 0x7FB9801A6700>, <Whitespace ' ' at 0x7FB9801A6760>, <Keyword 'from' at 0x7FB9801A67C0>, <Whitespace ' ' at 0x7FB9801A6820>, <Keyword 'final' at 0x7FB9801A6880>, <Punctuation ';' at 0x7FB9801A68E0>]