andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.71k stars 693 forks source link

format does not produce same output for queries involving CTE which only differ by extra spaces #782

Closed anuj-wisdom closed 2 months ago

anuj-wisdom commented 2 months ago

Describe the bug format with strip_whitespace=True generates different output depending on whether a space is present before the CTE statement closing paranthesis. These extra spaces in input query should not lead to different outputs for the formatted query in order to allow comparing such queries for equality.

To Reproduce In the following two examples, the only difference is that there is a space after col2 in query 1 and the formatted output retains the space whereas if no space is provided in query 2, formatted output doesn't contain the space either. Note that in both cases, the space between ( and select is removed.

>>> sqlparse.format("with cte as ( select * from t1 where col1 = col2 ) select * from cte", strip_whitespace=True)
'with cte as (select * from t1 where col1 = col2 ) select * from cte'

>>> sqlparse.format("with cte as ( select * from t1 where col1 = col2) select * from cte", strip_whitespace=True)
'with cte as (select * from t1 where col1 = col2) select * from cte'

The same issue can also be noticed when using the reindent=True option.

Expected behavior Expected to get the same output for both the cases. Ideally, the output of query 2 with no space.

Versions (please complete the following information):

Additional context

andialbrecht commented 2 months ago

Thanks for reporting and for the example!