PabloRMira / sql_formatter

A Python based SQL formatter
https://pablormira.github.io/sql_formatter/
Apache License 2.0

Format Query within Column List #170

Open kfordaccela opened 3 years ago

kfordaccela commented 3 years ago

Describe the bug

When formatting a query that has a subquery within the column list, the output is not formatted as expected.

To Reproduce

from sql_formatter.core import format_sql

print(format_sql("SELECT Column1, Column2, (SELECT column3 from SubTable) Column4 FROM TABLE"))

output: SELECT_column1, column2, (SELECT_column3 _FROM_subtable)column4 FROMtable

Expected behavior

SELECT column1, column2, ( __SELECT __column3 __FROM_subtable ____)_column4 FROM table

Screenshots

Sorry for the "_" characters; they stand in for whitespace, because the 'code' formatting in the editor was dropping it.

PabloRMira commented 3 years ago

Hi @kfordaccela, I'm actually still wondering whether such a query is valid in the first place. What SQL dialect are you using?

What if subtable has more rows than table?

As a workaround, I would suggest the more classical approach with a join, e.g.:


SELECT a.column1,
       a.column2,
       b.column3
FROM table AS a
    LEFT JOIN subtable AS b
        ON a.column1 = b.column1

kfordaccela commented 3 years ago

I have been writing SQL for a number of years and recognize that a LEFT JOIN would have been better. However, in the use case I described, the result is often a sum or an average that would have required a GROUP BY or a PARTITION BY, which the original coder did not favor for some reason, so they wrote it as a subquery within the list of columns pulled into the report.
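To make the two styles concrete, here is a minimal sketch that runs an aggregate both as a subquery in the column list and as a LEFT JOIN with GROUP BY. It uses SQLite through Python's standard `sqlite3` module only because it needs no setup; other dialects may behave differently. The `customers` and `orders` tables and their columns are invented for illustration and do not come from the original report.

```python
import sqlite3

# In-memory database with made-up sample tables (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# Style 1: the aggregate is a correlated subquery inside the column list.
subquery_sql = """
    SELECT c.name,
           (SELECT SUM(o.amount)
            FROM orders AS o
            WHERE o.customer_id = c.customer_id) total_amount
    FROM customers AS c
    ORDER BY c.name
"""

# Style 2: the same result via LEFT JOIN and GROUP BY.
join_sql = """
    SELECT c.name,
           SUM(o.amount) AS total_amount
    FROM customers AS c
        LEFT JOIN orders AS o
            ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.name
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(subquery_rows)
print(join_rows)
```

On this sample data both queries return one row per customer with the summed amount, so the subquery form is at least accepted by SQLite.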

For comparison, I have also been trying the following formatter, which does format the query as I suggested above: https://zeroturnaround.github.io/sql-formatter/

The whole reason I care about careful formatting is that I use the formatted output to parse out the tables and columns used in the query, so that I can build a list for further work in Python. When the query is formatted properly, I can find the table aliases much faster.
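As a rough illustration of that kind of post-processing, the sketch below pulls table names and aliases out of a formatted query with a regular expression. The helper `find_table_aliases` is hypothetical and not part of sql_formatter, and the approach is an assumption about how such parsing might look, not the reporter's actual code. It only handles simple `FROM`/`JOIN` clauses and ignores comments, quoted identifiers and comma-separated table lists.

```python
import re

# Words that may follow a table name but are not aliases.
_NOT_ALIAS = (
    r"(?:AS|ON|WHERE|LEFT|RIGHT|INNER|OUTER|FULL|CROSS|JOIN|"
    r"GROUP|ORDER|HAVING|LIMIT|UNION|USING)\b"
)

# FROM/JOIN, then a table name, then an optional "[AS] alias".
_TABLE_PATTERN = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)"
    r"(?:\s+(?:AS\s+)?(?!" + _NOT_ALIAS + r")([A-Za-z_]\w*))?",
    re.IGNORECASE,
)


def find_table_aliases(sql):
    """Hypothetical helper: map each table name to its alias (or None)."""
    aliases = {}
    for table, alias in _TABLE_PATTERN.findall(sql):
        aliases[table] = alias or None
    return aliases


example_sql = """SELECT a.column1,
       a.column2,
       b.column3
FROM table1 AS a
    LEFT JOIN subtable AS b
        ON a.column1 = b.column1"""

print(find_table_aliases(example_sql))
```

A regular expression like this is fragile compared with a real SQL parser, which is one reason consistent formatting of the input matters for this kind of extraction.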