JakeWheat / simple-sql-parser

SQL parser in Haskell
http://jakewheat.github.io/simple-sql-parser/latest
Other
82 stars 29 forks source link

Bug: Brackets required in generated select statement. #57

Open hanjoosten opened 4 days ago

hanjoosten commented 4 days ago

I use this library to programmatically generate SQL. Recently, I am adding a feature in which I need to use recursive common table expressions. Fortunately, you support this with the With constructor of data type QueryExpr. However, if there are more than one elements in the qeViews list, the generated sql forgets to put brackets when used as subquery in a QueryExprSetOp.

Currently, the query is output as:

  select SrcA as src, TgtA as tgt
  from r
  where (SrcA = '_SRCATOM')
        and ((SrcA is not null)
             and (TgtA is not null))
  union distinct
  with recursive TheExpression as (/* EDcD r[A*A] */
                                   /*    Expression: r [A*A] */
                                   /*    Signature : [A*A] */
                                   select SrcA as src, TgtA as tgt
                                   from r
                                   where (SrcA is not null)
                                         and (TgtA is not null)),
       TransitiveClosure as (select src as src, tgt as tgt from TheExpression
                             union distinct
                             select TransitiveClosure.src as src,
                                    TheExpression.tgt as tgt
                             from TransitiveClosure,
                                  TheExpression
                             where TheExpression.src = TransitiveClosure.tgt)
  select src as src, tgt as tgt from TransitiveClosure where src = '_SRCATOM'
  ;

The brackets should be placed at the subquery that is the right hand side of the Union, like this:

  select SrcA as src, TgtA as tgt
  from r
  where (SrcA = '_SRCATOM')
        and ((SrcA is not null)
             and (TgtA is not null))
  union distinct
  ( with recursive TheExpression as (/* EDcD r[A*A] */
                                   /*    Expression: r [A*A] */
                                   /*    Signature : [A*A] */
                                   select SrcA as src, TgtA as tgt
                                   from r
                                   where (SrcA is not null)
                                         and (TgtA is not null)),
       TransitiveClosure as (select src as src, tgt as tgt from TheExpression
                             union distinct
                             select TransitiveClosure.src as src,
                                    TheExpression.tgt as tgt
                             from TransitiveClosure,
                                  TheExpression
                             where TheExpression.src = TransitiveClosure.tgt)
  select src as src, tgt as tgt from TransitiveClosure where src = '_SRCATOM'
  );
JakeWheat commented 2 days ago

Thanks for the report. I've done a little bit of looking into this.

The following two statements are valid in Postgres (I think at least the first one is ANSI SQL as well):

select * from t except (select * from u except select * from v);

(select * from t);

Neither of these currently parse with this library.

I propose to add an explicit QueryExprParens or similar to query expressions. Then you would have to use this to get parentheses around a nested CTE in your case. This sort of follows the approach with ScalarExprs and TableRefs. The alternative would be to have the parentheses implicit when the are mandatory for a CTE, and implicit when pretty printing this also.

Edit: as is usual, I thought of something to check only after posting this comment: the TRQueryExpr - which also has to have parentheses around it, has implicit parens in the syntax and pretty printing, so to be consistent with this would mean making the parentheses implicit in CTEs inside QueryExprSetOp also. So now I'm leaning towards that option (with the explicit QueryExprParens for the other cases).

hanjoosten commented 2 days ago

Hi Jake, Thanks for looking in to this. I am happy with the idea of adding an explicit QueryExprParens constructor for cases where it isn't obvious that parentheses are required. However, in the current cases where implicit parens are used, I wouldn't make them explicit too. That would likely cause breakage of applications that currently use your nice library to generate SQL.