oguimbal / pgsql-ast-parser

Yet another simple Postgres SQL parser
296 stars 41 forks source link

`INTERSECT` statements fail to parse #117

Open icacho opened 1 year ago

icacho commented 1 year ago

Using

import { parse, Statement } from 'pgsql-ast-parser';

const ast: Statement[] = parse(`
BEGIN TRANSACTION;
CREATE TABLE "foo" (
    "biz_name" character varying NOT NULL
);

CREATE TABLE "bar" (
    "biz_name" character varying NOT NULL
);

SELECT biz_name 
    FROM foo
INTERSECT 
SELECT biz_name 
    FROM bar;
`);

Fails with

Error: Syntax error at line 13 col 1:

  INTERSECT
  ^
Unexpected kw_intersect token: "intersect". Instead, I was expecting to see one of the following:

    - A "dot" token
    - A "dot" token
    - A "lparen" token
    - A "kw_as" token
    - A "kw_primary" token
    - A "kw_unique" token
    - A "quoted_word" token
    - A "word" token
    - A "kw_cross" token
    - A "kw_left" token
    - A "kw_right" token
    - A "kw_full" token
    - A "comma" token
    - A "kw_inner" token
    - A "kw_where" token
    - A "kw_join" token
    - A "kw_group" token
    - A "kw_order" token
    - A "kw_for" token
    - A "kw_offset" token
    - A "kw_limit" token
    - A "kw_fetch" token
    - A "kw_union" token
    - A "semicolon" token

    at Parser.feed (node_modules/nearley/lib/nearley.js:343:27)
    at _parse (node_modules/pgsql-ast-parser/index.js:1909:16)
    at doParse (node_modules/pgsql-ast-parser/index.js:1864:27)
    at Object.parse (node_modules/pgsql-ast-parser/index.js:1867:11)
    at Object.<anonymous> (src/test-syntax/index.js:4:30)
    at Module._compile (node:internal/modules/cjs/loader:1103:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12) {
  offset: 29,
  token: {
    type: 'kw_intersect',
    value: 'intersect',
    text: 'INTERSECT',
    toString: [Function: tokenToString],
    offset: 184,
    lineBreaks: 0,
    line: 13,
    col: 1
  }
}

Expected

image

https://onecompiler.com/postgresql/3ygqmf7we

nene commented 1 year ago

Stumbled upon this problem as well and discovered several other set-operations related problems.

INTERSECT & EXCEPT operators are not supported

FYI: INTERSECT should bind more strongly that UNION, while EXCEPT should bind at the same level as UNION.

Incorrect binding of UNION

Additionally discovered that UNION operator binding is incorrectly. For example parsing this SQL:

select * from foo UNION select * from bar UNION select * from baz

produces the following AST:

[{
  "type": "union",
  "left": {
    "type": "select",
    "columns": [ { "expr": { "type": "ref", "name": "*" } } ],
    "from": [ { "type": "table", "name": { "name": "foo" } } ]
  },
  "right": {
    "type": "union",
    "left": {
      "type": "select",
      "columns": [ { "expr": { "type": "ref", "name": "*" } } ],
      "from": [ { "type": "table", "name": { "name": "bar" } } ]
    },
    "right": {
      "type": "select",
      "columns": [ { "expr": { "type": "ref", "name": "*" } } ],
      "from": [ { "type": "table", "name": { "name": "baz" } } ]
    }
  }
}]

It should bind from left to right, treating the SQL as if it were parenthesized like so:

(select * from foo UNION select * from bar) UNION select * from baz

But instead it treats the SQL as if it were parenthesized like so:

select * from foo UNION (select * from bar UNION select * from baz)

ORDER BY and LIMIT are not allowed at the end of union

Parsing the following valid PostgreSQL code throws an error:

(select * from foo) UNION (select * from bar) ORDER BY name LIMIT 5

WITH clause before fully parenthesized union is not allowed

The following code parses just fine:

WITH p AS (select * from persons) (select * from p) UNION (select * from p)

But the following throws parse error:

WITH p AS (select * from persons) (select * from p UNION select * from p)

Though the root problem here seems to be that a parenthesized SELECT statement is not considered valid. This throws error:

(select * from p)