uber / queryparser

Parsing and analysis of Vertica, Hive, and Presto SQL.
MIT License
1.08k stars 147 forks source link

Fails to parse "CREATE VIEW" statements with "UNION" clause #48

Closed lukas-mi closed 5 years ago

lukas-mi commented 5 years ago

I've been trying out queryparser and stumbled upon an issue - parsing CREATE VIEW statements which has any kind of UNION clause causes parsing to fail. If CREATE VIEW is replaced by CREATE TABLE, parsing succeeds.

Queries:

CREATE SCHEMA test_schema;

CREATE TABLE test_schema.test_table_0 (a0 INT, b0 VARCHAR(10));
CREATE TABLE test_schema.test_table_1 (a1 INT, b1 VARCHAR(10));

CREATE VIEW test_schema.test_view_union AS
    SELECT
        a0 as a,
        b0 as b
    FROM test_schema.test_table_0
UNION
    SELECT
        a1 as a,
        b1 as b
    FROM test_schema.test_table_1;

Code used for parsing:

module Parsing
  ( parse
  , parseUnsafe
  , parseFileUnsafe
  ) where

import            Database.Sql.Position
import            Database.Sql.Util.Scope (runResolverWError)
import            Database.Sql.Type
import            Database.Sql.Vertica.Type (VerticaStatement(..), Vertica)
import qualified  Database.Sql.Vertica.Parser as VP

import qualified  Text.Parsec as P

import            Data.Proxy
import qualified  Data.ByteString.Lazy as BS
import qualified  Data.Text.Lazy.Encoding as TE
import qualified  Data.Text.Lazy as T

parse :: T.Text -> Either P.ParseError [VerticaStatement RawNames Range]
parse = VP.parseManyAll

parseUnsafe :: T.Text -> [VerticaStatement RawNames Range]
parseUnsafe sql =
  case parse sql of
    Right stmts -> stmts
    Left err -> error $ show err

parseFileUnsafe :: String -> IO [VerticaStatement RawNames Range]
parseFileUnsafe path = parseUnsafe . TE.decodeUtf8 <$> BS.readFile path

Produced error:

*** Exception: "-" (line 11, column 0):
unexpected (TokWord False "union",Position {positionLine = 11, positionColumn = 0, positionOffset = 279},Position {positionLine = 11, positionColumn = 5, positionOffset = 284})
expecting end of input
CallStack (from HasCallStack):
  error, called at /home/../Parsing.hs:27:17 in main:Parsing
h4v0kh3l1 commented 5 years ago

Yep, this looks like a bug. Vertica docs look like they support UNION in CREATE VIEW statements (link), but the vertica parser captures the non-union SELECT component (link).

I'm curious though, could you do a WITH CTE in a CREATE VIEW in Vertica? That could turn this into a more involved bugfix.

dlthomas commented 5 years ago

@h4v0kh3l1 One option, arguably less clean but maybe easier, is to over-parse and then complain (in the parent parser) about the forbidden cases.

lukas-mi commented 5 years ago

@h4v0kh3l1 WITH clause didn't help either. Works just fine on Vertica though. Query:

CREATE VIEW test_schema.test_table_union AS
    (WITH
         with_clause AS (
             SELECT
                 a0 AS a,
                 b0 AS b
             FROM test_schema.test_table_0
             UNION
             SELECT
                 a1 AS a,
                 b1 AS b
             FROM test_schema.test_table_1)
    SELECT * FROM with_clause);

Error:

*** Exception: "-" (line 1, column 7):
unexpected TokWord False "view"
CallStack (from HasCallStack):
  error, called at /home/../Parsing.hs:27:17 in main:Parsing
lukas-mi commented 5 years ago

@h4v0kh3l1 I guess queryP can be reused since all Query constructors seem to be supported by Vertica in CREATE VIEW statements.