auxten / postgresql-parser

Pure Golang PostgreSQL (SQL:2011, SQL:2008, SQL:2003, SQL:1999, and SQL-92 Standard) Parser
Apache License 2.0

What's this

A PostgreSQL-style parser split out of CockroachDB

See: Complex SQL format example

I tried to import github.com/cockroachdb/cockroach/pkg/sql/parser, but its dependencies are too complex to make it work.

To make it usable on its own, I did the following:

  1. Copied all of pkg/sql/parser and pkg/sql/lex, and simplified their dependencies
  2. Simplified the Makefile to generate only the goyacc output
  3. Added the goyacc-generated files in parser and lex so that go get works out of the box; see the .gitignore files
  4. Trimmed the etcd dependency; see go.mod
  5. Renamed all test files except some pkg/sql/parser tests
  6. Added all necessary imports to vendor
  7. Removed the panic on encountering unregistered functions; see WrapFunction
  8. Other nasty things I forgot that make the parser just work :p
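
Step 7 is easier to picture with a toy example. The sketch below is not the project's WrapFunction code; FuncDef, registry, strictLookup and lenientLookup are hypothetical names invented for illustration. It only contrasts a lookup that panics on an unknown function name with one that hands back a placeholder so parsing can continue.

```go
package main

import "fmt"

// FuncDef is a hypothetical stand-in for a SQL function definition.
type FuncDef struct {
	Name       string
	Registered bool
}

// registry is a hypothetical table of known built-in functions.
var registry = map[string]*FuncDef{
	"lower": {Name: "lower", Registered: true},
	"upper": {Name: "upper", Registered: true},
}

// strictLookup panics when the function name is unknown.
func strictLookup(name string) *FuncDef {
	def, ok := registry[name]
	if !ok {
		panic(fmt.Sprintf("function %s() not defined", name))
	}
	return def
}

// lenientLookup returns an unregistered placeholder instead of panicking,
// so a caller can still build a tree for SQL that uses custom functions.
func lenientLookup(name string) *FuncDef {
	if def, ok := registry[name]; ok {
		return def
	}
	return &FuncDef{Name: name, Registered: false}
}

func main() {
	for _, name := range []string{"lower", "my_custom_udf"} {
		def := lenientLookup(name)
		fmt.Printf("%s registered=%v\n", def.Name, def.Registered)
	}
}
```

With the lenient variant, a statement that calls a user-defined function can still be turned into a tree, and the caller decides later what to do with the unregistered name.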

Who is using this

Features

SQL Standard Compliance

The code is derived from CockroachDB v20.1.11 which supports most of the major features of SQL:2011. See:

How to use

package main

import (
    "log"

    "github.com/auxten/postgresql-parser/pkg/sql/parser"
    "github.com/auxten/postgresql-parser/pkg/walk"
)

func main() {
    sql := `select marr
            from (select marr_stat_cd AS marr, label AS l
                  from root_loan_mock_v4
                  order by root_loan_mock_v4.age desc, l desc
                  limit 5) as v4
            LIMIT 1;`
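    // Fn is called for each visited AST node; returning false keeps walking.
    w := &walk.AstWalker{
        Fn: func(ctx interface{}, node interface{}) (stop bool) {
            log.Printf("node type %T", node)
            return false
        },
    }

    stmts, err := parser.Parse(sql)
    if err != nil {
        // Report the parse error instead of exiting silently.
        log.Fatalf("parse failed: %v", err)
    }

    _, _ = w.Walk(stmts, nil)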
}

SQL parser

This project contains code that is automatically generated by goyacc. goyacc reads the SQL grammar (sql.y) and generates a parser that can be used to parse a given input. You can regenerate that code with the generate target in the project's Makefile.

$ make generate

Progress

Todo