iafisher / sqliteparser

A parser for SQLite's dialect of SQL
https://sqliteparser.readthedocs.io/en/latest/
MIT License
1 stars 1 forks source link

verbatim option for expressions in generated columns and check constraints #10

Open luxint opened 2 years ago

luxint commented 2 years ago

Because sqlite allows for user definable functions at runtime you need to be able to return expressions 'verbatim' without parsing them as an ast. Also not all syntax constructs are parsed at the moment (for instance CASe expr WHEN exp ...) it is helpfull if the parser returns expressions in generated colums and check constraints 'as is'.

After some trial and error ;-) i came up with this in parser.py,

` @debuggable def match_expression(self, precedence: int = -1, verbatim = False) -> ast.Expression:

    if verbatim:
        level = 0
        start_index = self.lexer.index - len(self.lexer.current_token.value)
        literal_expression = None
        while True:

            if self.lexer.current_token.type == TokenType.RIGHT_PARENTHESIS:
                if level == 0:    
                    literal_expression = ast.String(self.lexer.program[start_index:self.lexer.index-1])        
                    break

                else:
                    level -= 1
                    if level < 0:
                        raise SQLiteParserError('unbalanced parenthesis')

            elif self.lexer.current_token.type == TokenType.LEFT_PARENTHESIS:
                level += 1            

            self.lexer.advance()
            if self.lexer.done():
                raise SQLiteParserError('unbalanced parenthesis')            

        return literal_expression    

    `
iafisher commented 2 years ago

Could you give an example of a user-definable function?

luxint commented 2 years ago

Well for instance I'm planning to use parse as a user definable function in sqlite! It would return the ast of the sql create statement in json. You can get a list of all the functions with :

PRAGMA function_list;

This pragma returns a list of SQL functions known to the database connection. Each row of the result describes a single calling signature for a single SQL function. Some SQL functions will have multiple rows in the result set if they can (for example) be invoked with a varying number of arguments or can accept text in various encoding

So another parameter you could pass to parse is this list to recognise functions.

Besides this are you planning to augment expression evaluation with all the possibilities? This is quite a hairy beast since also for instance fun blown select statements are allowed in this (but not with all the possibilities in expressions after default...)

I'm fine with the return of the literal expression, which actually has a bug in it as shown above, the calculation of the start_index is wrong....

iafisher commented 2 years ago

But the syntax for calling user-defined functions is the same, isn't it? Do you have a specific SQL snippet that sqliteparser can't parse?

It doesn't matter to sqliteparser where/how a function is defined, or even if it is defined at all. The library doesn't try to validate the semantic correctness of the SQL, it just parses the syntactic form.

luxint commented 2 years ago

Ah ok, yes the syntax is the same. What sqliteparser couldn't parse is a case expression in an expression like: CREATE TABLE "~usertenant" ( "user" TEXT NOT NULL, "tenant" TEXT NOT NULL, "is_admin" INT, "is_sys" INT DEFAULT 0, "allow" AS (CASE WHEN is_admin = 1 THEN 'a' ELSE '' END || CASE WHEN is_sys = 1 then 's'ELSE '' END) STORED);

iafisher commented 2 years ago

I filed #11 to track support for CASE expressions. I would rather just finish off the parser than add a verbatim option.

luxint commented 2 years ago

ok, i've got a pull request for verbatim ready if you're interested.

iafisher commented 2 years ago

Sorry, I must have missed the notification about your PR. I'd prefer simply supporting whatever additional syntax is needed rather than adding a verbatim option. Looking at the diagram at https://sqlite.org/lang_expr.html I don't think there's too much left that sqliteparser can't handle.

It looks like your PR also adds support for some more CREATE statements. I'd be happy to take a look at those changes if you're still interested in pursuing them.

luxint commented 2 years ago

Hi,

yeah i've added stubs for all the create.... statements

What i've implemented so far :

  1. create index.. statement
  2. unique - table level constraint
  3. strict - table option
  4. table level check constraints

What is you opinion about being more flexible with identifiers and string literals, sqlite documentation says this: _For resilience when confronted with historical SQL statements, SQLite will sometimes bend the quoting rules above:

If a keyword in single quotes (ex: 'key' or 'glob') is used in a context where an identifier is allowed but where a string literal is not allowed, then the token is understood to be an identifier instead of a string literal.

If a keyword in double quotes (ex: "key" or "glob") is used in a context where it cannot be resolved to an identifier but where a string literal is allowed, then the token is understood to be a string literal instead of an identifier._

iafisher commented 2 years ago

That's great. Re: identifiers/string literals, I think it would be good for sqliteparser to match SQLite's behavior, but I wouldn't say it's high priority.