Closed remingtonc closed 3 days ago
Hi, unfortunately the plpgsql parser is not yet fully implemented, so I'm afraid it is of little utility as of now. I'm enjoying a short vacation so I cannot try your example: I will do so once I get back.
As the related issue is now closed, I will try to get to this soon.
The parse error is now fixed in the just-released v3.6, but I'm leaving this open to remind me of a possible clarification in the docs about other issues:
PLpgSQL is still vaporware
@lelit I am confused here as well about when to use either of the two. Can you please clarify?
I will try to clarify this in the documentation.
In the meantime, expanding on what I said in a previous comment above, the parse_plpgsql function is severely underpowered: while it properly executes the parsing of the statement as PostgreSQL would, what it returns is little more than a raw sequence of tokens, not an AST like the one parse_sql returns.
Consider the following examples: they parse the same statement, first with parse_plpgsql and then with parse_sql.
from pprint import pprint
from pglast import parse_plpgsql
STMT = """\
CREATE FUNCTION add (a integer, b integer)
RETURNS integer AS $$
BEGIN
RETURN a + b;
END;
$$ LANGUAGE plpgsql
"""
as_plpgsql = parse_plpgsql(STMT)
pprint(as_plpgsql)
This prints out
[{'PLpgSQL_function': {'action': {'PLpgSQL_stmt_block': {'body': [{'PLpgSQL_stmt_return': {'expr': {'PLpgSQL_expr': {'parseMode': 2,
'query': 'a '
'+ '
'b'}},
'lineno': 1}}],
'lineno': 1}},
'datums': [{'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
'refname': 'a'}},
{'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
'refname': 'b'}},
{'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
'refname': 'found'}}]}}]
that, as you can see, is just a list of plain Python dictionaries.
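Since the result is plain dictionaries and lists, it can at least be walked with ordinary Python. The following is only a rough sketch, not part of pglast: `collect_queries` is a hypothetical helper name, and the `sample` literal is copied by hand from the output above. It gathers the SQL fragments stored under each `PLpgSQL_expr` entry.

```python
def collect_queries(node):
    """Yield the 'query' string of every PLpgSQL_expr found in the structure.

    Hypothetical helper: it only assumes the nested dict/list shape
    shown in the output above.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'PLpgSQL_expr' and isinstance(value, dict) and 'query' in value:
                yield value['query']
            else:
                yield from collect_queries(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from collect_queries(item)


# Hand-copied from the printed result above (not produced by calling pglast here).
sample = [
    {'PLpgSQL_function': {
        'action': {'PLpgSQL_stmt_block': {
            'body': [{'PLpgSQL_stmt_return': {
                'expr': {'PLpgSQL_expr': {'parseMode': 2, 'query': 'a + b'}},
                'lineno': 1}}],
            'lineno': 1}},
        'datums': [
            {'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
                             'refname': 'a'}},
            {'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
                             'refname': 'b'}},
            {'PLpgSQL_var': {'datatype': {'PLpgSQL_type': {'typname': 'UNKNOWN'}},
                             'refname': 'found'}}]}}
]

queries = list(collect_queries(sample))
print(queries)
```

Each collected fragment is still an unparsed string, which is exactly the limitation described here.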
If you use parse_sql instead, you obtain a richer representation of the statement:
from pprint import pprint
from pglast import parse_sql
STMT = """\
CREATE FUNCTION add (a integer, b integer)
RETURNS integer AS $$
BEGIN
RETURN a + b;
END;
$$ LANGUAGE plpgsql
"""
as_sql = parse_sql(STMT)
pprint([stmt(skip_none=True) for stmt in as_sql])
This emits
[{'@': 'RawStmt',
'stmt': {'@': 'CreateFunctionStmt',
'funcname': ({'@': 'String', 'sval': 'add'},),
'is_procedure': False,
'options': ({'@': 'DefElem',
'arg': ({'@': 'String',
'sval': '\nBEGIN \n RETURN a + b;\nEND;\n'},),
'defaction': {'#': 'DefElemAction',
'name': 'DEFELEM_UNSPEC',
'value': 0},
'defname': 'as',
'location': 59},
{'@': 'DefElem',
'arg': {'@': 'String', 'sval': 'plpgsql'},
'defaction': {'#': 'DefElemAction',
'name': 'DEFELEM_UNSPEC',
'value': 0},
'defname': 'language',
'location': 96}),
'parameters': ({'@': 'FunctionParameter',
'argType': {'@': 'TypeName',
'location': 23,
'names': ({'@': 'String',
'sval': 'pg_catalog'},
{'@': 'String',
'sval': 'int4'}),
'pct_type': False,
'setof': False,
'typemod': -1},
'mode': {'#': 'FunctionParameterMode',
'name': 'FUNC_PARAM_DEFAULT',
'value': 'd'},
'name': 'a'},
{'@': 'FunctionParameter',
'argType': {'@': 'TypeName',
'location': 34,
'names': ({'@': 'String',
'sval': 'pg_catalog'},
{'@': 'String',
'sval': 'int4'}),
'pct_type': False,
'setof': False,
'typemod': -1},
'mode': {'#': 'FunctionParameterMode',
'name': 'FUNC_PARAM_DEFAULT',
'value': 'd'},
'name': 'b'}),
'replace': False,
'returnType': {'@': 'TypeName',
'location': 51,
'names': ({'@': 'String', 'sval': 'pg_catalog'},
{'@': 'String', 'sval': 'int4'}),
'pct_type': False,
'setof': False,
'typemod': -1}},
'stmt_len': 0,
'stmt_location': 0}]
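On the question of deciding per statement which parser applies, the serialized form above already carries the function's language in its `options`. The sketch below is an assumption-laden illustration, not a pglast feature: `function_language` and `pick_parser` are hypothetical names, and `sample` is a trimmed, hand-written copy of the dictionary printed above. It reads the `language` option and suggests which function to try.

```python
def function_language(stmt):
    """Return the lower-cased LANGUAGE of a serialized CreateFunctionStmt, else None.

    Hypothetical helper working on the plain-dict form printed above.
    """
    inner = stmt.get('stmt', stmt)  # unwrap RawStmt if present
    if not isinstance(inner, dict) or inner.get('@') != 'CreateFunctionStmt':
        return None
    for option in inner.get('options', ()):
        arg = option.get('arg')
        if option.get('defname') == 'language' and isinstance(arg, dict):
            return arg.get('sval', '').lower() or None
    return None


def pick_parser(stmt):
    """Suggest 'parse_plpgsql' for PL/pgSQL functions, 'parse_sql' otherwise."""
    return 'parse_plpgsql' if function_language(stmt) == 'plpgsql' else 'parse_sql'


# Trimmed, hand-written copy of the structure shown above.
sample = {
    '@': 'RawStmt',
    'stmt': {
        '@': 'CreateFunctionStmt',
        'funcname': ({'@': 'String', 'sval': 'add'},),
        'options': (
            {'@': 'DefElem', 'defname': 'as',
             'arg': ({'@': 'String', 'sval': '\nBEGIN \n RETURN a + b;\nEND;\n'},)},
            {'@': 'DefElem', 'defname': 'language',
             'arg': {'@': 'String', 'sval': 'plpgsql'}},
        ),
    },
}

print(function_language(sample), pick_parser(sample))
```

Note that, going by the report in this thread, parse_plpgsql should be given the whole CREATE FUNCTION statement: passing only the inner body string did not work.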
The parse_plpgsql function is meant to parse more complex procedural language statements, containing PG extensions to the SQL language such as loops, conditions and the like. But for now it is of little utility, until someone implements a proper AST for it, either at the lower libpg_query level or in pglast.
This should be fixed by the referenced commit, in the upcoming v6.5.
Hello! Nice project! I am trying to determine whether this library could be used to parse many SQL statements to form an AST of a schema, or at least to have the per-statement AST parsed in useful ways to formulate things about the schema. I am uncertain of the difference between the parse_sql and parse_plpgsql functions and how to use them correctly. An example...
Using parse_plpgsql fails:
Using parse_sql here works, but yields a big unparsed blob:
Trying to parse that inner blob as plpgsql similarly does not work:
Any tips?
Further, is there any heuristic that avoids having to explicitly declare "this function should be parsed as SQL" versus "this function should be parsed as PL/pgSQL", or do I need to decide that per statement?