apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
60.36k stars 13.02k forks source link

Trino SQL routines with multiple `;` characters cannot run on Superset SQL Lab Editor #26162

Open kevinjqliu opened 7 months ago

kevinjqliu commented 7 months ago

Superset SQL Lab Editor cannot run Trino SQL routines which have multiple ; statements.

How to reproduce the bug

  1. Go to 'SQL Lab' (http://localhost:8088/superset/sqllab)
  2. Run a Trino SQL Routine with multiple ; characters, such as this one from example docs
    WITH FUNCTION simple_case(a bigint)
    RETURNS varchar
    BEGIN
    CASE a
    WHEN 0 THEN RETURN 'zero';
    WHEN 1 THEN RETURN 'one';
    WHEN 10 THEN RETURN 'ten';
    WHEN 20 THEN RETURN 'twenty';
    ELSE RETURN 'other';
    END CASE;
    RETURN NULL;
    END
    SELECT simple_case(0);
  3. Sees error emssage

Expected results

SQL Routine to run successfully and return 0

Actual results

Website error message:

Trino Error
trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'

This may be triggered by:
Issue 1002 - The database returned an unexpected error. 

Superset Error message:

2023-12-01 12:52:33 Triggering query_id: 41
2023-12-01 12:52:33 2023-12-01 20:52:33,077:INFO:superset.sqllab.commands.execute:Triggering query_id: 41
2023-12-01 12:52:33 SQLite Database support for metadata databases will be removed             in a future version of Superset.
2023-12-01 12:52:33 2023-12-01 20:52:33,098:WARNING:superset.utils.celery:SQLite Database support for metadata databases will be removed             in a future version of Superset.
2023-12-01 12:52:33 Query 41: Executing 8 statement(s)
2023-12-01 12:52:33 2023-12-01 20:52:33,113:INFO:superset.sql_lab:Query 41: Executing 8 statement(s)
2023-12-01 12:52:33 Query 41: Set query to 'running'
2023-12-01 12:52:33 2023-12-01 20:52:33,113:INFO:superset.sql_lab:Query 41: Set query to 'running'
2023-12-01 12:52:33 Query 41: Running statement 1 out of 8
2023-12-01 12:52:33 2023-12-01 20:52:33,126:INFO:superset.sql_lab:Query 41: Running statement 1 out of 8
2023-12-01 12:52:33 2023-12-01 20:52:33,152:DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): host.docker.internal:8080
2023-12-01 12:52:33 2023-12-01 20:52:33,163:DEBUG:urllib3.connectionpool:http://host.docker.internal:8080 "POST /v1/statement HTTP/1.1" 200 328
2023-12-01 12:52:33 2023-12-01 20:52:33,176:DEBUG:urllib3.connectionpool:http://host.docker.internal:8080 "GET /v1/statement/queued/20231201_205233_00062_fcdsq/y76761f7eb08c441e8c30f713ad964721e8eb9a87/1 HTTP/1.1" 200 1063
2023-12-01 12:52:33 SupersetErrorsException
2023-12-01 12:52:33 Traceback (most recent call last):
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
2023-12-01 12:52:33     rv = self.dispatch_request()
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
2023-12-01 12:52:33     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 127, in wraps
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 121, in wraps
2023-12-01 12:52:33     duration, response = time_function(f, self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/core.py", line 1526, in time_function
2023-12-01 12:52:33     response = func(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 93, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/log.py", line 255, in wrapper
2023-12-01 12:52:33     value = f(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/sqllab/api.py", line 310, in execute_sql_query
2023-12-01 12:52:33     command_result: CommandResult = command.run()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 121, in run
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 103, in run
2023-12-01 12:52:33     status = self._run_sql_json_exec_from_scratch()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     return self._sql_json_executor.execute(
2023-12-01 12:52:33   File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
2023-12-01 12:52:33     raise SupersetErrorsException(
2023-12-01 12:52:33 superset.exceptions.SupersetErrorsException: [SupersetError(message="trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'", error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Trino', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2023-12-01 12:52:33 2023-12-01 20:52:33,191:WARNING:superset.views.base:SupersetErrorsException
2023-12-01 12:52:33 Traceback (most recent call last):
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
2023-12-01 12:52:33     rv = self.dispatch_request()
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
2023-12-01 12:52:33     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
2023-12-01 12:52:33   File "/usr/local/lib/python3.9/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 127, in wraps
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 121, in wraps
2023-12-01 12:52:33     duration, response = time_function(f, self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/core.py", line 1526, in time_function
2023-12-01 12:52:33     response = func(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/views/base_api.py", line 93, in wraps
2023-12-01 12:52:33     return f(self, *args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/utils/log.py", line 255, in wrapper
2023-12-01 12:52:33     value = f(*args, **kwargs)
2023-12-01 12:52:33   File "/app/superset/sqllab/api.py", line 310, in execute_sql_query
2023-12-01 12:52:33     command_result: CommandResult = command.run()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 121, in run
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 103, in run
2023-12-01 12:52:33     status = self._run_sql_json_exec_from_scratch()
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     raise ex
2023-12-01 12:52:33   File "/app/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
2023-12-01 12:52:33     return self._sql_json_executor.execute(
2023-12-01 12:52:33   File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
2023-12-01 12:52:33     raise SupersetErrorsException(
2023-12-01 12:52:33 superset.exceptions.SupersetErrorsException: [SupersetError(message="trino error: line 5:30: mismatched input '<EOF>'. Expecting: '%', '*', '+', '-', '.', '/', ';', 'AT', '[', '||'", error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Trino', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2023-12-01 12:52:33 192.168.65.1 - - [01/Dec/2023:20:52:33 +0000] "POST /api/v1/sqllab/execute/ HTTP/1.1" 500 337 "http://localhost:8088/superset/sqllab" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
2023-12-01 12:52:33 192.168.65.1 - - [01/Dec/2023:20:52:33 +0000] "PUT /tabstateview/4 HTTP/1.1" 200 1 "http://localhost:8088/superset/sqllab" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

Screenshots

Screenshot 2023-12-01 at 12 51 39 PM

Environment

(please complete the following information):

Checklist

Make sure to follow these steps before submitting your issue - thank you!

Additional context

I was able to track this down to the SQL statement parsing logic. Specifically, the db_engine_spec.run_multiple_statements_as_one variable is used to determine whether to parse SQL with ; characters into multiple SQL statements. The TrinoEngineSpec and PrestoBaseEngineSpec both do not include this variable. So run_multiple_statements_as_one is set to False by the BaseEngineSpec.

I was able to set run_multiple_statements_as_one to True in TrinoEngineSpec and got the SQL above to work. However, this changes the default behavior of SQL Lab for Trino which allows running multiple SQL statements separated by ;.

mosabua commented 7 months ago

Thank you for filing this @kevinjqliu !

DBeaver has a similar problem. I expect multiple clients will fall over this problem since they assume ; to be the end of a statement..

For functions however it would be the END or the function for inline function (see your example

or for CREATE FUNCTION usage the END;

If anyone needs help with this logic.. ping me

https://github.com/dbeaver/dbeaver/issues/21808

mosabua commented 6 months ago

Btw here is the logic to figure out the details from the Trino CLI https://github.com/trinodb/trino/blob/master/client/trino-cli/src/main/java/io/trino/cli/lexer/StatementSplitter.java

rusackas commented 2 months ago

@betodealmeida do you think the sqlglot migration would help at all with this sort of issue?

kevinjqliu commented 2 months ago

Seems like sqlglot can help solve this issue.

It doesn't work out of the box though.

from sqlglot import parse

sql = """FUNCTION simple_case(a bigint)
  RETURNS varchar
  BEGIN
    CASE a
      WHEN 0 THEN RETURN 'zero';
      WHEN 1 THEN RETURN 'one';
      ELSE RETURN 'more than one or negative';
    END CASE;
    RETURN NULL;
  END
""" 

parse(sql, dialect='trino')

Returns an error

sqlglot.errors.ParseError: Invalid expression / Unexpected token. Line 1, Col: 21.
  FUNCTION simple_case(a bigint)
  RETURNS varchar
  BEGIN
    CASE a
      WHEN 0 THEN RETURN 'zero';
      WHEN 1 THEN RE
betodealmeida commented 2 months ago

Even if this is not supported by sqlglot it should work in Superset as is — at least if DML is enabled. Let me take a look.