apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
63.07k stars 13.96k forks source link

Superset SQL parser does not recognize FETCH FIRST clause causing syntax error when appending LIMIT clause #27427

Open hab6 opened 8 months ago

hab6 commented 8 months ago

Bug description

In sql_parse.py, the Superset SQL parser is able to recognize and handle row limiting clauses that use keywords LIMIT and TOP but does not handle the FETCH FIRST clause, which is used by some databases, including Ingres and PostgreSQL. With Ingres, the FETCH FIRST clause is required if using the OFFSET keyword in a SQL SELECT statement.

If an [OFFSET] FETCH FIRST clause is used in a SQL statement, the Superset parser still appends a LIMIT clause, which makes the SQL syntactically invalid.

The problem can be worked around by setting SQL_MAX_ROW=0 in superset_config.py, but this requires restarting Superset and doesn't fit well if SQL_MAX_ROW needs to be set to a value greater than zero for other reasons.

I found the following two older issues that reported similar problems using DB2 and Teradata, but they don't seem to have addressed the issue that still exists when using a FETCH FIRST clause.

5063 and #9600

How to reproduce the bug

The problem can be reproduced using Superset SQL Lab with an Ingres or Postgres database by trying to execute an adhoc SQL statement such as SELECT * FROM myschema.people OFFSET 5 FETCH FIRST 3 ROWS ONLY in SQL Lab. Superset reports an error because the SQL parser appends a LIMIT clause, which makes the SQL syntactically invalid:

SELECT * FROM myschema.people OFFSET 5 FETCH FIRST 3 ROWS ONLY LIMIT 11
SupersetErrorsException
[42000] [Actian][Actian ODBC Driver][INGRES]line 2,
Syntax error on \'LIMIT\' at or near \' fetch first 3 rows only\\nLIMIT\' ...

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.10

Node version

I don't know

Browser

Not applicable

Additional context

Stack trace

2024-03-07 16:33:34,277:INFO:werkzeug:10.250.88.252 - - [07/Mar/2024 16:33:34] "PUT /tabstateview/1 HTTP/1.1" 200 -
Triggering query_id: 209
2024-03-07 16:33:35,185:INFO:superset.commands.sql_lab.execute:Triggering query_id: 209
SQLite Database support for metadata databases will be removed             in a future version of Superset.
2024-03-07 16:33:35,200:WARNING:superset.utils.celery:SQLite Database support for metadata databases will be removed             in a future version of Superset.
Query 209: Executing 1 statement(s)
2024-03-07 16:33:35,207:INFO:superset.sql_lab:Query 209: Executing 1 statement(s)
Query 209: Set query to 'running'
2024-03-07 16:33:35,207:INFO:superset.sql_lab:Query 209: Set query to 'running'
Query 209: Running statement 1 out of 1
2024-03-07 16:33:35,292:INFO:superset.sql_lab:Query 209: Running statement 1 out of 1
2024-03-07 16:33:35,311:DEBUG:root:Query 209: Running query: SELECT * from actian.people offset 5 fetch first 3 rows only
LIMIT 11
SupersetErrorsException
Traceback (most recent call last):
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 127, in wraps
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 121, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/utils/core.py", line 1463, in time_function
    response = func(*args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 93, in wraps
    return f(self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/utils/log.py", line 255, in wrapper
    value = f(*args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/sqllab/api.py", line 407, in execute_sql_query
    command_result: CommandResult = command.run()
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 121, in run
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 103, in run
    status = self._run_sql_json_exec_from_scratch()
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 161, in _run_sql_json_exec_from_scratch
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 156, in _run_sql_json_exec_from_scratch
    return self._sql_json_executor.execute(
  File "/home/test13536/apache-superset-3.1.1/superset/sqllab/sql_json_executer.py", line 111, in execute
    raise SupersetErrorsException(
superset.exceptions.SupersetErrorsException: [SupersetError(message='base error: (\'42000\', "[42000] [Actian][Actian ODBC Driver][INGRES]line 2, Syntax error on \'LIMIT\' at or near \' fetch first 3 rows only\\nLIMIT\'.  The correct syntax is: \\n SELECT [ALL|DISTINCT] target_list \\n  FROM table(s) \\n  [WHERE search_cond] \\n  [GROUP BY col(s)] \\n  [HAVING search_cond] \\n  [WINDOW window_defn] \\n [UNION subselect] \\n [ORDER BY col(s)] (2501) (SQLExecDirectW)")', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': None, 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2024-03-07 16:33:35,329:WARNING:superset.views.base:SupersetErrorsException
Traceback (most recent call last):
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/test13536/.venv/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 127, in wraps
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 121, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/utils/core.py", line 1463, in time_function
    response = func(*args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/views/base_api.py", line 93, in wraps
    return f(self, *args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/utils/log.py", line 255, in wrapper
    value = f(*args, **kwargs)
  File "/home/test13536/apache-superset-3.1.1/superset/sqllab/api.py", line 407, in execute_sql_query
    command_result: CommandResult = command.run()
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 121, in run
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 103, in run
    status = self._run_sql_json_exec_from_scratch()
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 161, in _run_sql_json_exec_from_scratch
    raise ex
  File "/home/test13536/apache-superset-3.1.1/superset/commands/sql_lab/execute.py", line 156, in _run_sql_json_exec_from_scratch
    return self._sql_json_executor.execute(
  File "/home/test13536/apache-superset-3.1.1/superset/sqllab/sql_json_executer.py", line 111, in execute
    raise SupersetErrorsException(
superset.exceptions.SupersetErrorsException: [SupersetError(message='base error: (\'42000\', "[42000] [Actian][Actian ODBC Driver][INGRES]line 2, Syntax error on \'LIMIT\' at or near \' fetch first 3 rows only\\nLIMIT\'.  The correct syntax is: \\n SELECT [ALL|DISTINCT] target_list \\n  FROM table(s) \\n  [WHERE search_cond] \\n  [GROUP BY col(s)] \\n  [HAVING search_cond] \\n  [WINDOW window_defn] \\n [UNION subselect] \\n [ORDER BY col(s)] (2501) (SQLExecDirectW)")', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': None, 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]

Checklist

rusackas commented 4 months ago

I see a couple linked/merged PRs here. Is this still an issue? If so, maybe @betodealmeida can weigh in here, since this might be resolved by the SQLParse/SQLGlot migration.