apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Can't save Dataset with params #29996

Closed kalimalrazif closed 3 months ago

kalimalrazif commented 3 months ago

Bug description

Can't save a Druid query as a dataset when the query has params.

How to reproduce the bug

Enable SQL templating

Create a query with params:

SELECT *
FROM "sources_v3"
WHERE "interface_tag_string" like {{ tag }}

Run the query; it works perfectly fine

Try to save a dataset
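
For the first step, SQL templating is normally switched on through a feature flag in superset_config.py. A minimal sketch, assuming an otherwise default configuration (any other flags your deployment already sets would sit in the same dictionary):

```python
# superset_config.py (fragment)
# Assumption: no other feature flags are needed for this reproduction.
FEATURE_FLAGS = {
    # Allows Jinja expressions such as {{ tag }} inside SQL Lab queries.
    "ENABLE_TEMPLATE_PROCESSING": True,
}
```

Superset has to be restarted after changing this file for the flag to take effect.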

Screenshots/recordings

The dataset never saves; it keeps trying to save.

Superset version

4.0.2

Python version

Not applicable

Node version

Not applicable

Browser

Chrome

Additional context

This is the log output:

superset_app          | 2024-08-22 18:13:43,240:DEBUG:urllib3.connectionpool:http://192.168.122.91:8888 "POST /druid/v2/sql HTTP/1.1" 400 7962
superset_app          | 2024-08-22 18:13:43,242:ERROR:superset.commands.dataset.refresh:'errorClass'
superset_app          | Traceback (most recent call last):
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 147, in get_columns_description
superset_app          |     cursor.execute(query)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 62, in g
superset_app          |     return f(self, *args, **kwargs)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 256, in execute
superset_app          |     first_row = next(results)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 362, in _stream_query
superset_app          |     msg = "{error} ({errorClass}): {errorMessage}".format(**payload)
superset_app          | KeyError: 'errorClass'
superset_app          | 
superset_app          | The above exception was the direct cause of the following exception:
superset_app          | 
superset_app          | Traceback (most recent call last):
superset_app          |   File "/app/superset/commands/dataset/refresh.py", line 45, in run
superset_app          |     self._model.fetch_metadata()
superset_app          |   File "/app/superset/connectors/sqla/models.py", line 1828, in fetch_metadata
superset_app          |     new_columns = self.external_metadata()
superset_app          |   File "/app/superset/connectors/sqla/models.py", line 1320, in external_metadata
superset_app          |     return get_virtual_table_metadata(dataset=self)
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 132, in get_virtual_table_metadata
superset_app          |     return get_columns_description(dataset.database, dataset.schema, statements[0])
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 153, in get_columns_description
superset_app          |     raise SupersetGenericDBErrorException(message=str(ex)) from ex
superset_app          | superset.exceptions.SupersetGenericDBErrorException: 'errorClass'
superset_app          | 2024-08-22 18:13:43,245:DEBUG:superset.stats_logger:[stats_logger] (incr) DatasetRestApi.put.error
superset_app          | 2024-08-22 18:13:43,245:WARNING:superset.views.base:CommandException
superset_app          | Traceback (most recent call last):
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 147, in get_columns_description
superset_app          |     cursor.execute(query)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 62, in g
superset_app          |     return f(self, *args, **kwargs)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 256, in execute
superset_app          |     first_row = next(results)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/pydruid/db/api.py", line 362, in _stream_query
superset_app          |     msg = "{error} ({errorClass}): {errorMessage}".format(**payload)
superset_app          | KeyError: 'errorClass'
superset_app          | 
superset_app          | The above exception was the direct cause of the following exception:
superset_app          | 
superset_app          | Traceback (most recent call last):
superset_app          |   File "/app/superset/commands/dataset/refresh.py", line 45, in run
superset_app          |     self._model.fetch_metadata()
superset_app          |   File "/app/superset/connectors/sqla/models.py", line 1828, in fetch_metadata
superset_app          |     new_columns = self.external_metadata()
superset_app          |   File "/app/superset/connectors/sqla/models.py", line 1320, in external_metadata
superset_app          |     return get_virtual_table_metadata(dataset=self)
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 132, in get_virtual_table_metadata
superset_app          |     return get_columns_description(dataset.database, dataset.schema, statements[0])
superset_app          |   File "/app/superset/connectors/sqla/utils.py", line 153, in get_columns_description
superset_app          |     raise SupersetGenericDBErrorException(message=str(ex)) from ex
superset_app          | superset.exceptions.SupersetGenericDBErrorException: 'errorClass'
superset_app          | 
superset_app          | The above exception was the direct cause of the following exception:
superset_app          | 
superset_app          | Traceback (most recent call last):
superset_app          |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1823, in full_dispatch_request
superset_app          |     rv = self.dispatch_request()
superset_app          |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1799, in dispatch_request
superset_app          |     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
superset_app          |   File "/usr/local/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
superset_app          |     return f(self, *args, **kwargs)
superset_app          |   File "/app/superset/views/base_api.py", line 127, in wraps
superset_app          |     raise ex
superset_app          |   File "/app/superset/views/base_api.py", line 121, in wraps
superset_app          |     duration, response = time_function(f, self, *args, **kwargs)
superset_app          |   File "/app/superset/utils/core.py", line 1470, in time_function
superset_app          |     response = func(*args, **kwargs)
superset_app          |   File "/app/superset/utils/log.py", line 255, in wrapper
superset_app          |     value = f(*args, **kwargs)
superset_app          |   File "/app/superset/views/base_api.py", line 93, in wraps
superset_app          |     return f(self, *args, **kwargs)
superset_app          |   File "/app/superset/datasets/api.py", line 409, in put
superset_app          |     RefreshDatasetCommand(pk).run()
superset_app          |   File "/app/superset/commands/dataset/refresh.py", line 49, in run
superset_app          |     raise DatasetRefreshFailedError() from ex
superset_app          | superset.commands.dataset.exceptions.DatasetRefreshFailedError: Dataset could not be updated.
superset_app          | 192.168.122.158 - - [22/Aug/2024:18:13:43 +0000] "PUT /api/v1/dataset/39?override_columns=true HTTP/1.1" 500 241 "http://192.168.122.91:8088/sqllab" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36 Edg/127.0.0.0"


dosubot[bot] commented 3 months ago

The issue you're encountering seems to be related to the handling of SQL templating parameters when saving a dataset. The error trace indicates a problem in the _stream_query method in the pydruid/db/api.py file, specifically a KeyError for the errorClass key in the payload.

Here are a few steps to address this issue:

  1. Check SQL Templating Configuration: Ensure that SQL templating is correctly configured in your superset_config.py file. The ENABLE_TEMPLATE_PROCESSING flag should be set to True.

  2. Review the Query: Make sure the query with the parameter is correctly formatted. For example:

    SELECT *
    FROM "sources_v3"
    WHERE "interface_tag_string" like '{{ tag }}'

  3. Error Handling in _stream_query: The error seems to be due to the absence of the errorClass key in the payload. You might need to handle this case in the _stream_query method. Here is a potential fix:

    try:
        msg = "{error} ({errorClass}): {errorMessage}".format(**payload)
    except KeyError:
        msg = "{error}: {errorMessage}".format(**payload)

  4. Debugging the get_columns_description Function: The get_columns_description function in superset/connectors/sqla/utils.py is where the error is initially raised. Ensure that the query is correctly executed and the cursor fetches the data properly.

  5. Dataset Refresh Command: The RefreshDatasetCommand class in superset/commands/dataset/refresh.py is responsible for refreshing the dataset metadata. Ensure that the dataset exists and the user has the necessary permissions.

By addressing these points, you should be able to resolve the issue with saving the dataset containing SQL templating parameters. If the problem persists, further debugging of the specific methods and error handling in the code might be necessary.
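
Note that the fallback in step 3 still assumes the error and errorMessage keys are present. A slightly more defensive variant can be sketched as below. This is only an illustration: format_druid_error is a hypothetical helper, not part of pydruid, and the sample payload is made up to mimic a response that lacks errorClass.

```python
def format_druid_error(payload: dict) -> str:
    """Build an error message without assuming any particular key exists."""
    error = payload.get("error", "Unknown error")
    error_class = payload.get("errorClass")  # may be absent in some responses
    message = payload.get("errorMessage", "no message provided")
    if error_class:
        return f"{error} ({error_class}): {message}"
    return f"{error}: {message}"


# Hypothetical payload without an errorClass key (an assumption for illustration).
sample_payload = {"error": "druidException", "errorMessage": "Invalid SQL"}

# The strict format string from the traceback raises KeyError on this payload.
try:
    "{error} ({errorClass}): {errorMessage}".format(**sample_payload)
except KeyError as exc:
    print(f"strict formatting failed on missing key: {exc}")

# The defensive helper still produces a readable message.
print(format_druid_error(sample_payload))
```

With a helper like this, the underlying database error would reach the Superset log instead of being masked by the KeyError, which should make the real cause of the 400 response easier to see.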


kalimalrazif commented 3 months ago

If I run the query with '{{ tag }}' the query does not run.

kalimalrazif commented 3 months ago

The query runs with {{ tag }} without quotes

rusackas commented 3 months ago

This seems like a duplicate of https://github.com/apache/superset/issues/28600 - let me know if you agree.

rusackas commented 3 months ago

Actually, I'll close this as a duplicate of https://github.com/apache/superset/issues/25786... let's take the conversation (and hopefully a solution!) there.

kalimalrazif commented 3 months ago

Totally agree :-) Thanks :-)