apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.04k stars 13.61k forks source link

Queries from charts for async databases are being executed in the server instances, not the workers #22030

Open diogenes1oliveira opened 1 year ago

diogenes1oliveira commented 1 year ago

Charts for databases with Asynchronous query execution aren't being executed in the worker nodes as expected.

How to reproduce the bug

  1. Add a datasource in a setup with Async Queries configured via Celery;
  2. Enable the following options in Advanced > Other > Engine Parameters:
    {"pool_pre_ping":true,"echo_pool":"debug","echo":true}
  3. Enable the option Advanced > Performance > Asynchronous query execution;
  4. Add a sample chart in a dashboard.
  5. Check the logs of both the webserver and the worker instances;

Expected results

I expected the queries to be executed in the worker instance. For instance, when running through SQL Lab that's what happens. The unexpected behavior occurs when using dashboards.

Actual results

We see the connection being established in the webserver instance, not the worker (1.2.3.4 stands for the IP of my database with data to be charted, not the Superset database)

pod-app-hostname 2022-11-03 19:19:30,871 DEBUG sqlalchemy.pool.impl.NullPool Created new connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Created new connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SHOW VARIABLES LIKE 'sql_mode' pod-app-hostname 2022-11-03 19:19:30,873 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode' pod-app-hostname 2022-11-03 19:19:30,873 INFO sqlalchemy.engine.base.Engine () pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,877 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'lower_case_table_names' pod-app-hostname 2022-11-03 19:19:30,878 INFO sqlalchemy.engine.base.Engine () pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SHOW VARIABLES LIKE 'lower_case_table_names' pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SELECT DATABASE() pod-app-hostname 2022-11-03 19:19:30,881 INFO sqlalchemy.engine.base.Engine SELECT DATABASE() pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,882 INFO sqlalchemy.engine.base.Engine () pod-app-hostname 2022-11-03 19:19:30,884 INFO sqlalchemy.engine.base.Engine show collation where Charset = 'utf8mb4' and Collation = 'utf8mb4_bin' pod-app-hostname INFO:sqlalchemy.engine.base.Engine:show collation where Charset = 'utf8mb4' and Collation = 'utf8mb4_bin' pod-app-hostname 2022-11-03 19:19:30,885 INFO sqlalchemy.engine.base.Engine () pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,887 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1 pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1 pod-app-hostname 2022-11-03 19:19:30,887 INFO sqlalchemy.engine.base.Engine () pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,889 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1 pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1 pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,889 INFO sqlalchemy.engine.base.Engine () pod-app-hostname INFO:sqlalchemy.engine.base.Engine:SELECT CAST('test collated returns' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin AS anon_1 pod-app-hostname INFO:sqlalchemy.engine.base.Engine:() pod-app-hostname 2022-11-03 19:19:30,891 INFO sqlalchemy.engine.base.Engine SELECT CAST('test collated returns' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin AS anon_1 pod-app-hostname 2022-11-03 19:19:30,891 INFO sqlalchemy.engine.base.Engine () pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> checked out from pool pod-app-hostname 2022-11-03 19:19:30,895 DEBUG sqlalchemy.pool.impl.NullPool Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> checked out from pool pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Pool pre-ping on connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> pod-app-hostname 2022-11-03 19:19:30,895 DEBUG sqlalchemy.pool.impl.NullPool Pool pre-ping on connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> pod-app-hostname 2022-11-03 19:19:30,901 DEBUG sqlalchemy.pool.impl.NullPool Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> being returned to pool pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> being returned to pool pod-app-hostname 2022-11-03 19:19:30,902 DEBUG sqlalchemy.pool.impl.NullPool Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> rollback-on-return pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> rollback-on-return pod-app-hostname 2022-11-03 19:19:30,903 DEBUG sqlalchemy.pool.impl.NullPool Closing connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50> pod-app-hostname DEBUG:sqlalchemy.pool.impl.NullPool:Closing connection <_mysql.connection open to '1.2.3.4' at 0x55beb238bc50>

Screenshots

If applicable, add screenshots to help explain your problem.

Environment

(please complete the following information):

Checklist

Make sure to follow these steps before submitting your issue - thank you!

Additional context

I'm using an image derived from apache/superset:2.0.0 and tried to connect both to Hive and to MySQL.

unnyns-307 commented 1 year ago

Hi, we are facing the same issue. @diogenes1oliveira did you find any fix or workaround?

dingbat commented 12 months ago

Enabling AQE for a database only applies to SQL Lab - for charts you'll need to enable the GLOBAL_ASYNC_QUERIES feature flag and add corresponding config (see docs here: https://github.com/apache/superset/blob/master/CONTRIBUTING.md#async-chart-queries)

shauryagoel commented 7 months ago

Hi @dingbat, is there any doc or guide which shows how to use GLOBAL_ASYNC_QUERIES? I tried setting this to true, but, I started getting getting jwt token error. After setting GLOBAL_ASYNC_QUERIES_JWT_SECRET to some random 32 bit number, I am getting these types of logs continuously- 10.160.15.225 - - [15/Feb/2024:06:59:18 +0000] "GET /api/v1/async_event/ HTTP/1.1" 403 24 "http://IP/dashboard/45/?native_filters_key=RkL3s6EjCpRaxau_afgK1VBnmRh5Qx1pfj8z4UlRXoDyVCKAq6IWEhlVxvXEbzu5" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

jturkar commented 4 months ago

Hi @shauryagoel, did you fix the issue related to setting up the GLOBAL_ASYNC_QUERIES flag in superset_config.py file for charts asynchronous query executions task in Celery? If yes, can I know the solution?

talfour commented 3 months ago

Hi @shauryagoel, did you fix the issue related to setting up the GLOBAL_ASYNC_QUERIES flag in superset_config.py file for charts asynchronous query executions task in Celery? If yes, can I know the solution?

You have to set same value for GLOBAL_ASYNC_QUERIES_REDIS_CONFIG in superset_config.py as in config.json that is used by superset-websocket For example:

GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
    "port": 6379,
    "host": "redis",
    "db": 1,
    "ssl": False
}
Tim-Hodge commented 1 month ago

Also wondering if there is a guide on using GLOBAL_ASYNC_QUERIES and monitoring/tuning async query performance. It looks like the docs at https://github.com/apache/superset/blob/master/CONTRIBUTING.md#async-chart-queries were moved.

rusackas commented 1 month ago

@Tim-Hodge the GAQ feature was effectively in beta, and is about to go under a revamp. See SIP-143 for more details/discussion/voting (CC @villebro re: revisiting docs as part of the SIP... happy to help here if needed).