apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Async queries and Cache not working as expected #30650

Open jorgehn52 opened 5 days ago

jorgehn52 commented 5 days ago

Bug description

Description

My Superset instance is not working properly; the async queries and cache configuration seems to be misbehaving. When I open Superset for the first time, all dashboards stay stuck on loading for however long I remain on the page. If I then switch to Datasets, or apply a different filter value, and come back to the original dashboard, it loads, which makes it look like the cache is only working intermittently; sometimes the cached data also seems to be wrong.

If I take one of the queries from the dashboard and run it in SQL Lab, it works correctly and takes approximately 3 seconds, but in the dashboard it stays loading forever.

Evidence

Environment

Values file

(used by helm chart)

superset:
  imagePullSecrets:
    - name: artifactory
  resources:
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    # The limits below will apply to all Superset components. To set individual resource limitations refer to the pod specific values below.
    # The pod specific values will overwrite anything that is set here.
    limits:
      cpu: 1000m
      memory: 2000Mi
    requests:
      cpu: 500m
      memory: 1000Mi

  image:
    repository: (edited)
    tag: 4.0.2
    pullPolicy: "IfNotPresent"
    pullSecrets:
      - artifactory

  initImage:
    repository: (edited)/jwilder/dockerize
    tag: "latest"
    pullPolicy: "IfNotPresent"
    pullSecrets:
      - artifactory

  postgresql:
    # Skip to use own managed instance instead
    enabled: false
    image:
      registry: (edited)
      repository: bitnami/postgresql
      tag: "14.6.0-debian-11-r13"
      pullPolicy: "IfNotPresent"
      pullSecrets:
        - artifactory

  redis:
    image:
      registry: (edited)
      repository: bitnami/redis
      tag: 7.2.5-debian-12-r2
      pullPolicy: "IfNotPresent"
      pullSecrets:
        - artifactory

  extraEnv:
    AZURE_APPLICATION_ID: (edited)
    AZURE_TENANT_ID: (edited)
    SMTP_HOST: (edited)
    SMTP_MAIL_FROM: (edited)
    SMTP_PASSWORD: null
    SMTP_PORT: 25
    SMTP_USER: null
    DB_SCHEMA: superset
    EMAIL_REPORTS_SUBJECT_PREFIX:
      -  Metrics Report
    TALISMAN_ENABLED: false

  configOverrides:
    enable_oauth: |
      # This will make sure the redirect_uri is properly computed, even with SSL offloading
      ENABLE_PROXY_FIX = True

      from flask_appbuilder.security.manager import AUTH_OAUTH
      AUTH_TYPE = AUTH_OAUTH
      OAUTH_PROVIDERS = [
          {
              "name": "azure",
              "icon": "fa-windows",
              "token_key": "access_token",
              "remote_app": {
                  "client_id": os.getenv("AZURE_APPLICATION_ID"),
                  "client_secret": os.getenv("AZURE_SECRET"),
                  "api_base_url": f"https://login.microsoftonline.com/{env('AZURE_TENANT_ID')}/oauth2",
                  "client_kwargs": {
                    "scope": "User.read name preferred_username email profile upn",
                    "resource": os.getenv("AZURE_APPLICATION_ID"),
                  },
                  "request_token_url": None,
                  "access_token_url": f"https://login.microsoftonline.com/{env('AZURE_TENANT_ID')}/oauth2/token",
                  "authorize_url": f"https://login.microsoftonline.com/{env('AZURE_TENANT_ID')}/oauth2/authorize",
              },
          }
      ]
      HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
        "attributes": {
            "*": ["style","className", "class"],
        },
        "tagNames": ["style"]
      }

      # Map Authlib roles to superset roles
      AUTH_ROLE_ADMIN = 'Admin'
      AUTH_ROLE_PUBLIC = 'Public'

      # Will allow user self registration, allowing to create Flask users from Authorized User
      AUTH_USER_REGISTRATION = True

      # The default user self registration role
      AUTH_USER_REGISTRATION_ROLE = "Gamma"

    extra_features: |
      import ast
      CORS_OPTIONS = {
        "supports_credentials": True,
        "allow_headers": ['*'],
        "resources": ['*'],
        "origins": [
          "http://localhost:8088", 
          "http://localhost:8888",
          "https:\/\/.*\.(edited)\.com"
        ]
      }
      FEATURE_FLAGS = {
        "ALERT_REPORTS": True,
        "DASHBOARD_RBAC": True,
        "EMBEDDABLE_CHARTS": True,
        "EMBEDDED_SUPERSET": True,
        "HORIZONTAL_FILTER_BAR": True,
        "DRILL_BY": True,
        "DRILL_TO_DETAIL": True,
        "ENABLE_TEMPLATE_PROCESSING": True,
        "DASHBOARD_VIRTUALIZATION": True,
        "GLOBAL_ASYNC_QUERIES": True
      }

    celery_conf: |
      from celery.schedules import crontab

      class CeleryConfig(object):
        broker_url = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
        imports = ('superset.sql_lab', "superset.tasks.scheduler", )
        result_backend = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
        worker_prefetch_multiplier = 10
        task_acks_late = True
        task_annotations = {
            'sql_lab.get_sql_results': {
                'rate_limit': '100/s',
            },
        }
      CELERY_CONFIG = CeleryConfig

      GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.getenv("JWT_SECRET")
      GLOBAL_ASYNC_QUERIES_TRANSPORT: 'polling'
      GLOBAL_ASYNC_QUERIES_POLLING_DELAY: 50

      from flask_caching.backends.rediscache import RedisCache
      RESULTS_BACKEND = RedisCache(
          host=f"{env('REDIS_HOST')}", port=f"{env('REDIS_PORT')}", key_prefix='superset_results')
      RESULTS_BACKEND_USE_MSGPACK = True

      CACHE_NO_NULL_WARNING = True
      CACHE_CONFIG = {
          'CACHE_TYPE': 'RedisCache',
          'CACHE_DEFAULT_TIMEOUT': 86400,
          'CACHE_KEY_PREFIX': 'superset_cache',
          'CACHE_REDIS_URL': f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      }
      DATA_CACHE_CONFIG = {
          'CACHE_TYPE': 'RedisCache',
          'CACHE_DEFAULT_TIMEOUT': 86400,
          'CACHE_KEY_PREFIX': 'superset_results',
          'CACHE_REDIS_URL': f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      }

      # WebDriver configuration
      EMAIL_PAGE_RENDER_WAIT = 60
      WEBDRIVER_BASEURL = "http://{{ template "superset.fullname" . }}:{{ .Values.service.port }}/"
      WEBDRIVER_BASEURL_USER_FRIENDLY = "${hostname}"

    smtp: |
      import ast
      SMTP_HOST = os.getenv("SMTP_HOST")
      SMTP_PORT = os.getenv("SMTP_PORT")
      SMTP_STARTTLS = ast.literal_eval(os.getenv("SMTP_STARTTLS", "True"))
      SMTP_SSL = ast.literal_eval(os.getenv("SMTP_SSL", "False"))
      SMTP_USER = os.getenv("SMTP_USER")
      SMTP_PASSWORD = os.getenv("SMTP_PASSWORD")
      SMTP_MAIL_FROM = os.getenv("SMTP_MAIL_FROM")
      EMAIL_REPORTS_SUBJECT_PREFIX = os.getenv("EMAIL_REPORTS_SUBJECT_PREFIX")

    general: |
      SQLALCHEMY_DATABASE_URI = f"postgresql+psycopg2://{env('DB_USER')}:{env('DB_PASS')}@{env('DB_HOST')}:{env('DB_PORT')}/{env('DB_NAME')}?options=-csearch_path={env('DB_SCHEMA')}"
      SQLALCHEMY_POOL_SIZE = 20
      SQLALCHEMY_MAX_OVERFLOW = 40
      SQLALCHEMY_POOL_TIMEOUT = 180

    secret: |
      SECRET_KEY = os.getenv("SECRET_KEY")
      PREVIOUS_SECRET_KEY = os.getenv("PREVIOUS_SECRET_KEY")

  bootstrapScript: |
    #!/bin/bash
    rm -rf /var/lib/apt/lists/* && \
    pip install \
    psycopg2-binary==2.9.1 \
    Authlib && \
    if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi

  supersetWorker:
    replicaCount: 2
    command:
      - /bin/sh
      - -c
      - |
        # Install firefox webdriver
        # See https://github.com/apache/superset/blob/4fa3b6c7185629b87c27fc2c0e5435d458f7b73d/docs/src/pages/docs/installation/email_reports.mdx
        apt update
        apt-get install --no-install-recommends -y firefox-esr wget

        wget -q https://github.com/mozilla/geckodriver/releases/download/v${geckodriver_version}/geckodriver-v${geckodriver_version}-linux64.tar.gz
        tar -x geckodriver -zf geckodriver-v${geckodriver_version}-linux64.tar.gz -O > /usr/bin/geckodriver
        chmod 755 /usr/bin/geckodriver
        rm geckodriver-v${geckodriver_version}-linux64.tar.gz

        # Run
        . {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4
    forceReload: true

  ingress:
    enabled: true
    ingressClassName: nginx
    annotations:
      "cert-manager.io/cluster-issuer": "(edited)"
    path: /
    pathType: ImplementationSpecific
    extraHostsRaw: []

  # Superset beat configuration (to trigger scheduled jobs like reports)
  supersetCeleryBeat:
    # -- This is only required if you intend to use alerts and reports
    enabled: true

Screenshots/recordings

Dashboard stuck: loading-dashboard-2024-10-18 145039

Correct connectivity to database: query-working-2024-10-18 130815

Superset version

4.0.2

Python version

3.10

Node version

I don't know

Browser

Chrome

Additional context

No response


dosubot[bot] commented 5 days ago

Hey there, @jorgehn52! 👋 I'm here to help you with any bugs, questions, or even becoming a contributor. Let's squash those bugs together!

I found a similar unsolved discussion regarding async queries in Superset, which might be related to the issue you're experiencing. You can check out the discussion titled Async Queries in Dashboards & Charts [1].

Additionally, there was a closed issue titled Async queries sometimes fail when retrieving cached query results, which was addressed by a pull request that aimed to fix unintended cache misses with async queries [2]. This might provide some insights or solutions relevant to your problem.

For your specific issue, ensure that your Redis configuration is correctly set up in the superset_config.py file. Here are some key configurations to check:

  1. Cache Configuration: Ensure that the CACHE_CONFIG and DATA_CACHE_CONFIG are correctly pointing to your Redis instance and that the CACHE_TYPE is set to 'RedisCache'.

  2. Async Queries Configuration: Verify that the GLOBAL_ASYNC_QUERIES feature flag is enabled and that the GLOBAL_ASYNC_QUERIES_TRANSPORT is set to 'polling' or 'ws' as per your setup.

  3. Celery Configuration: Make sure that the Celery broker and result backend are correctly configured to use Redis, and that the Celery workers are running properly.

  4. JWT Secret: Ensure that the GLOBAL_ASYNC_QUERIES_JWT_SECRET is set correctly, as this is crucial for async query authentication.

If the issue persists, you might want to look into the specific error messages or logs related to async queries and caching in Superset 4.0.2, which can provide more detailed insights into what might be going wrong [3][4].
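
Pulling the four points above together, here is a minimal superset_config.py-style sketch of what that checklist looks like when everything points at one Redis instance. The service name superset-redis-headless, the DB number, and the JWT_SECRET environment variable are placeholders for this environment, not confirmed values:

import os
from flask_caching.backends.rediscache import RedisCache

REDIS_URL = "redis://superset-redis-headless:6379/0"  # placeholder service name and DB

# 1. Metadata and chart caches (flask-caching dictionaries)
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 86400,
    "CACHE_KEY_PREFIX": "superset_cache",
    "CACHE_REDIS_URL": REDIS_URL,
}
DATA_CACHE_CONFIG = {**CACHE_CONFIG, "CACHE_KEY_PREFIX": "superset_results"}

# 2. Async queries: feature flag plus transport ("polling" or "ws")
FEATURE_FLAGS = {"GLOBAL_ASYNC_QUERIES": True}
GLOBAL_ASYNC_QUERIES_TRANSPORT = "polling"

# 3. Celery broker and result backend on the same Redis instance
class CeleryConfig:
    broker_url = REDIS_URL
    result_backend = REDIS_URL
    imports = ("superset.sql_lab", "superset.tasks.scheduler")

CELERY_CONFIG = CeleryConfig
RESULTS_BACKEND = RedisCache(host="superset-redis-headless", port=6379, key_prefix="superset_results")

# 4. JWT secret for async query authentication (use a long random string)
GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.environ["JWT_SECRET"]  # placeholder env var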

To continue talking to Dosu, mention @dosu.

jorgehn52 commented 2 days ago

Hello @dosu. Yes, those configurations are in place, and it is still not working. As an example: I have a dashboard that takes ~20 seconds to load, and within that dashboard there are charts whose SQL queries take only 3 seconds. That is why I was configuring async queries, hoping that would resolve it.

I also noticed that setting "GLOBAL_ASYNC_QUERIES" to True makes the charts in the dashboard load forever. EDIT: I also noticed that, even after making sure that redis-headless is up and configured, some calls are trying to reach Redis at "127.0.0.1:6379".

padbk commented 1 day ago

Can't see GLOBAL_ASYNC_QUERIES_REDIS_CONFIG in your config. Try adding it and pointing it at your Redis instance.

https://github.com/apache/superset/blob/f11fa091e261a35f4d39d8567a859fad07547d84/superset/config.py#L1536
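
For context, the default that the linked line defines looks roughly like the block below (reproduced from memory, so check the linked source for the exact values in your version). If the key is never overridden, the async-queries machinery falls back to a local Redis, which would explain the connections to 127.0.0.1:6379 reported above:

# Approximate default from superset/config.py; leaving it unset means async
# queries try to reach a Redis on localhost.
GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
    "port": 6379,
    "host": "127.0.0.1",
    "password": "",
    "db": 0,
    "ssl": False,
}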

dosubot[bot] commented 1 day ago

It seems like there might be a misconfiguration in your Redis setup. Superset determines the Redis host and port using the CACHE_REDIS_HOST and CACHE_REDIS_PORT configuration keys. If these are not explicitly set, it defaults to "redis" for the host and 6379 for the port [1][2].

To resolve the issue, ensure that your Redis configuration in superset_config.py or your Helm chart values file is correctly pointing to the Redis service you intend to use. Specifically, check that the CACHE_REDIS_HOST and CACHE_REDIS_PORT are set to the correct values for your environment, rather than defaulting to 127.0.0.1:6379.

Additionally, verify that the GLOBAL_ASYNC_QUERIES_REDIS_CONFIG is also correctly configured to point to your Redis instance. This should help ensure that Superset is connecting to the correct Redis server for caching and async queries.

To continue talking to Dosu, mention @dosu.

padbk commented 1 day ago

@dosu CACHE_REDIS_HOST and CACHE_REDIS_PORT only apply on latest master, where GLOBAL_ASYNC_QUERIES_CACHE_BACKEND is the new setting to use.

For 4.1.0 and below, GLOBAL_ASYNC_QUERIES_REDIS_CONFIG is needed.
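
As a quick sketch of the two variants described here: the GLOBAL_ASYNC_QUERIES_REDIS_CONFIG form is the one 4.x uses, while the key names shown under GLOBAL_ASYNC_QUERIES_CACHE_BACKEND are an assumption about current master and should be verified against config.py on that branch:

import os

REDIS_HOST = os.getenv("REDIS_HOST", "superset-redis-headless")  # placeholder default
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))

# Superset 4.1.0 and below: point async queries at Redis explicitly.
GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
    "host": REDIS_HOST,
    "port": REDIS_PORT,
    "db": 0,
    "password": "",
    "ssl": False,
}

# Current master (assumed shape): a flask-caching style backend definition
# replaces the raw Redis dict. Key names here are an assumption, not verified.
GLOBAL_ASYNC_QUERIES_CACHE_BACKEND = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_REDIS_HOST": REDIS_HOST,
    "CACHE_REDIS_PORT": REDIS_PORT,
    "CACHE_REDIS_DB": 0,
}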

jorgehn52 commented 1 day ago

Hello @padbk. Thank you for your message and suggestion.

So I added the portion I was missing, and also enabled debug mode on the worker. In the UI it still seems to take ~20 seconds to load, but I can see in the logs that "load_chart_data_into_cache" is also taking a long time, ~20 seconds for some of the charts.

(worker log screenshots showing load_chart_data_into_cache timings)

Also, I tested with versions 4.0.1 and 4.0.2, using the same 0.12.11 helm chart version. This is how my values file looks now.

FEATURE_FLAGS = {
  "ALERT_REPORTS": True,
  "DASHBOARD_RBAC": True,
  "EMBEDDABLE_CHARTS": True,
  "EMBEDDED_SUPERSET": True,
  "HORIZONTAL_FILTER_BAR": True,
  "DRILL_BY": True,
  "DRILL_TO_DETAIL": True,
  "ENABLE_TEMPLATE_PROCESSING": True,
  "DASHBOARD_VIRTUALIZATION": False,
  "GLOBAL_ASYNC_QUERIES": True
}

celery_conf: |
  from celery.schedules import crontab
  import redis

  REDIS_HOST = os.getenv("REDIS_HOST")
  REDIS_PORT = os.getenv("REDIS_PORT")
  REDIS_CELERY_DB = os.getenv("REDIS_CELERY_DB")

  class CeleryConfig(object):
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    imports = ('superset.sql_lab', "superset.tasks.thumbnails", "superset.tasks.scheduler",)
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    worker_prefetch_multiplier = 10
    task_acks_late = True
    task_annotations = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
    }
  CELERY_CONFIG = CeleryConfig

  CACHE_CONFIG = {
        'CACHE_TYPE': 'RedisCache',
        'CACHE_DEFAULT_TIMEOUT': 900,
        'CACHE_KEY_PREFIX': 'superset_results',
        'CACHE_REDIS_URL': f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}",
  }

  GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.getenv("JWT_SECRET")
  GLOBAL_ASYNC_QUERIES_TRANSPORT: 'polling'
  GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
      "port": f"{REDIS_PORT}",
      "host": f"{REDIS_HOST}",
      "password": "",
      "db": f"{REDIS_CELERY_DB}",
      "ssl": False,
  }

  from flask_caching.backends.rediscache import RedisCache
  RESULTS_BACKEND = RedisCache(host=f"{REDIS_HOST}", port=f"{REDIS_PORT}", key_prefix='superset_results')

  RESULTS_BACKEND_USE_MSGPACK = True
  CACHE_NO_NULL_WARNING = True


Is it OK to use REDIS_CELERY_DB, which is 0 by default, for every config? During installation, Redis is only configured with the REDIS_DB=1 and REDIS_CELERY_DB=0 env values.

Do you have any other recommendations, or do you see something odd in my config? It has changed because of all the posts I have seen. Thanks.
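
For illustration only, a hedged sketch of how the two env values from the installation could be split across the settings discussed in this thread, with Celery on DB 0 and the caches plus async-query state on DB 1. Whether to separate them at all is a judgement call (see the reply below about using a single DB for both), and the host default is a placeholder:

import os

REDIS_HOST = os.getenv("REDIS_HOST", "superset-redis-headless")  # placeholder
REDIS_PORT = os.getenv("REDIS_PORT", "6379")

CELERY_REDIS_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"  # REDIS_CELERY_DB
CACHE_REDIS_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/1"   # REDIS_DB

class CeleryConfig:
    broker_url = CELERY_REDIS_URL
    result_backend = CELERY_REDIS_URL

CELERY_CONFIG = CeleryConfig

CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_KEY_PREFIX": "superset_cache",
    "CACHE_REDIS_URL": CACHE_REDIS_URL,
}

GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
    "host": REDIS_HOST,
    "port": REDIS_PORT,
    "db": 1,
    "password": "",
    "ssl": False,
}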

dosubot[bot] commented 1 day ago

To address the issue with async queries and caching, ensure that you have the GLOBAL_ASYNC_QUERIES_REDIS_CONFIG properly set in your configuration for Superset 4.0.2. This configuration should point to your Redis instance, ensuring that Superset is not defaulting to 127.0.0.1:6379.

Here's an example of how you might configure it:

GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {
    'host': 'your-redis-host',
    'port': 6379,
    'db': 0,
    'password': 'your-redis-password',  # if applicable
}

Make sure that this configuration is included in your superset_config.py or your Helm chart values file. This should help Superset connect to the correct Redis server for handling async queries and caching effectively. If you continue to experience issues, double-check the network connectivity and firewall settings to ensure that Superset can reach the Redis server at the specified host and port.
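
To double-check connectivity from the Superset pods, here is a small sketch using the redis-py client (typically available in the Superset image via the Celery Redis dependency); the host and port are placeholders and should match whatever the cache and async-query settings resolve to:

import os
import redis

r = redis.Redis(
    host=os.getenv("REDIS_HOST", "superset-redis-headless"),  # placeholder
    port=int(os.getenv("REDIS_PORT", "6379")),
    db=0,
)
print(r.ping())  # True means this pod can reach the configured Redis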

To continue talking to Dosu, mention @dosu.

padbk commented 1 day ago

We use the same Redis DB for both.