apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

ModuleNotFoundError: No module named 'cryptography.hazmat.backends.openssl.x509' #24203

Closed lemaadi closed 11 months ago

lemaadi commented 1 year ago

Hi there,

I am deploying Superset on an AWS ECS cluster (app, worker, worker beat) using the Docker image apache/superset. It works fine with version 2.0.0, but I need to enable the Embed Dashboard feature, so I upgraded the Docker image to version 2.1.0. Everything works locally, but after building the image, pushing it to AWS ECR, and recreating the services from it, the services never reach a healthy status because of the following error:

Traceback (most recent call last):
--
File "/usr/local/lib/python3.8/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/ggevent.py", line 146, in init_process
super().init_process()
File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base.py", line 134, in init_process
self.load_wsgi()
File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python3.8/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
return self.load_wsgiapp()
File "/usr/local/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python3.8/site-packages/gunicorn/util.py", line 359, in import_app
mod = importlib.import_module(module)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/app/superset/__init__.py", line 21, in <module>
from superset.app import create_app
File "/app/superset/app.py", line 23, in <module>
from superset.initialization import SupersetAppInitializer
File "/app/superset/initialization/__init__.py", line 50, in <module>
from superset.security import SupersetSecurityManager
File "/app/superset/security/__init__.py", line 17, in <module>
from superset.security.manager import SupersetSecurityManager  # noqa: F401
File "/app/superset/security/manager.py", line 66, in <module>
from superset.utils.core import DatasourceName, RowLevelSecurityFilterType
File "/app/superset/utils/core.py", line 72, in <module>
from cryptography.hazmat.backends.openssl.x509 import _Certificate
ModuleNotFoundError: No module named 'cryptography.hazmat.backends.openssl.x509'
[2023-05-24 17:18:32 +0000] [25] [INFO] Worker exiting (pid: 25)
[2023-05-24 17:18:32 +0000] [29] [ERROR] Exception in worker process
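
One way to confirm what the traceback reports is to check, inside the failing container, which `cryptography` version is installed and whether the module that `superset/utils/core.py` imports can still be located. The following is a minimal diagnostic sketch, not part of Superset; the helper names `module_available` and `installed_version` are made up for illustration.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("cryptography version:", installed_version("cryptography"))
    print(
        "openssl.x509 present:",
        module_available("cryptography.hazmat.backends.openssl.x509"),
    )
```

Running this in both the local image and the image deployed to ECS should show whether the two actually contain the same `cryptography` release. If the module is absent only in the deployed image, a later `pip install` step may have pulled in a different version than the one the base image shipped with.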

Note that I am using the following Dockerfile:

FROM  apache/superset:2.1.0

# We switch to root
USER root

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ARG TARGETPLATFORM
ARG TARGETARCH
ARG TARGETOS
ARG TARGETVARIANT
ARG ARCH
# We tell Superset where to find it
ENV SUPERSET_CONFIG_PATH /app/superset_config.py

RUN set -ex \
    && apt-get update \
    && apt-get install -qq -y --no-install-recommends \
    sudo \
    make \
    unzip \
    curl \
    jq \
    git \
    nano \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

ENV TINI_VERSION v0.19.0
RUN curl --show-error --location --output /tini https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini-amd64
RUN chmod +x /tini

RUN case ${TARGETARCH} in arm64|arm/v7) ARCH="aarch64" ;; amd64) ARCH="x86_64"  ;; esac && \
    curl --silent --show-error https://awscli.amazonaws.com/awscli-exe-linux-${ARCH}.zip -o /tmp/awscliv2.zip && \
    curl --silent --show-error --location --output /tmp/amazon-ssm-agent.deb https://s3.eu-central-1.amazonaws.com/amazon-ssm-eu-central-1/latest/debian_${TARGETARCH}/amazon-ssm-agent.deb && \
    ls -ltrh /tmp/ && \
    unzip /tmp/awscliv2.zip && \
    dpkg -i /tmp/amazon-ssm-agent.deb && \
    sudo ./aws/install && \
    rm -rf /tmp/awscliv2.zip

# We install the extra Python dependencies (Prophet prerequisites, then local requirements)
COPY local_requirements.txt ./
# The tqdm specifier is quoted so the shell does not treat ">" as a redirection
RUN pip install --upgrade pip \
    && pip install pystan==2.19.1.1 "tqdm>=4.36.1" pymeeus ujson korean-lunar-calendar hijri-converter ephem convertdate setuptools-git LunarCalendar cmdstanpy \
    && pip install -r local_requirements.txt

RUN usermod -aG sudo superset
RUN mkdir /home/superset && \
    chown -R superset /home/superset

# We add the superset_config.py file to the container
COPY superset_config.py ./
# We add the entrypoint and bootstrap scripts
COPY /docker/superset-entrypoint.sh /app/docker/
COPY /docker/docker-bootstrap.sh /app/docker/
COPY /docker/docker-init.sh /app/docker/
COPY /docker/docker-entrypoint.sh /app/docker/

ADD . ./

RUN chown -R superset:superset /app && \
    chown -R superset:superset /etc/environment

RUN echo "superset ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
# We switch back to the `superset` user
USER superset

ENTRYPOINT ["/tini", "-g", "--","/app/docker/docker-entrypoint.sh"]

- superset_config.py

import logging
import os
from datetime import timedelta
from typing import Optional

from cachelib.file import FileSystemCache
from celery.schedules import crontab
from cachelib.redis import RedisCache

logger = logging.getLogger()

def get_env_variable(var_name: str, default: Optional[str] = None) -> str:
    """Get the environment variable or raise exception."""
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        else:
            error_msg = "The environment variable {} was missing, abort...".format(
                var_name
            )
            raise EnvironmentError(error_msg)

SECRET_KEY = get_env_variable("SUPERSET_SECRET_KEY")
DATABASE_DIALECT = get_env_variable("DATABASE_DIALECT")
DATABASE_USER = get_env_variable("DATABASE_USER")
DATABASE_PASSWORD = get_env_variable("DATABASE_PASSWORD")
DATABASE_HOST = get_env_variable("DATABASE_HOST")
DATABASE_PORT = get_env_variable("DATABASE_PORT")
DATABASE_DB = get_env_variable("DATABASE_DB")

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = "%s://%s:%s@%s:%s/%s" % (
    DATABASE_DIALECT,
    DATABASE_USER,
    DATABASE_PASSWORD,
    DATABASE_HOST,
    DATABASE_PORT,
    DATABASE_DB,
)

REDIS_HOST = get_env_variable("REDIS_HOST")
REDIS_PORT = get_env_variable("REDIS_PORT")
REDIS_CELERY_DB = get_env_variable("REDIS_CELERY_DB", "0")
REDIS_RESULTS_DB = get_env_variable("REDIS_RESULTS_DB", "1")

RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab")

class CeleryConfig(object):
    BROKER_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    CELERY_IMPORTS = ("superset.sql_lab", "superset.tasks")
    CELERY_RESULT_BACKEND = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
    CELERYD_LOG_LEVEL = "DEBUG"
    CELERYD_PREFETCH_MULTIPLIER = 1
    CELERY_ACKS_LATE = False
    CELERYBEAT_SCHEDULE = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=10, hour=0),
        },
    }

CELERY_CONFIG = CeleryConfig
# RESULTS_BACKEND = RedisCache(
#     host=REDIS_HOST, port=REDIS_PORT, key_prefix='superset_results')

# To enable embedding on a Superset instance, the following flags need to be configured as shown below:
FEATURE_FLAGS = {
    "ALERT_REPORTS": True,
    "EMBEDDED_SUPERSET": True
}
WTF_CSRF_ENABLED = False
# CORS Options
ENABLE_CORS = True

CORS_OPTIONS = {
    'supports_credentials': True,
    'allow_headers': ['*'],
    'resources': ['*'],
    'origins': ['*']
}
# Superset roles config
GUEST_ROLE_NAME = "Gamma"
PUBLIC_ROLE_LIKE_GAMMA = True

SESSION_COOKIE_HTTPONLY = False  # Prevent cookie from being read by frontend JS?
SESSION_COOKIE_SECURE = False  # Prevent cookie from being transmitted over non-tls?
SESSION_COOKIE_SAMESITE = "Lax"

ALERT_REPORTS_NOTIFICATION_DRY_RUN = True
WEBDRIVER_BASEURL = "http://superset:8088/"
# The base URL for the email report hyperlinks.
WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL

SQLLAB_CTAS_NO_LIMIT = True

#
# Optionally import superset_config_docker.py (which will have been included on
# the PYTHONPATH) in order to allow for local settings to be overridden
#
try:
    import superset_config_docker
    from superset_config_docker import *  # noqa

    logger.info(
        f"Loaded your Docker configuration at " f"[{superset_config_docker.__file__}]"
    )
except ImportError:
    logger.info("Using default Docker config...")
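
The connection string in this config interpolates `DATABASE_USER` and `DATABASE_PASSWORD` directly, so a password containing characters such as `@`, `:` or `/` would produce a URI that parses incorrectly. Below is a minimal sketch of one way to guard against that using only the standard library. `build_database_uri` is a hypothetical helper, not something Superset provides, and the credentials in the example are invented.

```python
from urllib.parse import quote_plus


def build_database_uri(dialect, user, password, host, port, database):
    """Build a SQLAlchemy-style URI with the user and password percent-encoded."""
    return "{}://{}:{}@{}:{}/{}".format(
        dialect, quote_plus(user), quote_plus(password), host, port, database
    )


# Example with an invented password that contains reserved characters.
uri = build_database_uri(
    "postgresql", "superset", "p@ss:w/rd", "db", "5432", "superset"
)
print(uri)  # postgresql://superset:p%40ss%3Aw%2Frd@db:5432/superset
```

In the config, the result could be assigned to `SQLALCHEMY_DATABASE_URI` in place of the `%`-formatted string, assuming the environment variables hold the raw, unencoded credentials.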

local_requirements.txt

redis==3.5.3
psycopg2==2.9.6
snowflake-sqlalchemy==1.4.7
gunicorn
prophet==1.0.1
gevent==21.8.0
snowflake-connector-python==3.0.3
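
Of the entries in this file, `gunicorn` is the only one without a version pin, and every `pip install -r` run can also pull in newer transitive dependencies than the base image shipped with. As a rough aid, the sketch below flags requirement lines that lack an exact `==` pin. It is a deliberately simplistic line-based check that ignores extras, environment markers, and pip options; `unpinned_requirements` is a hypothetical helper name.

```python
def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r
        if "==" not in line:
            unpinned.append(line)
    return unpinned


requirements = [
    "redis==3.5.3",
    "psycopg2==2.9.6",
    "snowflake-sqlalchemy==1.4.7",
    "gunicorn",
    "prophet==1.0.1",
    "gevent==21.8.0",
    "snowflake-connector-python==3.0.3",
]
print(unpinned_requirements(requirements))  # ['gunicorn']
```

Pinning direct requirements does not pin their dependencies, so comparing `pip freeze` output between the working and failing images is still the more reliable check.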

/docker/docker-bootstrap.sh

#!/usr/bin/env bash

set -eo pipefail

REQUIREMENTS_LOCAL="/app/docker/local_requirements.txt"
# If Cypress run – overwrite the password for admin and export env variables
if [ "$CYPRESS_CONFIG" == "true" ]; then
    export SUPERSET_CONFIG=tests.integration_tests.superset_test_config
    export SUPERSET_TESTENV=true
    export ENABLE_REACT_CRUD_VIEWS=true
    export SUPERSET__SQLALCHEMY_DATABASE_URI=postgresql+psycopg2://superset:superset@db:5432/superset
fi
#
# Make sure we have dev requirements installed
#
if [ -f "${REQUIREMENTS_LOCAL}" ]; then
  echo "Installing local overrides at ${REQUIREMENTS_LOCAL}"
  pip install -r "${REQUIREMENTS_LOCAL}"
else
  echo "Skipping local overrides"
fi

if [[ "${1}" == "worker" ]]; then
  echo "Starting Celery worker..."
  celery --app=superset.tasks.celery_app:app worker -O fair -l INFO
elif [[ "${1}" == "beat" ]]; then
  echo "Starting Celery beat..."
  celery --app=superset.tasks.celery_app:app beat --pidfile /tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule
elif [[ "${1}" == "app" ]]; then
  echo "Starting web app..."
  flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
elif [[ "${1}" == "app-gunicorn" ]]; then
  echo "Starting web app..."
  /app/docker/superset-entrypoint.sh
fi

/docker/superset-entrypoint.sh

#!/bin/bash

set -eo pipefail

if [ "${#}" -ne 0 ]; then
    exec "${@}"
else
    gunicorn \
        --bind  "0.0.0.0:${SUPERSET_PORT}" \
        --access-logfile '-' \
        --error-logfile '-' \
        --workers 10 \
        --worker-class gevent \
        --threads 20 \
        --timeout ${GUNICORN_TIMEOUT:-60} \
        --limit-request-line 0 \
        --limit-request-field_size 0 \
        "${FLASK_APP}"
fi

"$@"

Thank you for your help. I am happy to provide more details if needed.

lemaadi commented 1 year ago

Update:

When using Superset version 2.0.0, I noticed that the deployed app shows Version: 0.0.0dev instead of Version: 2.0.0, which is what I see when launching the app on my local machine.

sfirke commented 1 year ago

1) Have you reviewed the other GitHub issues that come up when you search the issues for cryptography.hazmat.backends.openssl.x509? This has been raised multiple times, so please make sure your case is different (those threads might also help you solve this).

2) If you are seeing version 0.0.0dev that means you upgraded to a commit on Superset's master branch that is not associated with a stable release like 2.1.0.

sfirke commented 11 months ago

Closing this issue as stale. Please feel free to reopen if the problem persists and is unique (not in one of the other issues with this error message).