apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.69k stars 13.83k forks source link

Environment variable SUPERSET_DATABASE_URI #30187

Open Vucko95 opened 2 months ago

Vucko95 commented 2 months ago

Bug description

When user defines SUPERSET_DATABASE_URI, connection between Superset and DB is not automatically set.

I set this env variable SUPERSET_DATABASE_URI=postgresql://superset:superset_password@postgres:5432/superset_db and when I run my Docker-compose , Superset and Posgresql are up but, When I log to Superset UI , Posgresql connection is not automatically set. When I try to test connection manually it is working.(Check the picture, My main goal is skip this manual step)

How to reproduce the bug

`version: '3' services:

superset5: build: context: ./superset dockerfile: dockerfile container_name: superset5 environment:

Screenshots/recordings

image

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

dosubot[bot] commented 2 months ago

Hey @Vucko95! I'm here to help you with any bugs, questions, or contributions you have for Superset. Let's tackle this issue together!

The connection between Superset and the database is not automatically set when the SUPERSET_DATABASE_URI environment variable is defined in the Docker-compose file because the SUPERSET_DATABASE_URI environment variable is not being used in the superset_config.py file. Instead, the connection string is constructed using individual environment variables like DATABASE_DIALECT, DATABASE_USER, DATABASE_PASSWORD, DATABASE_HOST, DATABASE_PORT, and DATABASE_DB.

Here is the relevant code snippet from superset_config.py:

DATABASE_DIALECT = os.getenv("DATABASE_DIALECT")
DATABASE_USER = os.getenv("DATABASE_USER")
DATABASE_PASSWORD = os.getenv("DATABASE_PASSWORD")
DATABASE_HOST = os.getenv("DATABASE_HOST")
DATABASE_PORT = os.getenv("DATABASE_PORT")
DATABASE_DB = os.getenv("DATABASE_DB")

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = (
    f"{DATABASE_DIALECT}://"
    f"{DATABASE_USER}:{DATABASE_PASSWORD}@"
    f"{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
)

To use the SUPERSET_DATABASE_URI environment variable directly, you would need to modify the superset_config.py file to use it:

SQLALCHEMY_DATABASE_URI = os.getenv("SUPERSET_DATABASE_URI")

This change would allow the SUPERSET_DATABASE_URI environment variable to directly set the connection string [1].

To continue talking to Dosu, mention @dosu.

Vucko95 commented 2 months ago

@dosu I have added line to superset_config.py My ENV VAR: SQLALCHEMY_DATABASE_URI = "postgresql://superset:superset_password@postgres:5432/superset_db"

Still when containers are up, I have to connect Superset and PostgreSQL manually, its not automatically connected.

dosubot[bot] commented 2 months ago

Adding the SQLALCHEMY_DATABASE_URI to superset_config.py is a good step, but it seems the connection is still not being established automatically. Here are a few things to check:

  1. Ensure superset_config.py is Loaded: Make sure that your superset_config.py file is in the PYTHONPATH and is being correctly loaded by Superset. You can verify this by adding a print statement or logging in the superset_config.py to confirm it is being executed.

  2. Database Initialization: Superset requires the database to be initialized. Ensure that the initialization commands are run after the containers are up. You can add the following commands to your Docker setup to initialize the database:

    superset db upgrade
    superset init
  3. Environment Variables in Docker: If you are using Docker, ensure that the environment variables are correctly set in the Docker environment. You can add the environment variable directly in your Dockerfile or Docker Compose file.

  4. Check for Errors: Look at the Superset logs for any errors related to database connections. This can provide more insight into why the connection is not being established automatically.

Here is an example of how you might modify your Docker Compose file to include the initialization steps:

version: '3'
services:
  superset5:
    build:
      context: ./superset
      dockerfile: dockerfile
    container_name: superset5
    environment:
      - ADMIN_USERNAME=admin
      - ADMIN_EMAIL=admin@superset.com
      - ADMIN_PASSWORD=admin
      - SUPERSET_SECRET_KEY=super_ssecret 
      - SQLALCHEMY_DATABASE_URI=postgresql://superset:superset_password@postgres:5432/superset_db
    ports:
      - '8088:8088'
    depends_on:
      - postgres
    command: >
      /bin/bash -c "
      superset db upgrade &&
      superset init &&
      superset run -p 8088 --with-threads --reload --debugger
      "

  postgres:
    image: postgres:14
    container_name: postgres
    environment:
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset_password
      POSTGRES_DB: superset_db
    ports:
      - '5432:5432'
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

volumes:
  postgres_data:

This setup ensures that the database is upgraded and initialized before Superset starts running.

To continue talking to Dosu, mention @dosu.

Vucko95 commented 2 months ago

@dosu Thank you adding this line to script solved the issue. superset set_database_uri --database_name my_postgres --uri "$SUPERSET_DATABASE_URI"

dosubot[bot] commented 2 months ago

The issue might be due to the superset-init.sh script not setting up the database connection automatically. The superset_config.py file should be correctly configured, but the connection setup might still require manual intervention.

To ensure the connection is set up automatically, you can modify the superset-init.sh script to include the database connection setup. Here’s how you can do it:

  1. Modify superset-init.sh: Add a step to create the database connection using the Superset CLI.
#!/bin/bash

# Create Admin user
superset fab create-admin --username "$ADMIN_USERNAME" --firstname Superset --lastname Admin --email "$ADMIN_EMAIL" --password "$ADMIN_PASSWORD"

# Upgrading Superset metastore
superset db upgrade

# Setup roles and permissions
superset init

# Create database connection
superset set_database_uri --database_name my_postgres --uri "$SQLALCHEMY_DATABASE_URI"

# Starting server
/bin/sh -c /usr/bin/run-server.sh
  1. Ensure Environment Variables: Make sure the SQLALCHEMY_DATABASE_URI environment variable is correctly set in your Dockerfile or Docker Compose file.

  2. Update Dockerfile: Ensure the superset-init.sh script is executable and the environment variables are correctly passed.

FROM apache/superset:latest
USER root
RUN apt-get -y update
RUN pip install pip 
RUN pip install sqlalchemy-redshift
RUN pip install psycopg2
ENV ADMIN_USERNAME $ADMIN_USERNAME
ENV ADMIN_EMAIL $ADMIN_EMAIL
ENV ADMIN_PASSWORD $ADMIN_PASSWORD
ENV SQLALCHEMY_DATABASE_URI "postgresql://superset:superset_password@postgres:5432/superset_db"
COPY ./superset-init.sh /superset-init.sh
RUN chmod +x /superset-init.sh

COPY superset_config.py /app/
ENV SUPERSET_CONFIG_PATH /app/superset_config.py

USER superset
ENTRYPOINT [ "/superset-init.sh" ]

This setup ensures that the database connection is created automatically during the initialization process. The superset set_database_uri command sets the database connection URI, which should eliminate the need for manual setup through the UI.

To continue talking to Dosu, mention @dosu.

rusackas commented 2 months ago

@Vucko95 glad to hear it worked out. Do you think we need to add that fix (superset set_database_uri --database_name my_postgres --uri "$SQLALCHEMY_DATABASE_URI") somewhere in the documentation? Or is there anything else needed to close this one out?