apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Docker image fails to run after #27505 #27573

Open rscarborough1996 opened 7 months ago

rscarborough1996 commented 7 months ago

Bug description

The Docker image apache/superset:56a6660 (built from the merge commit of #27505) fails to run because of missing dependencies.

Adding these lines to the Dockerfile fixes all of the issues:

RUN pip install psycopg2
RUN pip install -U flask-cors
RUN pip install Pillow
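
To confirm which of these three packages are actually missing in a given image, a small check like the one below could be run with the container's Python. This is a minimal sketch, not part of Superset: the function name `missing_packages` and the `REQUIRED` mapping are made up for illustration, and the only facts it relies on are the standard import names of the three packages (`psycopg2`, `flask_cors`, `PIL`).

```python
import importlib.util

# pip package name -> importable module name for the three packages above.
# This mapping is an illustrative assumption; extend it as needed.
REQUIRED = {
    "psycopg2": "psycopg2",
    "flask-cors": "flask_cors",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose top-level module cannot be located."""
    return sorted(
        pip_name
        for pip_name, module_name in required.items()
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that this only checks whether each module can be found, not whether the installed version is recent enough, so it would not explain why `flask-cors` needed the `-U` upgrade.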

How to reproduce the bug

Try to run the apache/superset:56a6660 Docker image.

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response


dianper commented 7 months ago

I had the same problem, but after installing these dependencies it works.

mistercrunch commented 7 months ago

Curious to get more context. I'm guessing you are using the lean image with Postgres (?)

From my understanding, this didn't work before and still wouldn't work today, but I'm guessing I'm wrong given the report.

rscarborough1996 commented 7 months ago

This is the image on Dockerhub that I am using: https://hub.docker.com/layers/apache/superset/56a6660/images/sha256-c29c2a221d0cf9e49e56bbc1ceb9a7b1bf9693d74fd8300b8181db0ebe28fdde?context=explore

I believe this is a lean image, and I am using Postgres. As I said, this image was built from Superset at the point where #27505 was merged. Before this, the dependencies seemed to be included. For reference, this is an example of my Dockerfile:

FROM apache/superset:56a6660
USER root

# These dependencies now need to be installed
RUN pip install psycopg2
RUN pip install -U flask-cors
RUN pip install Pillow

RUN pip install pymssql
ADD ./superset_config.py /app/pythonpath/
RUN chown superset:superset /app/pythonpath/superset_config.py

USER superset

mistercrunch commented 7 months ago

The lean image does not have Postgres drivers, and didn't have them in the past either, AFAIK. I'm debating whether we should have a fit and maybe a fat image to complement lean. Fit would add the Postgres driver and maybe the top 5-10 drivers on top of lean. More on tags and how they relate to layers here: https://superset.apache.org/docs/installation/docker

rscarborough1996 commented 7 months ago

All I can say for certain is that I have not needed to install those additional dependencies up to this point. I use 3.1.0 for most deployments, and those pip installs are not included in the Dockerfile. I was testing something on a newer commit when I ran into this issue, and I tracked the change back to #27505. Running from the previous merge commit's image (apache/superset:f4bdcb5) does not require the pip installs. I encourage you to try this for yourself.
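
One way to check this comparison is to capture `pip freeze` from both images and diff the package names. The sketch below assumes the two outputs have already been saved as text (for example with something like `docker run --rm --entrypoint pip apache/superset:f4bdcb5 freeze`, which I have not verified against these exact images). The function names and the sample data are hypothetical, and the version numbers are invented purely for the demonstration.

```python
def parse_freeze(text):
    """Map lowercased package name -> version from `pip freeze` output.

    Only plain `name==version` lines are parsed; comments, editable
    installs, and direct-URL entries are skipped in this sketch.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.strip().lower()] = version.strip()
    return packages


def dropped_packages(old_freeze, new_freeze):
    """Return names present in the old image's freeze but absent in the new."""
    old = parse_freeze(old_freeze)
    new = parse_freeze(new_freeze)
    return sorted(set(old) - set(new))


if __name__ == "__main__":
    # Hypothetical sample data, NOT the real contents of either image.
    old_sample = "Flask==0.0.1\npsycopg2-binary==0.0.1\nPillow==0.0.1\n"
    new_sample = "Flask==0.0.1\n"
    print(dropped_packages(old_sample, new_sample))
```

Comparing by name only keeps the sketch short. A package that is present in both images but at a different version would need a second pass over the parsed dictionaries.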

It makes sense to me that these dependencies would need to be manually added to a lean image, and I like the idea of larger images as well, especially for people trying out Superset for the first time. Missing dependencies can be a turn-off!

mistercrunch commented 7 months ago

Thanks for reporting it. Maybe the base images pointed to the dev layer before I untangled some of this, meaning you had many more things in there before.

I'm tempted to create a fit layer with more libraries that people are likely to need in production.