apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

ModuleNotFoundError: No module named 'sqlglot' error across many containers #26997

Closed andrekef closed 6 months ago

andrekef commented 7 months ago

Bug description

Working on a 2023 Macbook Air M2

I followed the steps in the docs here https://superset.apache.org/docs/installation/installing-superset-using-docker-compose/ (which I believe need updating), and I am unable to get a stable build for many of the containers under the main superset container.

Skipping local overrides
Starting web app (using development server)...
Skipping local overrides
Starting web app (using development server)...
Skipping local overrides
Starting web app (using development server)...
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Usage: flask run [OPTIONS]
Try 'flask run --help' for help.

Error: While importing 'superset.app', an ImportError was raised:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/cli.py", line 218, in locate_app
    __import__(module_name)
  File "/app/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/app/superset/app.py", line 24, in <module>
    from superset.initialization import SupersetAppInitializer
  File "/app/superset/initialization/__init__.py", line 35, in <module>
    from superset.extensions import (
  File "/app/superset/extensions/__init__.py", line 30, in <module>
    from superset.async_events.async_query_manager import AsyncQueryManager
  File "/app/superset/async_events/async_query_manager.py", line 26, in <module>
    from superset.utils.core import get_user_id
  File "/app/superset/utils/core.py", line 90, in <module>
    from superset.sql_parse import sanitize_clause
  File "/app/superset/sql_parse.py", line 29, in <module>
    from sqlglot import exp, parse, parse_one
ModuleNotFoundError: No module named 'sqlglot'

The error above loops endlessly.

How to reproduce the bug

  1. git clone https://github.com/apache/superset.git
  2. cd superset
  3. open docker-compose.yml
  4. paste platform: linux/x86_64/v8 under each superset container - Please include in your wiki for arm64 users, as this will save folks a lot of time.

E.g.

  superset:
    platform: linux/x86_64/v8
    ...
  superset-websocket:
    platform: linux/amd64
    ...
  superset-init:
    platform: linux/x86_64/v8
    ...
  superset-node:
    platform: linux/x86_64/v8
    ...
  superset-worker:
    platform: linux/x86_64/v8
    ...
  superset-worker-beat:
    platform: linux/x86_64/v8
    ...
  superset-tests-worker:
    platform: linux/x86_64/v8

  5. run docker compose up --build -d
  6. Notice the ModuleNotFoundError

Screenshots/recordings

Screenshot 2024-02-02 at 1 15 27 PM

Superset version

master / latest-dev

Python version

3.9

Node version

18 or greater

Browser

Chrome

Additional context

No response

vikramwalia commented 7 months ago

+1 having the same exact issue.

louisenguyen2203 commented 7 months ago
BigDanTheOne commented 7 months ago

Same

andrekef commented 7 months ago

@michael-s-molina is there a version you would recommend rolling Superset back to so that arm64 machines can build it fully in Docker? FWIW, we are all building Superset for development, not production.

stefanamaral commented 7 months ago

Same goes for me, and I followed exactly the same steps. I'm trying to push this in my company, but it is hard when the local setup just doesn't work ...

MariaJSanchezD commented 7 months ago

I have the exact same issue after trying everything to solve the compatibility error with M1

ogr-git commented 7 months ago

same issue on Amazon x86-64 EC2 instance with Ubuntu

Virtualization: amazon
Operating System: Ubuntu 22.04.3 LTS
          Kernel: Linux 6.2.0-1018-aws
    Architecture: x86-64
 Hardware Vendor: Amazon EC2
  Hardware Model: t3.large

rusackas commented 7 months ago

Pinging @betodealmeida and @john-bodley since it sounds related to their recent consolidation efforts.

xiaoshan1213 commented 7 months ago

+1, is there any rollback branch or commit we can use for now?

SbstnErhrdt commented 7 months ago

I resolved the issues by checking out the latest stable release

My system:

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-92-generic x86_64)

Steps

git clone https://github.com/apache/superset.git
cd superset
git checkout tags/3.1.0
docker compose up

If all runs smoothly

docker compose up -d

andrekef commented 7 months ago

@SbstnErhrdt Thanks for your reply. You did not specify what machine you are using.

superset-websocket@0.0.1 start
node dist/index.js start

config.json file not found
{
  "date": "Wed Feb 07 2024 16:00:22 GMT+0000 (Coordinated Universal Time)",
  "error": {},
  "exception": true,
  "level": "error",
  "message": "uncaughtException: Please provide a JWT secret at least 32 bytes long\nError: Please provide a JWT secret at least 32 bytes long\n at Object.<anonymous> (/home/superset-websocket/dist/index.js:76:11)\n at Module._compile (node:internal/modules/cjs/loader:1198:14)\n at Object.Module._extensions..js (node:internal/modules/cjs/loader:1252:10)\n at Module.load (node:internal/modules/cjs/loader:1076:32)\n at Function.Module._load (node:internal/modules/cjs/loader:911:12)\n at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)\n at node:internal/main/run_main_module:22:47",
  "os": {"loadavg": [29.07, 19.59, 13.16], "uptime": 87469.63},
  "process": {
    "argv": ["/usr/local/bin/node", "/home/superset-websocket/dist/index.js", "start"],
    "cwd": "/home/superset-websocket",
    "execPath": "/usr/local/bin/node",
    "gid": 1000,
    "memoryUsage": {"arrayBuffers": 74962, "external": 948746, "heapTotal": 18386944, "heapUsed": 15244176, "rss": 0},
    "pid": 22,
    "uid": 1000,
    "version": "v16.20.2"
  },
  "stack": "Error: Please provide a JWT secret at least 32 bytes long\n at Object.<anonymous> (/home/superset-websocket/dist/index.js:76:11)\n at Module._compile (node:internal/modules/cjs/loader:1198:14)\n at Object.Module._extensions..js (node:internal/modules/cjs/loader:1252:10)\n at Module.load (node:internal/modules/cjs/loader:1076:32)\n at Function.Module._load (node:internal/modules/cjs/loader:911:12)\n at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)\n at node:internal/main/run_main_module:22:47",
  "trace": [
    {"column": 11, "file": "/home/superset-websocket/dist/index.js", "function": null, "line": 76, "method": null, "native": false},
    {"column": 14, "file": "node:internal/modules/cjs/loader", "function": "Module._compile", "line": 1198, "method": "_compile", "native": false},
    {"column": 10, "file": "node:internal/modules/cjs/loader", "function": "Module._extensions..js", "line": 1252, "method": ".js", "native": false},
    {"column": 32, "file": "node:internal/modules/cjs/loader", "function": "Module.load", "line": 1076, "method": "load", "native": false},
    {"column": 12, "file": "node:internal/modules/cjs/loader", "function": "Module._load", "line": 911, "method": "_load", "native": false},
    {"column": 12, "file": "node:internal/modules/run_main", "function": "Function.executeUserEntryPoint [as runMain]", "line": 81, "method": "executeUserEntryPoint [as runMain]", "native": false},
    {"column": 47, "file": "node:internal/main/run_main_module", "function": null, "line": 22, "method": null, "native": false}
  ]
}
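
The log above shows the websocket service refusing to start without a JWT secret of at least 32 bytes. A minimal Python sketch for generating one follows. The `jwtSecret` key name and the idea of placing it in the websocket service's config.json are assumptions not confirmed in this thread, so check the config file shipped with your checkout.

```python
import json
import secrets

MIN_SECRET_BYTES = 32  # minimum length quoted in the error message


def make_jwt_secret(num_random_bytes: int = 48) -> str:
    """Return a URL-safe random secret built from `num_random_bytes` random bytes."""
    if num_random_bytes < MIN_SECRET_BYTES:
        raise ValueError(f"need at least {MIN_SECRET_BYTES} random bytes")
    # token_urlsafe base64-encodes the bytes, so the text is never shorter than the input.
    return secrets.token_urlsafe(num_random_bytes)


def is_long_enough(secret: str) -> bool:
    """Check the encoded length against the 32-byte minimum."""
    return len(secret.encode("utf-8")) >= MIN_SECRET_BYTES


def render_config(secret: str) -> str:
    """Render a config fragment; the "jwtSecret" key name is an assumption."""
    return json.dumps({"jwtSecret": secret}, indent=2)
```

Calling `render_config(make_jwt_secret())` prints a fragment that could be merged into the config file by hand; the secret should not be committed to the repository.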

ricokali96 commented 7 months ago

Same problem here

vikramwalia commented 7 months ago

Not resolved yet! What @SbstnErhrdt suggested does not work for me.

akshayjain3450 commented 7 months ago

Can we expect an update on this soon?

yashagv commented 7 months ago

The issue still persists. The Superset team really needs to sort it out, as everyone clones from master.

@SbstnErhrdt tags/3.1.0 didn't work for me.

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

SbstnErhrdt commented 7 months ago

@yashagv @andrekef @vikramwalia

my system is

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-92-generic x86_64)

akshayjain3450 commented 7 months ago

The config.json file issue with WebSocket is in master as well. It's not just related to tags/3.1.0. The workaround for the ModuleNotFoundError, for now, is to create a requirements-local.txt file in the ./docker directory containing the line sqlglot==20.8.0. If you then run docker-compose again, this error should be gone. @yashagv @vikramwalia @ricokali96 @andrekef
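
The workaround above can be scripted. This is a minimal Python sketch, assuming it is run from the root of the cloned repository; the helper name `ensure_requirement` is hypothetical. It appends the pin only if it is missing, so lines already in the file (for example another driver package) are kept.

```python
from pathlib import Path


def ensure_requirement(path: Path, requirement: str) -> bool:
    """Append `requirement` to the file at `path` unless it is already listed.

    Returns True if the file was modified.
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if requirement in (line.strip() for line in existing):
        return False
    existing.append(requirement)
    # Create ./docker if it is missing, then rewrite the file with the new line.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(existing) + "\n")
    return True


# Example call, using the path and pin from the comment above:
# ensure_requirement(Path("docker/requirements-local.txt"), "sqlglot==20.8.0")
```

After the file changes, the containers need to be restarted so that the local overrides are installed again.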

@rusackas We are missing the dependency in the docker-image being used by the docker-compose. The image seems to be old and does not have the additional changes of base.txt. This needs to be fixed.

yashagv commented 7 months ago

Thanks @akshayjain3450. Adding sqlglot==20.8.0 resolved the 'sqlglot' error.

All containers are up, but "http://localhost:8088/superset/welcome/" keeps loading forever. When I set up port forwarding, http://localhost:8089/static/assets/images/favicon.png shows me the favicon. I am also unable to connect to the database on port 5432.

Are you able to run it? Can you suggest something?

akshayjain3450 commented 7 months ago

> Thanks @akshayjain3450. Adding sqlglot==20.8.0 resolved the 'sqlglot' error.
>
> All containers are up, but "http://localhost:8088/superset/welcome/" keeps loading forever. When I set up port forwarding, http://localhost:8089/static/assets/images/favicon.png shows me the favicon. I am also unable to connect to the database on port 5432.
>
> Are you able to run it? Can you suggest something?

I found one more error, in the postgres container: it is looking for a "test" database that does not exist. I added a CREATE DATABASE test command to ./docker/examples-init.sh and tried again. I am still confirming whether the UI opens.

Two of my containers are still down even after the module fix:

  1. superset_websocket with error: config.json file not found
  2. superset_tests_worker with error:

     2024-02-08 16:45:28 2024-02-08 11:15:28,136:ERROR:flask_appbuilder.security.sqla.manager:DB Creation and initialization failed: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 5432 failed: Connection refused
     2024-02-08 16:45:28 Is the server running on that host and accepting TCP/IP connections?
     2024-02-08 16:45:28 connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL: database "test" does not exist

Also, I am not able to log in with username: admin and password: admin.

I need help getting this working.

betodealmeida commented 7 months ago

I'm not familiar with how we're building the Docker images, nor with Snarf — does it just mirror an image from Dockerhub?

I was able to repro, and the solution proposed by @akshayjain3450 worked for me, so that seems like an easy workaround for now.

@rusackas do you know how to build and publish a new docker image?

rusackas commented 7 months ago

I think @mistercrunch has the most relevant docker-fu here.

Regarding Scarf (Gateway), if that's what you mean, it's just a proxy, more so than a mirror. It just passes the request through directly to dockerhub and clicks a counter along the way ;)

mistercrunch commented 7 months ago

This should help at least with the confusion around having to change image targets for local arm64 development work on newer Apple silicon -> https://github.com/apache/superset/pull/27055

rajdeepUWE commented 6 months ago

Is the solution the same for Windows 10 users?

mistercrunch commented 6 months ago

Screenshot 2024-02-12 at 6 21 56 PM

Can't recreate. Can someone who had the issue confirm whether it's fixed by now?

I tried:

docker-compose pull
docker-compose up

Also

docker compose -f docker-compose-non-dev.yml pull
docker compose -f docker-compose-non-dev.yml up

And also

git checkout 3.0.0
TAG=3.0.0 docker compose -f docker-compose-non-dev.yml pull
TAG=3.0.0 docker compose -f docker-compose-non-dev.yml up

All seemed to work on a recent Macbook M2. I hit localhost:8088 and things were snappy

vikramwalia commented 6 months ago

Still running into issues, it is not even getting to a point where the containers are created. I am following documentation for a 2 min setup. I am going to test this out by creating an .env file at the location below and testing. This is x86 / Ubuntu 22.04.

docker compose up
WARN[0000] The "SCARF_ANALYTICS" variable is not set. Defaulting to a blank string.
WARN[0000] The "CYPRESS_CONFIG" variable is not set. Defaulting to a blank string.
WARN[0000] The "CYPRESS_CONFIG" variable is not set. Defaulting to a blank string.
env file /home/superset/docker/.env not found: stat /home/superset/docker/.env: no such file or directory

rajdeepUWE commented 6 months ago

The only error I am facing is the sqlglot error, in a loop.


stefanamaral commented 6 months ago

@rajdeepUWE follow the answer from @akshayjain3450 about creating a requirements-local with the sqlglot.

rajdeepUWE commented 6 months ago

> @rajdeepUWE follow the answer from @akshayjain3450 about creating a requirements-local with the sqlglot.

I tried this. Let me get this straight: I need to create a .txt file named requirements-local, write sqlglot==20.8.0 in it, and save it in .docker/ or in the Docker directory inside superset (superset/Docker/)? I tried both and I am facing the same issue.

mistercrunch commented 6 months ago

mmmh, wondering why I can't recreate here on my local... Are you all on latest master?

sqlglot is properly referenced here -> https://github.com/apache/superset/blob/master/requirements/base.txt#L345 and this file is mounted/referenced in the Dockerfile -> https://github.com/apache/superset/blob/master/Dockerfile#L84-L86

There's a bit of a jumparoo here where requirements files point to one another. The chain goes requirements/local.txt -> requirements/development.txt -> requirements/base.txt (where sqlglot is referenced).
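
To check where a package ends up in a chain like this, one can follow the `-r` includes by hand. Below is a minimal Python sketch of that walk. The helper `find_requirement` is hypothetical, it only understands plain `-r file` / `--requirement file` lines and simple `name==version` style entries, and it has not been run against the actual Superset requirements files.

```python
from pathlib import Path
from typing import Optional, Set


def find_requirement(start: Path, package: str,
                     _seen: Optional[Set[Path]] = None) -> Optional[Path]:
    """Return the first file in the `-r` include chain that lists `package`."""
    seen = _seen if _seen is not None else set()
    start = start.resolve()
    if start in seen or not start.exists():
        return None
    seen.add(start)
    for raw in start.read_text().splitlines():
        # Drop trailing comments (a simplification that ignores '#' inside URLs).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] in ("-r", "--requirement") and len(parts) == 2:
            # Included paths are resolved relative to the including file.
            found = find_requirement(start.parent / parts[1].strip(), package, seen)
            if found is not None:
                return found
        elif line.lower().startswith(package.lower()):
            rest = line[len(package):]
            # Accept the match only if the package name ends here.
            if not rest or rest[0] in "=<>!~[; ":
                return start
    return None
```

For the chain described above, `find_requirement(Path("requirements/local.txt"), "sqlglot")` would be expected to return the path of the file that actually pins the package, or `None` if the checked-out branch does not reference it at all.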

mistercrunch commented 6 months ago

Wondering if Docker caching could be an issue here, where say base.txt changed, but the layer is cached because Docker doesn't think the file has changed. But from my understanding of how the Docker cache works, if any of the mounted files changes, it's part of the cache key and will invalidate the cache.
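
The reasoning above can be illustrated with a toy model in Python. This is not Docker's actual algorithm, only a sketch of the idea that a layer's cache key depends on the content of its input files; the function name `layer_cache_key` is hypothetical.

```python
import hashlib
from typing import Mapping


def layer_cache_key(instruction: str, files: Mapping[str, bytes]) -> str:
    """Toy cache key: hash of the instruction text plus each input file's path and content."""
    digest = hashlib.sha256(instruction.encode("utf-8"))
    for path in sorted(files):  # sorted so the key does not depend on dict order
        digest.update(path.encode("utf-8"))
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()
```

In this model, editing requirements/base.txt changes the key of any layer that takes the file as input, so a stale cached layer would not be reused.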

akshayjain3450 commented 6 months ago

> https://github.com/apache/superset/blob/master/requirements/base.txt#L345

Did we have this dependency in the last stable release, 3.1.0, @mistercrunch? After I check out that branch, I get an error building the image. So there seems to be a use of it there, but I could not find this dependency in branch 3.1.0. Can you tell me what I am missing here? How does the released public Docker image work when our custom images, built without any changes, do not?

mistercrunch commented 6 months ago

I don't see any reference of sqlglot when I checkout 3.1.0

$ git checkout 3.1.0
HEAD is now at 0cd2431989 bringin latest from master Dockerfile to allow for multi-platform builds
$ git grep -i sqlglot

Sajawalgujjar381 commented 6 months ago

I am getting the same error. Has anyone resolved it?

andrekef commented 6 months ago

UPDATE: I am still unable to get this to build in docker nice and clean for M2 machines. 2 other co-workers replicated my issue, also with M1 and M2 machines.

What worked for me and got Superset running again was creating a venv and following the steps here: https://superset.apache.org/docs/installation/installing-superset-from-scratch/, which, by the way, could use some extra clarification for new developers on things such as:

Shivangini-G commented 6 months ago

I faced the same issue while running docker compose up -d. So instead I ran docker compose -f docker-compose-non-dev.yml up -d, and it worked for me.

butuzov commented 6 months ago

Hi, you might want to install it yourself (by editing .superset/docker/requirements-local.txt).

This dependency was cherry-picked and added to master; however, you are all still using outdated images (for example, I use an image based on the 3.1.0 multi-platform Docker image).

+sqlglot==20.8.0

Cheers

Sajawalgujjar381 commented 6 months ago

Thank you so much. That also worked for me. Thank you.

vikramwalia commented 6 months ago

This only solves the initial install. The moment you change anything, for example pip install snowflake-sqlalchemy, it stops working again with ModuleNotFoundError: No module named 'sqlglot'.

Error: While importing 'superset.app', an ImportError was raised:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/cli.py", line 218, in locate_app
    __import__(module_name)
  File "/app/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/app/superset/app.py", line 24, in <module>
    from superset.initialization import SupersetAppInitializer
  File "/app/superset/initialization/__init__.py", line 35, in <module>
    from superset.extensions import (
  File "/app/superset/extensions/__init__.py", line 30, in <module>
    from superset.async_events.async_query_manager import AsyncQueryManager
  File "/app/superset/async_events/async_query_manager.py", line 26, in <module>
    from superset.utils.core import get_user_id
  File "/app/superset/utils/core.py", line 90, in <module>
    from superset.sql_parse import sanitize_clause
  File "/app/superset/sql_parse.py", line 29, in <module>
    from sqlglot import exp, parse, parse_one
ModuleNotFoundError: No module named 'sqlglot'
Installing local overrides at /app/docker/requirements-local.txt
Requirement already satisfied: snowflake-sqlalchemy in /usr/local/lib/python3.9/site-packages (from -r /app/docker/requirements-local.txt (line 1)) (1.5.1)

mistercrunch commented 6 months ago

Digging a bit into this, I understand some things, but I'm not 100% clear on how this is set up or exactly how things are supposed to work here. But a few things I know:

docker-compose.yml has this ->

x-superset-image: &superset-image apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}

Which points to latest-dev which right now is 3.1.0 which does not have sqlglot. So if you git checkout 3.1.0 you can docker-compose up and things line up.
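
The `${TAG:-latest-dev}` part of that image reference uses shell-style default substitution: the value of `TAG` is used if it is set and non-empty, otherwise `latest-dev`. A minimal Python sketch of that one rule follows; the helper `expand_defaults` is hypothetical and handles only the `${NAME:-default}` form, not the rest of Compose's interpolation syntax.

```python
import re
from typing import Mapping

# Matches ${NAME:-default}; other interpolation forms are deliberately ignored.
_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")

IMAGE = "apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}"


def expand_defaults(template: str, env: Mapping[str, str]) -> str:
    """Expand `${NAME:-default}`: use env[NAME] if set and non-empty, else the default."""
    def repl(match: "re.Match[str]") -> str:
        return env.get(match.group("name")) or match.group("default")
    return _PATTERN.sub(repl, template)
```

With an empty environment this resolves to the `latest-dev` tag, which is why a plain `docker compose up` pulls that image, while `TAG=3.0.0 docker compose ... up` resolves to the `3.0.0` tag instead.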

Pointing to master-dev seems more appropriate and much more likely to work, but if you're on arm, we don't build that particular variant these days.

With this PR -> https://github.com/apache/superset/pull/27146 we'll be having multi-platform build for master, so that would make things work better.

Though I'm not sure what a normal setup for docker-compose is here and how the repo is supposed to line up with the images. My best bet is that recent master should work on top of a recent image built off of master (as in the master-dev tag), but that seems fragile: there's no guarantee that any particular SHA will match the latest image. I guess if you have a freshly rebased branch and we have a fresh image, things should line up most of the time.

sfirke commented 6 months ago

I see several people here are running the command docker compose up -- my understanding is that if no compose file is specified in this command, Docker defaults to using a compose file called docker-compose.yml if one is present. On the Superset project this points to a potentially unstable master branch release that should be used for development, not production.

In the install docs we tell people to run docker compose -f docker-compose-non-dev.yml up so that they get a stable / official release image, not the cutting-edge master branch build.

I wonder if we renamed the two files like this:

Would it result in a better experience for new users, because then the default is a stable image? non-dev always sounded clunky to me anyway.

rusackas commented 6 months ago

Interesting question @sfirke - might be a good listserv or town hall question. As a contributor, I prefer to run "dev mode" so the default is sensible. You might be right that a "stable" version is the more sensible default for docker compose, but I'm not certain of it... the current behavior probably helps us find bugs on master a lot faster ;)

sfirke commented 6 months ago

> the current behavior probably helps us find bugs on master a lot faster ;)

Quite true! You can tell when master breaks as people come flooding into Slack reporting the same problem. But I don't think using new users as test subjects is probably good for the long-term health of the project. I expect for every person who reports a GitHub issue about broken master branch, many more give up silently and say Superset is not ready for production.

I will put it on the town hall list, good idea.

mistercrunch commented 6 months ago

My expectation (and I think the common use) would be for docker-compose to build off the current branch using the local Dockerfile while mounting the local files so that development can be done. Changing the Python files in the repo should change the app, and it assumes you're running 'npm run dev' to build the JS assets.

Now for the other "non-dev" use case, you'd build the actual files, no mounts, respecting the current Dockerfile.

So all of this would be deterministic and respect the local branch/Dockerfile, with none of the divergence from previous layers that we have now.

mistercrunch commented 6 months ago

Screenshot_20240222_162159_ChatGPT.jpg

In terms of renaming the files, here's what I would suggest:

sfirke commented 6 months ago

It sounds like both of those require git and target the master branch/latest commit. Yes we need to meet developer needs there.

But IMO to make it as easy as possible to deploy Superset this way for new users, there should be a docker compose workflow that does not require cloning the repo and points at a stable release. That is how Airflow instructs people to deploy with docker compose, their steps are:

mistercrunch commented 6 months ago

> But IMO to make it as easy as possible to deploy Superset this way for new users, there should be a docker compose workflow that does not require cloning the repo and points at a stable release.

From my understanding, docker-compose shouldn't be used to productionize applications like Superset and exists primarily to support developer workflows. Helm is the tech that would provide more of the guarantees needed there. Helm should absolutely point to latest-type images by default, but it feels like docker-compose needs to build deterministically off the current branch. I guess we could have a docker-compose-latest.yml that points to that image, but then it wouldn't mount anything or be related to the current branch in any other way than that file itself.

Sajawalgujjar381 commented 6 months ago

> I faced the same issue while running docker compose up -d So instead I did docker compose -f docker-compose-non-dev.yml up -d and it worked for me.

Yes, I tried your method and it really worked for me also.

geido commented 6 months ago

Thanks everybody. Closing this one for now as we continue validating ways to improve our docker compose setup.

mistercrunch commented 6 months ago

Quick note that we met with the devx sub-team and I said I'd pick up doing a set of improvements for the docker-compose workflows addressing the core issues here. Main idea is referencing the branch's Dockerfile as opposed to a baked image like we do now.