apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Windows 10 WSL2 Ubuntu 20.04 LTS $ docker-compose-non-dev.yml up (failed) #16922

Closed JimCallahanOrlando closed 1 year ago

JimCallahanOrlando commented 3 years ago

Windows 10 WSL2 Ubuntu 20.04 LTS $ docker-compose-non-dev.yml up (failed)

How to reproduce the bug

Followed the instructions in "Installing Superset Locally Using Docker Compose", which say: "Docker Desktop recently added support for Windows Subsystem for Linux (WSL) 2, which may be another option." https://superset.apache.org/docs/installation/installing-superset-using-docker-compose

  1. Windows 10 / wsl2 / Ubuntu 20.04 LTS
  2. From Windows Terminal
  3. Ubuntu 20.04 tab bash shell
  4. git clone https://github.com/apache/superset.git
  5. cd superset
  6. docker-compose -f docker-compose-non-dev.yml up

Expected results

"You should see a wall of logging output from the containers being launched on your machine. Once this output slows, you should have a running instance of Superset on your local machine!"

Actual results

Continuous error loop running in the Windows Terminal Ubuntu 20.04 tab. From a second Ubuntu 20.04 tab, I captured the "docker ps" container list and the environment (edited from neofetch).

Screenshots

File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
superset_worker | cursor.execute(statement, parameters)
superset_worker | sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "report_schedule" does not exist
superset_worker | LINE 2: FROM report_schedule
superset_worker | ^
superset_worker |
superset_worker | [SQL: SELECT report_schedule.created_on AS report_schedule_created_on, report_schedule.changed_on AS report_schedule_changed_on, report_schedule.id AS report_schedule_id, report_schedule.type AS report_schedule_type, report_schedule.name AS report_schedule_name, report_schedule.description AS report_schedule_description, report_schedule.context_markdown AS report_schedule_context_markdown, report_schedule.active AS report_schedule_active, report_schedule.crontab AS report_schedule_crontab, report_schedule.creation_method AS report_schedule_creation_method, report_schedule.timezone AS report_schedule_timezone, report_schedule.report_format AS report_schedule_report_format, report_schedule.sql AS report_schedule_sql, report_schedule.chart_id AS report_schedule_chart_id, report_schedule.dashboard_id AS report_schedule_dashboard_id, report_schedule.database_id AS report_schedule_database_id, report_schedule.last_eval_dttm AS report_schedule_last_eval_dttm, report_schedule.last_state AS report_schedule_last_state, report_schedule.last_value AS report_schedule_last_value, report_schedule.last_value_row_json AS report_schedule_last_value_row_json, report_schedule.validator_type AS report_schedule_validator_type, report_schedule.validator_config_json AS report_schedule_validator_config_json, report_schedule.log_retention AS report_schedule_log_retention, report_schedule.grace_period AS report_schedule_grace_period, report_schedule.working_timeout AS report_schedule_working_timeout, report_schedule.created_by_fk AS report_schedule_created_by_fk, report_schedule.changed_by_fk AS report_schedule_changed_by_fk
superset_worker | FROM report_schedule
superset_worker | WHERE report_schedule.active IS true]
superset_worker | (Background on this error at: http://sqlalche.me/e/13/f405)
superset_app | 127.0.0.1 - - [30/Sep/2021:14:07:05 +0000] "GET /health HTTP/1.1" 200 2 "-" "curl/7.64.0"

Environment

(please complete the following information):

Commands were run from the WSL2 Ubuntu bash command line, without entering a specific container (which container should they be run from?).

Command 'python' not found, did you mean:

command 'python3' from deb python3
command 'python' from deb python-is-python3

Command 'node' not found, but can be installed with:

sudo apt install nodejs

$ docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED        STATUS                     PORTS                                       NAMES
17e3a42c3496   apache/superset:latest-dev   "/usr/bin/docker-ent…"   16 hours ago   Up 16 hours (healthy)      0.0.0.0:8088->8088/tcp, :::8088->8088/tcp   superset_app
53a11da454e0   apache/superset:latest-dev   "/usr/bin/docker-ent…"   16 hours ago   Up 16 hours (unhealthy)    8088/tcp                                    superset_worker
e208351c7e4b   apache/superset:latest-dev   "/usr/bin/docker-ent…"   16 hours ago   Up 16 hours (unhealthy)    8088/tcp                                    superset_worker_beat
bc84f45637be   redis:latest                 "docker-entrypoint.s…"   16 hours ago   Up 16 hours                6379/tcp                                    superset_cache
fbc72e5f019f   postgres:10                  "docker-entrypoint.s…"   16 hours ago   Up 16 hours                5432/tcp                                    superset_db

$ docker -v
Docker version 20.10.8, build 3967b7d

OS: Ubuntu 20.04.3 LTS on Windows 10 x86_64
Kernel: 5.10.16.3-microsoft-standard-WSL2
Uptime: 8 days, 17 hours, 26 mins
Packages: 719 (dpkg)
Shell: bash 5.0.17
Terminal: /dev/pts/4
CPU: Intel i7-8550U (8) @ 1.991GHz
Memory: 2584MiB / 12692MiB
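
The `docker ps` listing above shows two containers flagged as unhealthy. A minimal sketch for picking those out programmatically follows. It assumes the listing is produced with `docker ps --format '{{.Names}}\t{{.Status}}'` (tab-separated name and status) instead of the default table, and the function name `unhealthy_containers` is made up for this example.

```python
from typing import List


def unhealthy_containers(ps_output: str) -> List[str]:
    """Return the names of containers whose status contains '(unhealthy)'.

    Expects one container per line, name and status separated by a tab.
    """
    names = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, status = line.partition("\t")
        if "(unhealthy)" in status:
            names.append(name.strip())
    return names


# Names and statuses copied from the listing in this report.
sample = "\n".join([
    "superset_app\tUp 16 hours (healthy)",
    "superset_worker\tUp 16 hours (unhealthy)",
    "superset_worker_beat\tUp 16 hours (unhealthy)",
    "superset_cache\tUp 16 hours",
    "superset_db\tUp 16 hours",
])

print(unhealthy_containers(sample))
```

With the statuses from this report, only the worker and beat containers are returned, which matches where the error loop appears.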




JimCallahanOrlando commented 3 years ago

Here is a more complete command log:

$ docker-compose -f docker-compose-non-dev.yml up
[+] Running 7/7
 ⠿ Network superset_default Created 1.1s
 ⠿ Container superset_cache Created 2.5s
 ⠿ Container superset_db Created 2.4s
 ⠿ Container superset_app Created 2.6s
 ⠿ Container superset_worker Created 2.7s
 ⠿ Container superset_worker_beat Created 2.7s
 ⠿ Container superset_init Created 2.6s
Attaching to superset_app, superset_cache, superset_db, superset_init, superset_worker, superset_worker_beat
superset_db |
superset_db | PostgreSQL Database directory appears to contain a database; Skipping initialization
superset_db |
superset_cache | 1:C 01 Oct 2021 16:40:50.554 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
superset_cache | 1:C 01 Oct 2021 16:40:50.555 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
superset_cache | 1:C 01 Oct 2021 16:40:50.555 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
superset_cache | 1:M 01 Oct 2021 16:40:50.556 monotonic clock: POSIX clock_gettime
superset_cache | 1:M 01 Oct 2021 16:40:50.560 Running mode=standalone, port=6379.
superset_cache | 1:M 01 Oct 2021 16:40:50.560 # Server initialized
superset_cache | 1:M 01 Oct 2021 16:40:50.560 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
superset_cache | 1:M 01 Oct 2021 16:40:50.583 Loading RDB produced by version 6.2.5
superset_cache | 1:M 01 Oct 2021 16:40:50.584 RDB age 72121 seconds
superset_cache | 1:M 01 Oct 2021 16:40:50.584 RDB memory usage when created 8.99 Mb
superset_cache | 1:M 01 Oct 2021 16:40:51.826 DB loaded from disk: 1.243 seconds
superset_cache | 1:M 01 Oct 2021 16:40:51.826 * Ready to accept connections
superset_db | 2021-10-01 16:40:51.951 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
superset_db | 2021-10-01 16:40:51.951 UTC [1] LOG: listening on IPv6 address "::", port 5432
superset_db | 2021-10-01 16:40:52.305 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
superset_db | 2021-10-01 16:40:53.076 UTC [27] LOG: database system was shut down at 2021-09-30 20:38:50 UTC
superset_db | 2021-10-01 16:40:53.740 UTC [1] LOG: database system is ready to accept connections
superset_init | Skipping local overrides
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 1/4 [Starting] -- Applying DB migrations
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_worker_beat | Skipping local overrides
superset_worker_beat | Starting Celery beat...
superset_worker | Skipping local overrides
superset_worker | Starting Celery worker...
superset_app | Skipping local overrides
superset_app | Starting web app...
superset_app | [2021-10-01 16:41:04 +0000] [8] [INFO] Starting gunicorn 20.0.4
superset_app | [2021-10-01 16:41:04 +0000] [8] [INFO] Listening at: http://0.0.0.0:8088 (8)
superset_app | [2021-10-01 16:41:04 +0000] [8] [INFO] Using worker: gthread
superset_app | [2021-10-01 16:41:04 +0000] [11] [INFO] Booting worker with pid: 11
superset_app | logging was configured successfully
superset_app | 2021-10-01 16:41:10,802:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_app | 2021-10-01 16:41:10,813:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_worker_beat | logging was configured successfully
superset_worker_beat | 2021-10-01 16:41:10,813:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_worker | logging was configured successfully
superset_worker | 2021-10-01 16:41:10,839:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_worker_beat | 2021-10-01 16:41:10,842:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_worker | 2021-10-01 16:41:10,849:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_init | logging was configured successfully
superset_init | 2021-10-01 16:41:12,096:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_init | 2021-10-01 16:41:12,104:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_init | /usr/local/lib/python3.7/site-packages/flask_caching/__init__.py:202: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
superset_init | "Flask-Caching: CACHE_TYPE is set to null, "
superset_init | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
superset_init | INFO [alembic.runtime.migration] Will assume transactional DDL.
superset_worker_beat | /usr/local/lib/python3.7/site-packages/flask_caching/__init__.py:202: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
superset_worker_beat | "Flask-Caching: CACHE_TYPE is set to null, "
superset_worker_beat | [2021-10-01 16:41:16,721: INFO/MainProcess] beat: Starting...
superset_worker_beat | [2021-10-01 16:41:17,307: INFO/MainProcess] Scheduler: Sending due task reports.prune_log (reports.prune_log)
superset_worker_beat | [2021-10-01 16:41:17,518: INFO/MainProcess] Scheduler: Sending due task reports.scheduler (reports.scheduler)
superset_init | INFO [alembic.runtime.migration] Running upgrade -> 4e6a06bad7a8, Init
superset_worker | /usr/local/lib/python3.7/site-packages/flask_caching/__init__.py:202: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
superset_worker | "Flask-Caching: CACHE_TYPE is set to null, "
superset_worker | /usr/local/lib/python3.7/site-packages/celery/platforms.py:801: RuntimeWarning: You're running the worker with superuser privileges: this is
superset_worker | absolutely not recommended!
superset_worker |
superset_worker | Please specify a different user using the --uid option.
superset_worker |
superset_worker | User information: uid=0 euid=0 gid=0 egid=0
superset_worker |
superset_worker | uid=uid, euid=euid, gid=gid, egid=egid,
superset_worker | [2021-10-01 16:41:19,079: INFO/MainProcess] Connected to redis://redis:6379/0
superset_worker | [2021-10-01 16:41:19,087: INFO/MainProcess] mingle: searching for neighbors
superset_worker | [2021-10-01 16:41:20,107: INFO/MainProcess] mingle: all alone
superset_worker | [2021-10-01 16:41:20,132: INFO/MainProcess] celery@c785757d9879 ready.
superset_worker | [2021-10-01 16:41:20,138: INFO/MainProcess] Received task: reports.prune_log[8ee96b8a-643e-42c2-ba47-bbed7efb56c6]
superset_worker | [2021-10-01 16:41:20,141: INFO/MainProcess] Received task: reports.scheduler[3ad19f0e-23d7-4322-bbcd-8558baa19f9e]
superset_db | 2021-10-01 16:41:20.243 UTC [39] ERROR: relation "report_schedule" does not exist at character 1627
superset_db | 2021-10-01 16:41:20.243 UTC [39] STATEMENT: SELECT report_schedule.created_on AS report_schedule_created_on, report_schedule.changed_on AS report_schedule_changed_on, report_schedule.id AS report_schedule_id, report_schedule.type AS report_schedule_type, report_schedule.name AS report_schedule_name, report_schedule.description AS report_schedule_description, report_schedule.context_markdown AS report_schedule_context_markdown, report_schedule.active AS report_schedule_active, report_schedule.crontab AS report_schedule_crontab, report_schedule.creation_method AS report_schedule_creation_method, report_schedule.timezone AS report_schedule_timezone, report_schedule.report_format AS report_schedule_report_format, report_schedule.sql AS report_schedule_sql, report_schedule.chart_id AS report_schedule_chart_id, report_schedule.dashboard_id AS report_schedule_dashboard_id, report_schedule.database_id AS report_schedule_database_id, report_schedule.last_eval_dttm AS report_schedule_last_eval_dttm, report_schedule.last_state AS report_schedule_last_state, report_schedule.last_value AS report_schedule_last_value, report_schedule.last_value_row_json AS report_schedule_last_value_row_json, report_schedule.validator_type AS report_schedule_validator_type, report_schedule.validator_config_json AS report_schedule_validator_config_json, report_schedule.log_retention AS report_schedule_log_retention, report_schedule.grace_period AS report_schedule_grace_period, report_schedule.working_timeout AS report_schedule_working_timeout, report_schedule.created_by_fk AS report_schedule_created_by_fk, report_schedule.changed_by_fk AS report_schedule_changed_by_fk
superset_db | FROM report_schedule
superset_db | 2021-10-01 16:41:20.246 UTC [40] ERROR: relation "report_schedule" does not exist at character 1627

[The same set of "report_schedule" does not exist error messages then repeats.]
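
When the same error scrolls past in a loop like this, it can help to reduce the log to the set of relations PostgreSQL reports as missing. The following is a rough sketch that assumes the log has been saved as text; the name `count_missing_relations` is invented for this example, and the pattern only covers the exact `relation "..." does not exist` wording seen in this thread.

```python
import re
from typing import Dict

# Matches the PostgreSQL message: relation "<name>" does not exist
MISSING_RELATION = re.compile(r'relation "([^"]+)" does not exist')


def count_missing_relations(log_text: str) -> Dict[str, int]:
    """Count how often each missing relation is reported in the log text."""
    counts: Dict[str, int] = {}
    for name in MISSING_RELATION.findall(log_text):
        counts[name] = counts.get(name, 0) + 1
    return counts


# Two lines copied from the log in this report.
sample_log = (
    'superset_db | 2021-10-01 16:41:20.243 UTC [39] ERROR: relation '
    '"report_schedule" does not exist at character 1627\n'
    'superset_db | 2021-10-01 16:41:20.246 UTC [40] ERROR: relation '
    '"report_schedule" does not exist at character 1627\n'
)

print(count_missing_relations(sample_log))
```

A single relation repeated many times points at one query being retried. Several different relations would suggest the schema was only partly created.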

srinify commented 3 years ago

hey there Jim Callahan, have you run the db migrations? I believe it's superset db upgrade

I will also mention that Superset isn't tested / confirmed as working on Windows (even with WSL) so just heads up
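
For anyone wanting to try the migration suggestion, here is a small sketch of running `superset db upgrade` inside the app container. The container name `superset_app` is taken from the `docker ps` listing earlier in this thread; the helper names and the dry-run default are assumptions made for this example. Note that the compose log above shows the `superset_init` container already attempting "Applying DB migrations" on its own.

```python
import subprocess
from typing import List


def db_upgrade_command(container: str = "superset_app") -> List[str]:
    """Build the argv for running the Superset DB migrations in a container."""
    return ["docker", "exec", container, "superset", "db", "upgrade"]


def run_db_upgrade(container: str = "superset_app", dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = db_upgrade_command(container)
    if dry_run:
        print(" ".join(cmd))
        return 0
    # check=False so a failing migration returns its exit code instead of raising.
    return subprocess.run(cmd, check=False).returncode


# Dry run: only prints the command that would be executed.
run_db_upgrade()
```

Calling `run_db_upgrade(dry_run=False)` would actually execute the command, so Docker and the running stack need to be available at that point.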

JimCallahanOrlando commented 3 years ago

I gave up on git/docker-compose and switched to the plain DockerHub image (latest), which worked fine. @srinify, I didn't make it to a Superset prompt; it died while setting up the worker thread.

The only issue so far is that Superset running from the Docker image does not report a version number (Settings/About/Version: 0.0.0dev), so I don't know whether features that YouTube videos associate with a specific version will work. https://hub.docker.com/r/apache/superset/tags?page=1&ordering=last_updated

JimCallahanOrlando commented 3 years ago

@srinify, I should clarify: the Superset Docker image works fine if you follow the step-by-step instructions on DockerHub (which include the database migration step).

I installed Docker Desktop on Windows 10, then ran "docker pull" from the WSL2 Ubuntu Linux command line, and was able to access Superset from the Chrome web browser running in Windows using the localhost URL.

esko22 commented 2 years ago

@srinify - Howdy! I am having the same issue as Jim, but on a Mac. I was able to open a terminal in the app container and run the superset db upgrade command you mentioned above. It produces the same errors I see when running compose up with either the dev or non-dev config.

There seems to be another issue upstream of the missing report_schedule relation.

  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "ab_permission_view_role" does not exist
LINE 2: FROM ab_role LEFT OUTER JOIN (ab_permission_view_role AS ab_...

[SQL: SELECT ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id 
FROM ab_role LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id]

I attached to the db instance, and it appears that the ab_permission_view_role table has not been created.

The following tables were created:

ab_permission
ab_permission_view
ab_register_user
ab_role
ab_user
ab_view_menu

esko22 commented 2 years ago

In case anyone else comes across this, looks like this issue falls square on me. I did not double check my docker memory setting as advised by the docs. I was allocating 4GB vs 6GB. It appears the db init crapped out and did not finish creating the schema, which is now 57 tables vs 6.

I changed the setting, nuked the db volume and restarted the stack successfully.
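
A rough sketch of that fix as a script follows, under a few assumptions. It assumes `docker info --format '{{.MemTotal}}'` prints the memory available to the Docker engine in bytes, and it uses `docker-compose down -v` to drop the volumes declared in the compose file (this deletes the database contents). The helper names and the 5.5 GiB threshold are made up for this example; the threshold sits a little under the 6GB mentioned above because the reported total is usually slightly below the configured allocation.

```python
from typing import List

BYTES_PER_GIB = 1024 ** 3


def has_enough_memory(mem_total_output: str, required_gib: float = 5.5) -> bool:
    """Check a byte count (as text) against the required amount of memory."""
    total_gib = int(mem_total_output.strip()) / BYTES_PER_GIB
    return total_gib >= required_gib


def reset_commands(compose_file: str = "docker-compose-non-dev.yml") -> List[List[str]]:
    """Commands to drop the stack with its volumes, then start it again."""
    return [
        ["docker-compose", "-f", compose_file, "down", "-v"],  # removes DB volume
        ["docker-compose", "-f", compose_file, "up"],
    ]


# Illustrative byte counts for roughly 4 GB and 8 GB allocations.
print(has_enough_memory("4125122560"))
print(has_enough_memory("8340774912"))
for cmd in reset_commands():
    print(" ".join(cmd))
```

The commands are only printed here; running them is left to the reader, since the first one is destructive.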

srinify commented 2 years ago

@esko22 AHH yeah hmm we added this to the documentation a while back to help avoid folks getting tripped up here! Glad this worked for you

martimors commented 2 years ago

I got this working on Windows by increasing the WSL2 memory allocation to 16GB, deleting all dangling images, resetting the Kubernetes cluster, deleting all volumes, and restarting the computer before installing again with Helm (all defaults).

JimCallahanOrlando commented 2 years ago

Options for .wslconfig
Section label: [wsl2] ...

key | value | default | notes
memory | size | 50% of total memory on Windows or 8GB, whichever is less; on builds before 20175: 80% of your total memory on Windows | How much memory to assign to the WSL 2 VM.

https://docs.microsoft.com/en-us/windows/wsl/wsl-config#options-for-wslconfig
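
To make the default concrete, the sketch below computes the WSL 2 memory default described in the `memory` row above (for builds 20175 and later) and renders a minimal `.wslconfig` that overrides it. The function names are invented for this example, and the 8GB value is only an illustration. As far as I know, the file belongs in the Windows user profile directory and takes effect after `wsl --shutdown`.

```python
import configparser
import io


def default_wsl2_memory_gb(total_windows_gb: float) -> float:
    """Default for newer builds: 50% of Windows RAM or 8 GB, whichever is less."""
    return min(total_windows_gb * 0.5, 8.0)


def render_wslconfig(memory_gb: int) -> str:
    """Render a .wslconfig with a [wsl2] section that sets the memory limit."""
    config = configparser.ConfigParser()
    config["wsl2"] = {"memory": f"{memory_gb}GB"}
    buffer = io.StringIO()
    # Write key=value without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(default_wsl2_memory_gb(16))
print(default_wsl2_memory_gb(8))
print(render_wslconfig(8))
```

On a machine with 8 GB of RAM the default works out to 4 GB, which is below the 6GB that worked earlier in this thread, so an explicit `memory` setting may be needed there.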

JimCallahanOrlando commented 2 years ago

Windows:

NOTE: Windows is currently not a supported environment for Superset installation.

For Windows users, the best option may be to install an Ubuntu Desktop VM via VirtualBox and proceed with the Docker on Linux instructions inside of that VM. [UPDATE: see .wslconfig post in GitHub]

It is recommended to assign at least 8GB of RAM to the virtual machine as well as provisioning a hard drive of at least 40GB, so that there will be enough space for both the OS and all of the required dependencies. https://apache-superset.readthedocs.io/en/latest/installation.html

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

rusackas commented 1 year ago

Closing this since a) The issue seems pretty stale b) The original author seems to have found a path forward (along with others on the thread) c) We don't officially support Windows, so support resources here are quite limited in scope.