ml-tooling / ml-workspace

🛠 All-in-one web-based IDE specialized for machine learning and data science.
https://mltooling.org/ml-workspace
Apache License 2.0

Jupyter crashes in workspace, workspace-r and workspace-spark, but not in workspace-minimal #71

Closed IamLunchbox closed 3 years ago

IamLunchbox commented 3 years ago

Bug: The workspace, workspace-r and workspace-spark flavors are unable to run because jupyter-notebook crashes with SIGILL. The workspace-minimal flavor works fine, though.

Expected behaviour: All flavors should run in Docker without difficulty, and Jupyter Notebook should not crash.

Steps to reproduce the issue / Technical details: I run ml-workspace in docker-compose with the following compose-file:

version: "3.2"
services:
  ml-workspace:
#   image: mltooling/ml-workspace-r:0.12.1
#   image: mltooling/ml-workspace-spark:0.12.1
    image: mltooling/ml-workspace-minimal:0.12.1 #works
    environment:
    - AUTHENTICATE_VIA_JUPYTER=
    - SHARED_LINKS_ENABLED=false
    - SHUTDOWN_INACTIVE_KERNELS=1800
    restart: unless-stopped
    volumes:
      - ml-workspace:/workspace
      - ./projects:/workspace/Projects
    ports:
      - 8121:8080
volumes:
  ml-workspace:

Running docker-compose up produces the following output:

[...] #everything starting fine
ml-workspace_1  | 2021/01/14 22:48:17 [error] 114#0: *6 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /login?next=%2Ftree%2FProjects HTTP/1.0", upstream: "http://127.0.0.1:8090/login?next=%2Ftree%2FProjects", host: "?.?.com"
ml-workspace_1  | 2021-01-14 22:48:18,697 INFO success: jupyter entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ml-workspace_1  | 2021-01-14 22:48:20,580 INFO exited: jupyter (terminated by SIGILL (core dumped); not expected)
ml-workspace_1  | 2021-01-14 22:48:21,583 INFO spawned: 'jupyter' with pid 814
ml-workspace_1  | 2021/01/14 22:48:21 [error] 114#0: *8 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.0", upstream: "http://127.0.0.1:8090/", host: "?.?.com"
ml-workspace_1  | 2021-01-14 22:48:22,958 INFO success: jupyter entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ml-workspace_1  | 2021-01-14 22:48:24,872 INFO exited: jupyter (terminated by SIGILL (core dumped); not expected)
ml-workspace_1  | 2021-01-14 22:48:25,874 INFO spawned: 'jupyter' with pid 819
#repeat forever
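A crash loop like the one above can be summarised from the captured output. The following is a minimal sketch, assuming the supervisor line format shown in the log; the function names `signal_exits` and `restart_intervals` are hypothetical helpers, not part of ml-workspace or supervisor.

```python
import re
from datetime import datetime

# Matches supervisor "exited" lines such as:
#   2021-01-14 22:48:20,580 INFO exited: jupyter (terminated by SIGILL (core dumped); not expected)
EXIT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO exited: (\S+) "
    r"\(terminated by (SIG[A-Z0-9]+)"
)


def signal_exits(log_text, program="jupyter", signal="SIGILL"):
    """Return the timestamps at which `program` was terminated by `signal`."""
    times = []
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match and match.group(2) == program and match.group(3) == signal:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    return times


def restart_intervals(times):
    """Return the seconds elapsed between consecutive exits."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

Feeding it the saved output of `docker-compose logs` gives the number of SIGILL exits and the spacing between them, which makes it easier to tell a steady restart loop from crashes that only follow incoming requests.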

The issue starts when I connect with either Firefox or Edge (Chromium version)...

ml-workspace_1 | 2021/01/14 23:19:43 [error] 130#0: *16 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.0", upstream: "http://127.0.0.1:8090/", host: "url.tld", referrer: "url.tld"

...and Jupyter notebook dies.

ml-workspace_1 | 2021-01-14 23:19:43,871 INFO exited: jupyter (terminated by SIGILL (core dumped); not expected)

Since I run several web applications on my server, I use an nginx reverse proxy. I am aware of the issue with CPUs that do not support SSE4.2, but my virtual server's CPU supports it. The following file manages incoming traffic for ml-workspace:

server {
    listen 80;
    server_name ?.?.com www.?.?.com;
    return 301 https://$host$request_uri;
}

server {
  listen 443 ssl http2;
  server_name ?.?.com www.?.?.com;

  # Specify SSL config if using a shared one.
  include conf.d/ssl/ssl.conf;
  include conf.d/geoip/geoip.conf;
  ssl_certificate /etc/letsencrypt/live/?/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/?/privkey.pem;

  # Allow large attachments
  client_max_body_size 128M;

  location / {
    proxy_pass http://127.0.0.1:8121;
    proxy_set_header Host $host;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    auth_basic "Private";
    auth_basic_user_file /;
  }

}

Technical details:

Additional context: I would have suspected that my system is broken, but then I don't know why the minimal flavor works. I hope you have more insight into the Jupyter packages and an idea of why only the larger flavors break jupyter-notebook.
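Since SIGILL means the process executed an instruction the CPU rejected, the CPU-support claim above could be double-checked from inside the container rather than on the host. The following is a minimal sketch, assuming a Linux `/proc/cpuinfo` is available; the list of flags to look for (`sse4_2`, `avx`, `avx2`) is an assumption about which extensions the larger flavors' packages may need, and the helper names are hypothetical.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the union of all entries found on 'flags' lines of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_flags(cpuinfo_text, wanted=("sse4_2", "avx", "avx2")):
    """Return the wanted flags that the CPU does not advertise, in the given order."""
    present = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        # An empty list means every flag in `wanted` is advertised.
        print("missing flags:", missing_flags(cpuinfo.read_text()))
```

Virtual servers sometimes expose only a subset of the host CPU's features, so running this in the container shows what the guest actually sees. An empty result would not rule out other causes of the crash.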

Edit: After looking at the nginx.conf file inside ml-workspace, it seems that 127.0.0.1:8090 is used by ping for authentication:

 -- only authenticate via jupyter ping method if it is activated (not false)
                if "password" ~= "false" then
                    local res, error = http_connection:request_uri(
                        "http://127.0.0.1:8090/tooling/ping",

If ping is part of jupyter-notebook, this could be an upstream error related to the jupyter-notebook authentication mechanism.
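One way to test this hypothesis would be to request the endpoint quoted above directly from inside the container, bypassing both nginx layers. The following is a minimal sketch using only the Python standard library; the URL comes from the nginx.conf excerpt, while the function name `classify_probe` and its result strings are hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def classify_probe(url, timeout=3.0):
    """Request `url` once and describe the outcome as a short string."""
    # Ignore any proxy environment variables so the request goes straight to the target.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return f"http {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 403 or 404).
        return f"http {exc.code}"
    except (OSError, http.client.HTTPException) as exc:
        # Connection refused, reset, or timed out: nothing usable is listening.
        return f"unreachable: {exc}"


if __name__ == "__main__":
    print(classify_probe("http://127.0.0.1:8090/tooling/ping"))
```

If this reports `unreachable` while supervisor keeps logging SIGILL exits, the ping failure is probably a symptom of the Jupyter process being down rather than a separate authentication problem. An `http` result would show that the upstream answers at least some of the time.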

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days