canonical / kubeflow-rocks

Rocks for Kubeflow components
Apache License 2.0

notebook server ROCKs misconfigure the base_url #54

Open NohaIhab opened 1 year ago

NohaIhab commented 1 year ago

In the upstream notebook server images, base_url is set from the environment variable ${NB_PREFIX} (see here). The same variable should be used in the command that starts the Pebble service, but it is currently hardcoded to "/" in the ROCK definitions (see here for an example).

As a consequence, the server gets started at the wrong URL and we cannot connect to notebook servers.

Logs

2023-09-29T08:58:55.888Z [pebble] Started daemon.
2023-09-29T08:58:55.903Z [pebble] POST /v1/services 13.734158ms 202
2023-09-29T08:58:55.903Z [pebble] Started default services with change 1.
2023-09-29T08:58:55.917Z [pebble] Service "jupyter" starting: ./jupyter lab --notebook-dir="/home/jovyan" --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token="" --ServerApp.password="" --ServerApp.allow_origin="*" --ServerApp.base_url="/" --ServerApp.authenticate_prometheus=False
2023-09-29T08:59:01.757Z [jupyter] [I 2023-09-29 08:59:01.756 ServerApp] jupyter_server_mathjax | extension was successfully linked.
2023-09-29T08:59:01.764Z [jupyter] [I 2023-09-29 08:59:01.764 ServerApp] jupyterlab | extension was successfully linked.
2023-09-29T08:59:01.764Z [jupyter] [I 2023-09-29 08:59:01.764 ServerApp] jupyterlab_git | extension was successfully linked.
2023-09-29T08:59:01.772Z [jupyter] [I 2023-09-29 08:59:01.772 ServerApp] nbclassic | extension was successfully linked.
2023-09-29T08:59:01.773Z [jupyter] [I 2023-09-29 08:59:01.773 ServerApp] nbdime | extension was successfully linked.
2023-09-29T08:59:03.707Z [jupyter] [I 2023-09-29 08:59:03.707 ServerApp] notebook_shim | extension was successfully linked.
2023-09-29T08:59:03.888Z [jupyter] [W 2023-09-29 08:59:03.888 ServerApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
2023-09-29T08:59:03.904Z [jupyter] [I 2023-09-29 08:59:03.904 ServerApp] notebook_shim | extension was successfully loaded.
2023-09-29T08:59:03.905Z [jupyter] [I 2023-09-29 08:59:03.905 ServerApp] jupyter_server_mathjax | extension was successfully loaded.
2023-09-29T08:59:03.927Z [jupyter] [I 2023-09-29 08:59:03.927 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
2023-09-29T08:59:03.928Z [jupyter] [I 2023-09-29 08:59:03.927 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
2023-09-29T08:59:03.933Z [jupyter] [I 2023-09-29 08:59:03.933 ServerApp] jupyterlab | extension was successfully loaded.
2023-09-29T08:59:03.938Z [jupyter] [I 2023-09-29 08:59:03.937 ServerApp] jupyterlab_git | extension was successfully loaded.
2023-09-29T08:59:03.951Z [jupyter] [I 2023-09-29 08:59:03.951 ServerApp] nbclassic | extension was successfully loaded.
2023-09-29T08:59:04.408Z [jupyter] [I 2023-09-29 08:59:04.407 ServerApp] nbdime | extension was successfully loaded.
2023-09-29T08:59:04.409Z [jupyter] [I 2023-09-29 08:59:04.408 ServerApp] Serving notebooks from local directory: /home/jovyan
2023-09-29T08:59:04.409Z [jupyter] [I 2023-09-29 08:59:04.409 ServerApp] Jupyter Server 1.24.0 is running at:
2023-09-29T08:59:04.409Z [jupyter] [I 2023-09-29 08:59:04.409 ServerApp] http://nb1-0:8888/lab
2023-09-29T08:59:04.409Z [jupyter] [I 2023-09-29 08:59:04.409 ServerApp]  or http://127.0.0.1:8888/lab
2023-09-29T08:59:04.409Z [jupyter] [I 2023-09-29 08:59:04.409 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
2023-09-29T10:01:22.684Z [jupyter] [I 2023-09-29 10:01:22.684 ServerApp] 302 GET /notebook/admin/nb1/ (127.0.0.6) 0.98ms
2023-09-29T10:01:24.332Z [jupyter] [W 2023-09-29 10:01:24.331 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 34.63ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T10:20:39.415Z [jupyter] [I 2023-09-29 10:20:39.415 ServerApp] 302 GET /notebook/admin/nb1/ (127.0.0.6) 0.54ms
2023-09-29T10:20:41.130Z [jupyter] [W 2023-09-29 10:20:41.130 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 1.96ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T12:03:26.530Z [jupyter] [I 2023-09-29 12:03:26.530 ServerApp] 302 GET /notebook/admin/nb1/ (127.0.0.6) 0.76ms
2023-09-29T12:03:28.879Z [jupyter] [W 2023-09-29 12:03:28.879 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 1.31ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T12:06:10.033Z [jupyter] [I 2023-09-29 12:06:10.033 ServerApp] 302 GET /notebook/admin/nb1/ (127.0.0.6) 0.61ms
2023-09-29T12:06:12.114Z [jupyter] [W 2023-09-29 12:06:12.114 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 1.61ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T12:06:21.711Z [jupyter] [W 2023-09-29 12:06:21.711 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 2.72ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T12:08:11.008Z [jupyter] [W 2023-09-29 12:08:11.008 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 2.45ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin
2023-09-29T12:11:14.152Z [jupyter] [I 2023-09-29 12:11:14.152 ServerApp] 302 GET /notebook/admin/nb1/?ns=admin (127.0.0.6) 0.60ms
2023-09-29T12:11:17.857Z [jupyter] [W 2023-09-29 12:11:17.857 ServerApp] 404 GET /notebook/admin/nb1/api/contents (127.0.0.6) 1.94ms referer=http://10.64.140.43.nip.io/notebook/admin/nb1?ns=admin

To reproduce:

  1. Deploy Charmed Kubeflow 1.7/edge (this channel has the ROCKs integrated).
  2. Access the dashboard.
  3. Create a notebook and try to connect to it.

i-chvets commented 1 year ago

Modifying the service command in the ROCK as follows does not help. Environment variables are not expanded in the command, and the service fails to start:

$ microk8s.kubectl -n test logs notebook-pod
2023-10-03T16:10:11.394Z [pebble] Started daemon.
2023-10-03T16:10:11.430Z [pebble] Service "jupyter" starting: ./jupyter lab --notebook-dir=${HOME} --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token="" --ServerApp.password="" --ServerApp.allow_origin="*" --ServerApp.base_url=${NB_PREFIX} --ServerApp.authenticate_prometheus="False"
2023-10-03T16:10:11.431Z [pebble] POST /v1/services 33.467774ms 202
2023-10-03T16:10:11.431Z [pebble] Started default services with change 1.
2023-10-03T16:10:13.233Z [jupyter] [C 2023-10-03 16:10:13.232 ServerApp] Bad config encountered during initialization: No such directory: ''/opt/conda/bin/${HOME}''
2023-10-03T16:10:13.270Z [pebble] Service "jupyter" stopped unexpectedly with code 1

This needs to be resolved before experimenting with passing environment variables set in the Pod to the Pebble service command.
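
The failure above is consistent with the command being executed without a shell, so that ${HOME} reaches Jupyter as a literal string. A minimal sketch of that difference, assuming nothing about Pebble internals; the function names and the /home/jovyan value are illustrative only:

```shell
# Illustrative sketch only: these helpers are hypothetical and do not call Pebble.

# The argument as a process receives it when no shell is involved:
# single quotes keep ${HOME} as literal text.
literal_arg() {
  printf '%s\n' '--notebook-dir=${HOME}'
}

# The same argument evaluated by "bash -c", which performs parameter expansion.
# HOME is set for this one command only (assumed value).
expanded_arg() {
  HOME=/home/jovyan bash -c 'printf "%s\n" "--notebook-dir=${HOME}"'
}

literal_arg
expanded_arg
```

The first call prints the unexpanded text, which matches the "No such directory" error in the log; the second prints the expanded path.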

i-chvets commented 1 year ago

Wrapped the command in bash -c:

    command: bash -c './jupyter lab --notebook-dir=${HOME} --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token=\"\" --ServerApp.password=\"\" --ServerApp.allow_origin=\"*\" --ServerApp.base_url=${NB_PREFIX} --ServerApp.authenticate_prometheus=False'

When this ROCK is deployed, the service starts without errors because it relies on the environment variables specified in the service definition. Note that the variables are not expanded in the Pebble log line, but the Jupyter logs that follow indicate that their values are passed to the command correctly:

$ microk8s.kubectl -n test logs notebook-pod
2023-10-03T19:52:52.677Z [pebble] Started daemon.
2023-10-03T19:52:52.691Z [pebble] POST /v1/services 13.452901ms 202
2023-10-03T19:52:52.691Z [pebble] Started default services with change 1.
2023-10-03T19:52:52.705Z [pebble] Service "jupyter" starting: bash -c './jupyter lab --notebook-dir=${HOME} --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token=\"\" --ServerApp.password=\"\" --ServerApp.allow_origin=\"*\" --ServerApp.base_url=${NB_PREFIX} --ServerApp.authenticate_prometheus=False'
2023-10-03T19:52:54.033Z [jupyter] [I 2023-10-03 19:52:54.033 ServerApp] jupyter_server_mathjax | extension was successfully linked.
2023-10-03T19:52:54.035Z [jupyter] [I 2023-10-03 19:52:54.035 ServerApp] jupyterlab | extension was successfully linked.
2023-10-03T19:52:54.035Z [jupyter] [I 2023-10-03 19:52:54.035 ServerApp] jupyterlab_git | extension was successfully linked.
2023-10-03T19:52:54.037Z [jupyter] [I 2023-10-03 19:52:54.037 ServerApp] nbclassic | extension was successfully linked.
2023-10-03T19:52:54.037Z [jupyter] [I 2023-10-03 19:52:54.037 ServerApp] nbdime | extension was successfully linked.
2023-10-03T19:52:54.038Z [jupyter] [I 2023-10-03 19:52:54.037 ServerApp] Writing Jupyter server cookie secret to /home/jovyan/.local/share/jupyter/runtime/jupyter_cookie_secret
2023-10-03T19:52:54.477Z [jupyter] [I 2023-10-03 19:52:54.477 ServerApp] notebook_shim | extension was successfully linked.
2023-10-03T19:52:54.510Z [jupyter] [I 2023-10-03 19:52:54.510 ServerApp] notebook_shim | extension was successfully loaded.
2023-10-03T19:52:54.510Z [jupyter] [I 2023-10-03 19:52:54.510 ServerApp] jupyter_server_mathjax | extension was successfully loaded.
2023-10-03T19:52:54.511Z [jupyter] [I 2023-10-03 19:52:54.511 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
2023-10-03T19:52:54.511Z [jupyter] [I 2023-10-03 19:52:54.511 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
2023-10-03T19:52:54.512Z [jupyter] [I 2023-10-03 19:52:54.512 ServerApp] jupyterlab | extension was successfully loaded.
2023-10-03T19:52:54.513Z [jupyter] [I 2023-10-03 19:52:54.513 ServerApp] jupyterlab_git | extension was successfully loaded.
2023-10-03T19:52:54.516Z [jupyter] [I 2023-10-03 19:52:54.515 ServerApp] nbclassic | extension was successfully loaded.
2023-10-03T19:52:54.599Z [jupyter] [I 2023-10-03 19:52:54.599 ServerApp] nbdime | extension was successfully loaded.
2023-10-03T19:52:54.600Z [jupyter] [I 2023-10-03 19:52:54.599 ServerApp] Serving notebooks from local directory: /home/jovyan
2023-10-03T19:52:54.600Z [jupyter] [I 2023-10-03 19:52:54.599 ServerApp] Jupyter Server 1.24.0 is running at:
2023-10-03T19:52:54.600Z [jupyter] [I 2023-10-03 19:52:54.600 ServerApp] http://notebook-pod:8888/lab?token=...
2023-10-03T19:52:54.600Z [jupyter] [I 2023-10-03 19:52:54.600 ServerApp]  or http://127.0.0.1:8888/lab?token=...
2023-10-03T19:52:54.600Z [jupyter] [I 2023-10-03 19:52:54.600 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

Created Pod with environment variables that have invalid values:

apiVersion: v1
kind: Pod
metadata:
  name: notebook-pod
spec:
  containers:
  - name: jupyter  
    image: charmedkubeflow/jupyter-pytorch-test:v1.7.0_20.04_1 
    env:
    - name: HOME
      value: "some/invalid/home"
    - name: NB_PREFIX
      value: "some/invalid/prefix"
    ports:
    - containerPort: 8888

When deploying this Pod, the expectation was that the Pod's environment variables would be passed to the service command in the ROCK and cause the workload to fail. This did not happen, however. The values of the environment variables from the service were used instead:

$ microk8s.kubectl -n test apply -f notebook-pod.yaml
2023-10-03T19:59:15.793Z [pebble] Started daemon.
2023-10-03T19:59:15.810Z [pebble] POST /v1/services 16.47018ms 202
2023-10-03T19:59:15.811Z [pebble] Started default services with change 1.
2023-10-03T19:59:15.826Z [pebble] Service "jupyter" starting: bash -c './jupyter lab --notebook-dir=${HOME} --ip=0.0.0.0 --no-browser --port=8888 --ServerApp.token=\"\" --ServerApp.password=\"\" --ServerApp.allow_origin=\"*\" --ServerApp.base_url=${NB_PREFIX} --ServerApp.authenticate_prometheus=False'
2023-10-03T19:59:17.158Z [jupyter] [I 2023-10-03 19:59:17.158 ServerApp] jupyter_server_mathjax | extension was successfully linked.
2023-10-03T19:59:17.160Z [jupyter] [I 2023-10-03 19:59:17.160 ServerApp] jupyterlab | extension was successfully linked.
2023-10-03T19:59:17.160Z [jupyter] [I 2023-10-03 19:59:17.160 ServerApp] jupyterlab_git | extension was successfully linked.
2023-10-03T19:59:17.161Z [jupyter] [I 2023-10-03 19:59:17.161 ServerApp] nbclassic | extension was successfully linked.
2023-10-03T19:59:17.161Z [jupyter] [I 2023-10-03 19:59:17.161 ServerApp] nbdime | extension was successfully linked.
2023-10-03T19:59:17.162Z [jupyter] [I 2023-10-03 19:59:17.162 ServerApp] Writing Jupyter server cookie secret to /home/jovyan/.local/share/jupyter/runtime/jupyter_cookie_secret
2023-10-03T19:59:17.582Z [jupyter] [I 2023-10-03 19:59:17.582 ServerApp] notebook_shim | extension was successfully linked.
2023-10-03T19:59:17.617Z [jupyter] [I 2023-10-03 19:59:17.616 ServerApp] notebook_shim | extension was successfully loaded.
2023-10-03T19:59:17.617Z [jupyter] [I 2023-10-03 19:59:17.617 ServerApp] jupyter_server_mathjax | extension was successfully loaded.
2023-10-03T19:59:17.618Z [jupyter] [I 2023-10-03 19:59:17.617 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
2023-10-03T19:59:17.618Z [jupyter] [I 2023-10-03 19:59:17.618 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
2023-10-03T19:59:17.619Z [jupyter] [I 2023-10-03 19:59:17.619 ServerApp] jupyterlab | extension was successfully loaded.
2023-10-03T19:59:17.620Z [jupyter] [I 2023-10-03 19:59:17.620 ServerApp] jupyterlab_git | extension was successfully loaded.
2023-10-03T19:59:17.623Z [jupyter] [I 2023-10-03 19:59:17.623 ServerApp] nbclassic | extension was successfully loaded.
2023-10-03T19:59:17.699Z [jupyter] [I 2023-10-03 19:59:17.699 ServerApp] nbdime | extension was successfully loaded.
2023-10-03T19:59:17.700Z [jupyter] [I 2023-10-03 19:59:17.700 ServerApp] Serving notebooks from local directory: /home/jovyan
2023-10-03T19:59:17.700Z [jupyter] [I 2023-10-03 19:59:17.700 ServerApp] Jupyter Server 1.24.0 is running at:
2023-10-03T19:59:17.700Z [jupyter] [I 2023-10-03 19:59:17.700 ServerApp] http://notebook-pod:8888/lab?token=...
2023-10-03T19:59:17.700Z [jupyter] [I 2023-10-03 19:59:17.700 ServerApp]  or http://127.0.0.1:8888/lab?token=...
2023-10-03T19:59:17.700Z [jupyter] [I 2023-10-03 19:59:17.700 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

The above experiment shows that environment variables from the Pod are not passed to command in services.

i-chvets commented 1 year ago

The problem will still exist though, since environment variables from the Pod are not making it to the process. E.g., a Pod deployed in 1.7/edge has the following environment variables:

spec:
  containers:
    - env:
        - name: NB_PREFIX
          value: /notebook/admin/test2
        - name: KF_PIPELINES_SA_TOKEN_PATH
          value: /var/run/secrets/kubeflow/pipelines/token

Those are not passed down to the command, and their values do not match the defaults used in the ROCK. And since the charm is not actually launching the Jupyter notebook server, we cannot replan the service.

i-chvets commented 1 year ago

Pod environment variables are being passed to the workload. However, if the same variables are also specified in the environment of the corresponding service in the Rockcraft project, the values from the Rockcraft project take precedence.
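
The precedence described here can be imitated in a plain shell. This is a sketch under assumptions, not Pebble itself: the exported value stands in for the Pod environment, the per-command assignment stands in for the service environment, and the "/" default and function names are hypothetical:

```shell
# Illustrative sketch only: plain shell standing in for Pod and service environments.

# Pod-level value, standing in for what Kubernetes injects into the container.
export NB_PREFIX=/notebook/admin/test2

# Case 1: the service definition also sets NB_PREFIX (assumed default "/"),
# so its value shadows the one coming from the Pod.
with_service_env() {
  NB_PREFIX=/ bash -c 'printf "%s\n" "--ServerApp.base_url=${NB_PREFIX}"'
}

# Case 2: the service definition leaves NB_PREFIX unset,
# so the value from the Pod is inherited by the command.
without_service_env() {
  bash -c 'printf "%s\n" "--ServerApp.base_url=${NB_PREFIX}"'
}

with_service_env
without_service_env
```

Under these assumptions, only the second case ends up with the Pod's prefix in base_url, which suggests keeping NB_PREFIX out of the service environment if the Pod value should win.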

The solution for the issue is:

When a ROCK with the above requirements is deployed and a notebook server is created, the user can connect through the UI. The only issue is that it requires configuration of the token and that it displays the following: (Screenshot from 2023-10-04 09-30-37)

NohaIhab commented 1 year ago

#55 will close the issue in track/1.7

The fix should be forward-ported to main as well.