iris-hep / analysis-systems-base

Base Docker image for Analysis Systems environment
https://hub.opensciencegrid.org/harbor/projects/863/repositories/analysis-systems-base
MIT License

Determine what environmental variables select the kernels shown in Jupyter Lab Launcher #12

Open matthewfeickert opened 2 years ago

matthewfeickert commented 2 years ago

Following up on PR #11 and https://atlas-talk.sdcc.bnl.gov/t/how-to-activate-python-virtual-environments-with-custom-images/269/ (ATLAS internal):

What Jupyter Lab environmental variables determine the kernels that are shown on the Launcher? Given https://github.com/jupyter/jupyter/issues/449, https://github.com/jupyter/jupyter/pull/426, and https://github.com/jupyter/jupyter_core/issues/138, I thought that this would be controlled by JUPYTER_DATA_DIR, since the docs summarize Data files as:

> Jupyter uses a search path to find installable data files, such as kernelspecs and notebook extensions. When searching for a resource, the code will search the search path starting at the first directory until it finds where the resource is contained.
>
> Each category of file is in a subdirectory of each directory of the search path. For example, kernel specs are in kernels subdirectories.
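
The lookup described in the quoted docs can be sketched with the standard library alone. This is a minimal illustration of "first directory on the search path wins", not the jupyter_core implementation; the function name `find_kernelspec` and the idea of passing the search path as a plain list are assumptions made for the example.

```python
from pathlib import Path


def find_kernelspec(name, search_path):
    """Return the first <data_dir>/kernels/<name> directory that holds a kernel.json.

    `search_path` is an ordered list of Jupyter data directories (hypothetical
    input for this sketch). Returns None when no directory provides the kernel.
    """
    for data_dir in search_path:
        candidate = Path(data_dir) / "kernels" / name
        if (candidate / "kernel.json").is_file():
            return candidate  # stop at the first match, as the docs describe
    return None


# Example call (paths taken from this issue; output depends on the machine):
# find_kernelspec(
#     "analysis-systems",
#     ["/opt/micromamba/envs/analysis-systems/share/jupyter", "/usr/share/jupyter"],
# )
```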

I'm running into a tricky situation running a custom image through Singularity on a US ATLAS Jupyter Hub instance that I don't control. The Launcher shows my kernel when I run locally, but the kernel does not show up when running Jupyter Lab on the remote Jupyter Hub unless I symlink my kernels into place.

Example of working locally fine

In my Dockerfile in PR #11 I set JUPYTER_DATA_DIR

https://github.com/iris-hep/analysis-systems-base/blob/246398129765843a1fbf824ab040030ebafb6a98/docker/Dockerfile#L107-L108

to be set when /etc/profile is sourced

https://github.com/iris-hep/analysis-systems-base/blob/246398129765843a1fbf824ab040030ebafb6a98/docker/Dockerfile#L34

and then force this with a login shell

https://github.com/iris-hep/analysis-systems-base/blob/246398129765843a1fbf824ab040030ebafb6a98/docker/Dockerfile#L118

Using the image hub.opensciencegrid.org/iris-hep/analysis-systems-base:2022-11-02 and running it locally as a non-root user

docker run \
    --rm \
    -ti \
    --publish 8888:8888 \
    --user $(id -u $USER):$(id -g) \
    hub.opensciencegrid.org/iris-hep/analysis-systems-base:2022-11-02

I'm able to see my "Analysis Systems" kernel in the Launcher

local-jupyter-lab-launcher

and use it just fine.

import pyhf
import cabinetry
import matplotlib
import awkward
import coffea
import dask
import torch

print(f"{pyhf.__version__}")  # 0.7.0
print(f"{cabinetry.__version__}")  # 0.5.0
print(f"{matplotlib.__version__}")  # 3.6.1
print(f"{awkward.__version__}")  # 1.10.1
print(f"{coffea.__version__}")  # 0.7.19
print(f"{dask.__version__}")  # 2022.9.2
print(f"{torch.__version__}")  # 1.12.1+cpu

local-jupyter-lab-running

Note this also works locally without having to set JUPYTER_DATA_DIR or JUPYTER_PATH.

Example of failing on remote

In a Terminal session on the remote, the environmental variables from the hub.opensciencegrid.org/iris-hep/analysis-systems-base:2022-11-02 container are still propagating through correctly

Singularity> echo $JUPYTER_PATH
/opt/micromamba/envs/analysis-systems/share/jupyter:/u0b/software/jupyter:/cvmfs/atlas.sdcc.bnl.gov/jupyter/t3s/common
Singularity> echo $JUPYTER_DATA_DIR
/opt/micromamba/envs/analysis-systems/share/jupyter
Singularity>

however the Analysis Systems kernel is absent from the Launcher.

remote-absent-kernel

If, however, I symlink the analysis-systems kernel directory from my micromamba virtual environment into ~/.local/share/jupyter/kernels (N.B.: ~/.local/share/jupyter/ is the default value of JUPYTER_DATA_DIR)

Singularity> mkdir -p ~/.local/share/jupyter/kernels
Singularity> ln --symbolic /opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems ~/.local/share/jupyter/kernels
Singularity>

the analysis-systems kernel will then appear in the Launcher

remote-symlink-kernel-visible

and also work as expected

remote-jupyter-working
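
For reference, the symlink workaround above can also be written in Python. This is only a sketch of the same `mkdir -p` plus `ln --symbolic` steps; the helper name `link_kernelspec` is hypothetical, and the example paths in the comment are the ones used in this issue.

```python
from pathlib import Path


def link_kernelspec(kernel_dir, user_data_dir):
    """Symlink a kernelspec directory into <user_data_dir>/kernels.

    Returns the path of the link. Calling it again when the link already
    exists leaves the existing entry untouched.
    """
    kernel_dir = Path(kernel_dir)
    kernels_dir = Path(user_data_dir) / "kernels"
    kernels_dir.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    link = kernels_dir / kernel_dir.name
    if not link.is_symlink() and not link.exists():
        link.symlink_to(kernel_dir, target_is_directory=True)  # `ln --symbolic`
    return link


# Usage with the paths from this issue:
# link_kernelspec(
#     "/opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems",
#     Path.home() / ".local" / "share" / "jupyter",
# )
```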

Questions

What environmental variable is actually controlling what gets shown in the Launcher? I thought it was JUPYTER_DATA_DIR but apparently it isn't(?). Why does ~/.local/share/jupyter/ work?

matthewfeickert commented 2 years ago

:wave: @choldgraf, @ivanov, @yuvipanda if I can abuse familiarity with you all: I'm not expecting you to offer suggestions on what strange things are happening (this is firmly in the space of the deployment config and not the software) or even to read the Issue. But I figure that you've all seen way more strange Jupyter Lab / Jupyter Hub things over the years than I have, and might at least be able to suggest a "you're gonna have a bad time going that way" or perhaps address the question of

> What environmental variable is actually controlling what gets shown in the Launcher? I thought it was JUPYTER_DATA_DIR but apparently it isn't(?).

(which maybe is a better GitHub Discussions question someplace in Jupyter org land).

cc @PerilousApricot given discussions on #analysis-grand-challenge channel on IRIS-HEP Slack.

yuvipanda commented 2 years ago

You can run jupyter --path to find out what paths it is looking at, and fiddle with env vars to figure out which ones it should be looking at.

matthewfeickert commented 2 years ago

> You can run jupyter --path to find out what paths it is looking at, and fiddle with env vars to figure out which ones it should be looking at.

Ah, yeah that puts some things in context already:

Singularity> jupyter --path
config:
    /usatlas/u/atlas_feickert/.jupyter
    /usatlas/u/atlas_feickert/.local/etc/jupyter
    /opt/micromamba/envs/analysis-systems/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /opt/micromamba/envs/analysis-systems/share/jupyter
    /u0b/software/jupyter
    /cvmfs/atlas.sdcc.bnl.gov/jupyter/t3s/common
    /opt/micromamba/envs/analysis-systems/share/jupyter
    /usatlas/u/atlas_feickert/.local/share/jupyter
    /opt/micromamba/envs/analysis-systems/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /opt/micromamba/envs/analysis-systems/share/jupyter/runtime
Singularity> jupyter --config-dir
/usatlas/u/atlas_feickert/.jupyter
Singularity> jupyter --data-dir
/opt/micromamba/envs/analysis-systems/share/jupyter
Singularity> jupyter --runtime-dir
/opt/micromamba/envs/analysis-systems/share/jupyter/runtime
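
The ordering of the `data:` entries above is consistent with a simple model: the JUPYTER_PATH entries first, then JUPYTER_DATA_DIR, then ~/.local/share/jupyter, then the environment prefix, then the system directories. The sketch below reproduces that ordering for the values in this session. It is a simplified model inferred from this one output, not the jupyter_core code, whose logic may differ between versions; `model_data_search_path` and its parameters are hypothetical names.

```python
import posixpath


def model_data_search_path(environ, home, env_prefix):
    """Simplified model of the `data:` ordering shown by `jupyter --path` above.

    Assumes POSIX paths and ':' as the JUPYTER_PATH separator.
    """
    paths = []
    # 1. Entries of JUPYTER_PATH, in the order given.
    paths += [p for p in environ.get("JUPYTER_PATH", "").split(":") if p]
    # 2. The user data directory: JUPYTER_DATA_DIR if set, else ~/.local/share/jupyter.
    user_default = posixpath.join(home, ".local", "share", "jupyter")
    data_dir = environ.get("JUPYTER_DATA_DIR") or user_default
    paths.append(data_dir)
    # 3. ~/.local/share/jupyter still appears when JUPYTER_DATA_DIR points elsewhere.
    if user_default != data_dir:
        paths.append(user_default)
    # 4. The Python environment's own share/jupyter directory.
    paths.append(posixpath.join(env_prefix, "share", "jupyter"))
    # 5. System-wide directories.
    paths += ["/usr/local/share/jupyter", "/usr/share/jupyter"]
    return paths
```

Under this model the micromamba directory is already first on the path, and ~/.local/share/jupyter is on it too, which matches the symlinked kernel being found but does not by itself explain why the unsymlinked one is missing from the Launcher.
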

agoose77 commented 2 years ago

Firstly, you can list the kernels that Jupyter knows about with jupyter kernelspec list. As this is happening only on the remote, I suspect it's related to permissions or user configuration, which differ significantly between Docker and Singularity.

You can test what JupyterLab sees by running jupyter kernelspec list --json --debug in the JupyterLab terminal. IIRC :crossed_fingers:, this spawns from the same process that launches JLab, so they should have the same environment.

If you see your kernel there, the next thing is to investigate what's in the kernel.json of the path yielded by jupyter kernelspec list.
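
The two checks above could be scripted roughly as follows. This is a sketch that assumes the `--json` output has a top-level "kernelspecs" mapping whose entries carry "resource_dir" and "spec"; the helper name `inspect_kernelspec` is hypothetical.

```python
import json
import os


def inspect_kernelspec(listing_json, name):
    """Summarise one entry of `jupyter kernelspec list --json` output.

    Returns None if the kernel is not listed at all.
    """
    entry = json.loads(listing_json)["kernelspecs"].get(name)
    if entry is None:
        return None
    argv = entry["spec"]["argv"]
    return {
        "resource_dir": entry["resource_dir"],
        "executable": argv[0],
        # True only if the current user may execute the interpreter named in kernel.json.
        "executable_ok": os.access(argv[0], os.X_OK),
    }


# Usage from the JupyterLab terminal (requires the jupyter CLI on PATH):
# import subprocess
# out = subprocess.run(
#     ["jupyter", "kernelspec", "list", "--json"],
#     capture_output=True, text=True, check=True,
# ).stdout
# print(inspect_kernelspec(out, "analysis-systems"))
```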

matthewfeickert commented 2 years ago

Ah sorry @agoose77 I forgot that you can't see everything I put over on https://atlas-talk.sdcc.bnl.gov/t/how-to-activate-python-virtual-environments-with-custom-images/269/. Let me paste everything here for running at BNL with /cvmfs/unpacked.cern.ch/hub.opensciencegrid.org/iris-hep/analysis-systems-base:2022-11-03.

Singularity> jupyter kernelspec list
Available kernels:
  analysis-systems      /opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems
  python3               /opt/micromamba/envs/analysis-systems/share/jupyter/kernels/python3
  atlasroot             /u0b/software/jupyter/kernels/atlasROOT
  py2env                /u0b/software/jupyter/kernels/py2env
  py3env                /u0b/software/jupyter/kernels/py3env
  python-ml-anaconda    /u0b/software/jupyter/kernels/python-ml-anaconda
  python-ml-cpu         /u0b/software/jupyter/kernels/python-ml-cpu
  python-ml-gpu         /u0b/software/jupyter/kernels/python-ml-gpu
  sphenix-env           /u0b/software/jupyter/kernels/sphenix-env
  sphenix-root          /u0b/software/jupyter/kernels/sphenix-root
  pyroot2               /cvmfs/atlas.sdcc.bnl.gov/jupyter/t3s/common/kernels/pyroot2
  pyroot3               /cvmfs/atlas.sdcc.bnl.gov/jupyter/t3s/common/kernels/pyroot3
  rootcpp               /cvmfs/atlas.sdcc.bnl.gov/jupyter/t3s/common/kernels/rootcpp
Singularity> cat /opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems/kernel.json 
{
 "argv": [
  "/opt/micromamba/envs/analysis-systems/bin/python",
  "-m",
  "ipykernel_launcher",
  "-f",
  "{connection_file}"
 ],
 "display_name": "Analysis Systems",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}
Singularity> jupyter kernelspec list --json --debug | jq '.["kernelspecs"]["analysis-systems"]'
...
{
  "resource_dir": "/opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems",
  "spec": {
    "argv": [
      "/opt/micromamba/envs/analysis-systems/bin/python",
      "-m",
      "ipykernel_launcher",
      "-f",
      "{connection_file}"
    ],
    "env": {},
    "display_name": "Analysis Systems",
    "language": "python",
    "interrupt_mode": "signal",
    "metadata": {
      "debugger": true
    }
  }
}

So it definitely knows about the kernel, but won't display it in the Launcher unless I symlink it:

Singularity> mkdir -p ~/.local/share/jupyter/kernels
Singularity> ln --symbolic /opt/micromamba/envs/analysis-systems/share/jupyter/kernels/analysis-systems ~/.local/share/jupyter/kernels
Singularity>

agoose77 commented 2 years ago

If you launch JupyterLab, and visit the /api/kernelspecs endpoint, do you see your kernel enumerated there? I'd imagine not if you have no launcher item.
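
One way to query that endpoint from a script is sketched below. It assumes the server accepts token authentication through an `Authorization: token ...` header and that the response has a top-level "kernelspecs" mapping; on a Jupyter Hub deployment the base URL (typically under a per-user prefix) and the authentication mechanism may differ. The `base_url` and `token` values are placeholders that the reader has to supply.

```python
import json
import urllib.request


def kernel_names(payload):
    """Return the sorted kernel names from a decoded /api/kernelspecs response."""
    return sorted(payload.get("kernelspecs", {}))


def fetch_kernelspecs(base_url, token):
    """GET <base_url>/api/kernelspecs using a Jupyter server token (assumed auth scheme)."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/kernelspecs",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (placeholder URL and token):
# names = kernel_names(fetch_kernelspecs("http://localhost:8888", "<token>"))
# print("analysis-systems" in names)
```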

When you run JupyterLab, can you view the stdout of the JLab process, or launch it to write to a debug file? With debug options, you should see a complaint about the kernelspec handler if it fails to load a particular kernelspec in the non-symlinked case.

Could you confirm that jupyter --paths run from the JLab terminal indicates that your env var is being seen at the point that JLab is launched? I.e., not from the Singularity shell but from a JLab console (terminal), to ensure the same environment as the JLab launcher process.

The KernelSpecManager looks up kernels via jupyter_path('kernels'), so it would be interesting to see what python3 -c "import jupyter_core.paths; print(jupyter_core.paths.jupyter_path('kernels'))" yields from the JLab terminal.
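
Since permissions are one of the suspects, the directories returned by that call could also be checked for readability from the same terminal. The following is a standard-library sketch; `audit_kernel_dirs` is a hypothetical helper, and it only reports what the current user can see, which may not match what the JupyterLab server process sees if the two run with different environments.

```python
import os
from pathlib import Path


def audit_kernel_dirs(kernel_dirs):
    """Report, for each kernels directory, whether it is readable and which specs it holds."""
    report = []
    for directory in map(Path, kernel_dirs):
        # A directory needs both read and execute (search) permission to be listed.
        readable = directory.is_dir() and os.access(directory, os.R_OK | os.X_OK)
        specs = []
        if readable:
            specs = sorted(p.parent.name for p in directory.glob("*/kernel.json"))
        report.append({"path": str(directory), "readable": readable, "specs": specs})
    return report


# Usage in the JLab terminal (requires jupyter_core to be importable):
# import jupyter_core.paths
# for row in audit_kernel_dirs(jupyter_core.paths.jupyter_path("kernels")):
#     print(row)
```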