jupyter-server / kernel_gateway

Jupyter Kernel Gateway
http://jupyter-kernel-gateway.readthedocs.org/en/latest/

After enabling kernelgateway, pwd env does not change #342

Open nimengliusha opened 4 years ago

nimengliusha commented 4 years ago

image

kevin-bates commented 4 years ago

What do you mean by "enable kg"? And what is your "pwd" test attempting to show - please explain?

Kernel Gateway must be running before a notebook server configured to use it is started. To configure the notebook server to use Kernel Gateway, it must be launched with the --gateway-url parameter, or the equivalent option must be set in the notebook's configuration file. It cannot be "enabled" within a given session of Notebook, as it appears you're trying to show above.

nimengliusha commented 4 years ago

Hi, @kevin. I created a file named 'test.ipynb' in /home/ma-user/work/test. Without kernelgateway enabled, the pwd result was correct and showed /home/ma-user/work/test. Then I changed the code to pass the gateway URL and restarted the Docker container. After that, the pwd result became /home/ma-user/work, which is wrong.

Even though I passed the PWD env variable in the POST kernel code via env.update({'PWD': env['KERNEL_WORKING_DIR']}), the result is still /home/ma-user/work.

image

[KernelGatewayApp] Starting kernel: ['/home/ma-user/anaconda3/envs/TensorFlow-1.8/bin/python', '-m', 'ipykernel_launcher', '-f', '/home/ma-user/.local/share/jupyter/runtime/kernel-accf88d7-d665-4cc5-8529-d9332fa0e7bf.json']
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:45464
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:52860
[KernelGatewayApp] Kernel started: accf88d7-d665-4cc5-8529-d9332fa0e7bf
[KernelGatewayApp] Kernel args: {'env': {'PATH': '/usr/local/cloudguard/seccrypto/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games', 'MXNET121_NAME': 'MXNet-1.2.1', 'S3_ENDPOINT': '', 'HOSTNAME': '91c368767048', 'NB_USER': 'ma-user', 'SHELL': '/bin/bash', 'LC_ALL': 'en_US.UTF-8', 'PT100_NAME': 'Pytorch-1.0.0', 'USER': 'ma-user', 'LD_LIBRARY_PATH': '/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib64', 'CONDA3_DIR': '/opt/conda/bin', 'PYTHON_VERSION': '3.6', 'MAIL': '/var/mail/ma-user', 'ANCONDA_DIR': '/home/ma-user/anaconda3','PWD': '/home/ma-user/work/test', 'TF18_NAME': 'TensorFlow-1.8', 'LANG': 'en_US.UTF-8', 'MINICONDA_DIR': '/opt/conda', 'SHLVL': '3', 'HOME': '/home/ma-user', 'LANGUAGE': 'en_US.UTF-8', 'ANACONDA_DIR': '/home/ma-user/anaconda3', 'PYTHONPATH': '', 'LOGNAME': 'ma-user', 'DEBIAN_FRONTEND': 'noninteractive', 'CONDA_DIR': '/opt/conda', 'NB_GID': '100', 'SCC_CONF': '/usr/local/seccrypto/kmc.conf', 'TF113_NAME': 'TensorFlow-1.13.1', 'NB_UID': '1000', '_': '/opt/conda/bin/jupyter', 'KERNEL_LAUNCH_TIMEOUT': '40', 'KERNEL_WORKING_DIR': '/home/ma-user/work/test'}, 'kernel_name': 'tensorflow-1.8'}
[I 200916 08:18:29 web:2250] 201 POST /api/kernels (172.17.0.6) 14.10ms
[I 200916 08:18:29 web:2250] 200 GET /api/kernels/accf88d7-d665-4cc5-8529-d9332fa0e7bf (172.17.0.6) 1.02ms
[I 200916 08:18:29 web:2250] 200 GET /api/kernels/accf88d7-d665-4cc5-8529-d9332fa0e7bf (172.17.0.6) 0.78ms

jupyter_kernel_gateway v2.4.1, Notebook v6.0.3

nimengliusha commented 4 years ago

With kernelgateway enabled, no matter where I create the file, the 'pwd' result is always the directory where I started kernelgateway:

cd /home/ma-user/work && su ${NB_USER} -s /bin/bash -c "sh -x /usr/local/bin/start_gateway.sh"

nimengliusha commented 4 years ago

After analyzing the related code, I think the cause of the problem above is the missing 'path' parameter in the call to the 'start_kernel' method. Is this a bug? Fixing it would require changes to both the Notebook and KernelGateway code.

https://github.com/jupyter/notebook/blob/master/notebook/services/kernels/handlers.py
At line 46, change kernel_id = yield maybe_future(km.start_kernel(kernel_name=model['name'])) to:

if 'path' in model:
    kernel_id = yield maybe_future(km.start_kernel(kernel_name=model['name'], path=model['path']))
else:
    kernel_id = yield maybe_future(km.start_kernel(kernel_name=model['name']))

https://github.com/jupyter/notebook/blob/master/notebook/gateway/managers.py
At line 372, change json_body = json_encode({'name': kernel_name, 'env': kernel_env}) to:
json_body = json_encode({'name': kernel_name, 'env': kernel_env, 'path': path})

https://github.com/jupyter/kernel_gateway/blob/master/kernel_gateway/services/kernels/handlers.py
At line 69, change self.kernel_manager.start_kernel = partial(self.kernel_manager.start_kernel, env=env) to:

if 'path' in model:
    self.kernel_manager.start_kernel = partial(self.kernel_manager.start_kernel, env=env, path=model['path'])
else:
    self.kernel_manager.start_kernel = partial(self.kernel_manager.start_kernel, env=env)

kevin-bates commented 4 years ago

The gateway projects (Kernel and Enterprise) are intended to run remotely from the notebook server and, as a result, there's no guarantee that a given path from one server will exist on another and be the same actual location.

Kernels started in Notebook go through the session manager, whose model knows about 'path' and converts it to 'cwd', which is what Popen uses to set the kernel process's working directory.
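The difference between setting PWD in the environment and passing cwd to Popen can be seen in isolation with a short sketch. It uses only the Python standard library; the helper name `child_cwd` and the temporary directory are illustrative and not part of Notebook or JKG.

```python
import os
import subprocess
import sys
import tempfile


def child_cwd(**popen_kwargs):
    """Run a child Python process and return the working directory it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        capture_output=True,
        text=True,
        check=True,
        **popen_kwargs,
    )
    return result.stdout.strip()


# Stand-in for the notebook's directory (illustrative only).
target = os.path.realpath(tempfile.mkdtemp())

# Setting PWD only changes an environment variable: the child still
# inherits the parent's working directory.
env_only = child_cwd(env={**os.environ, "PWD": target})

# Passing cwd makes the child process start in the target directory.
with_cwd = child_cwd(cwd=target)

print("PWD in env changed the cwd:", os.path.realpath(env_only) == target)
print("cwd argument changed the cwd:", os.path.realpath(with_cwd) == target)
```

This matches the behaviour reported above: the kernel's env contains PWD, yet the process still runs in the directory the gateway was started from.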

EG uses KERNEL_WORKING_DIR for its containerized kernels, but that's only used in environments where the user's directories are mapped into the container.

Are you running JKG local to your notebook server? If so, why? If remote, are you mounting your home directory on the JKG node?

If we made some change, I think merely flowing KERNEL_WORKING_DIR into cwd=<value> would be sufficient in JKG's handler.

nimengliusha commented 4 years ago

Thanks for your reply, @kevin-bates. I'm running JKG remotely, with the work directory mounted on the JKG node. Flowing KERNEL_WORKING_DIR into cwd=<value> in JKG's handler would also work for this scenario.

kevin-bates commented 4 years ago

The KERNEL_WORKING_DIR approach avoids multiple repo updates, so would be preferred. The solution should also confirm that the referenced location exists since its original intention was for containerized kernels that EG supports. But if the reference exists, it seems like a viable solution in general. The same change would be made in Enterprise Gateway, if you're interested.
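A minimal sketch of that idea, assuming the handler builds a set of keyword arguments that end up in a start_kernel-style call. The helper name `kernel_launch_kwargs` is hypothetical and does not exist in JKG or EG; only the KERNEL_WORKING_DIR variable and the cwd argument come from the discussion above.

```python
import os
import tempfile
from typing import Any, Dict, Mapping


def kernel_launch_kwargs(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build launch keyword arguments from the kernel env (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"env": dict(env)}
    working_dir = env.get("KERNEL_WORKING_DIR")
    # Only pass cwd when the directory actually exists on the gateway host;
    # otherwise fall back to the gateway's own working directory.
    if working_dir and os.path.isdir(working_dir):
        kwargs["cwd"] = working_dir
    return kwargs


# Illustrative inputs: one directory that exists and one that does not.
existing_dir = tempfile.mkdtemp()
missing_dir = os.path.join(existing_dir, "does-not-exist")

print(kernel_launch_kwargs({"KERNEL_WORKING_DIR": existing_dir}).get("cwd"))
print(kernel_launch_kwargs({"KERNEL_WORKING_DIR": missing_dir}).get("cwd"))
```

In JKG's handler, this would presumably mean adding cwd to the partial(...) call quoted earlier in the thread, but only when the check passes.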

nimengliusha commented 4 years ago

I'll keep an eye on the related changes. @kevin-bates, thanks for all your great work here.

roma-glushko commented 10 months ago

> The gateway projects (Kernel and Enterprise) are intended to run remotely from the notebook server and, as a result, there's no guarantee that a given path from one server will exist on another and be the same actual location.

Hey @kevin-bates, I have a slightly different use case, but want to achieve something similar: make Kernel Gateway (KG) create a kernel process under a given CWD.

In my case, KG is running in a container exposing its API. Custom code (no Jupyter Notebook code is involved) then controls kernel creation and code execution via that API.

Is there a way to instruct KG to create a new kernel under a specific CWD? I may have several kernels that need different CWDs, so this should not be a global config but set per kernel, perhaps at creation.

kevin-bates commented 10 months ago

Hi @roma-glushko - this is not possible for local kernels (launched from JKG) without changes. The container-based kernels provided by EG and gateway-provisioners do flow KERNEL_WORKING_DIR to the respective containers. For local kernels, I think you'd need to do this (from above):

> If we made some change, I think merely flowing KERNEL_WORKING_DIR into cwd=<value> would be sufficient in JKG's handler.

Or subclass LocalProvisioner to pull the value from the env and set it as cwd prior to kernel launch (probably by overriding its pre_launch() method), then reference that provisioner in your kernelspecs.
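A rough sketch of the provisioner approach, written as a mixin so it runs without jupyter_client installed. It assumes that pre_launch() is an async method returning the launch keyword arguments, and that a cwd entry in those arguments reaches the process launch, as in jupyter_client 7 and later. The names `WorkingDirMixin`, `_StubProvisioner`, and `DemoProvisioner` are made up for illustration, and none of this has been verified against a running gateway.

```python
import asyncio
import os
import tempfile
from typing import Any, Dict


class WorkingDirMixin:
    """Copy KERNEL_WORKING_DIR from the kernel env into the cwd launch argument."""

    async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
        # Let the underlying provisioner prepare env, command, etc. first.
        kwargs = await super().pre_launch(**kwargs)
        working_dir = kwargs.get("env", {}).get("KERNEL_WORKING_DIR")
        # Skip directories that do not exist on the gateway host.
        if working_dir and os.path.isdir(working_dir):
            kwargs["cwd"] = working_dir
        return kwargs


# Intended use (assumption: jupyter_client >= 7 is installed):
#   from jupyter_client.provisioning import LocalProvisioner
#   class WorkingDirProvisioner(WorkingDirMixin, LocalProvisioner):
#       pass


class _StubProvisioner:
    """Stand-in base class so the sketch can run on its own."""

    async def pre_launch(self, **kwargs: Any) -> Dict[str, Any]:
        return kwargs


class DemoProvisioner(WorkingDirMixin, _StubProvisioner):
    pass


demo_dir = tempfile.mkdtemp()
demo_kwargs = asyncio.run(
    DemoProvisioner().pre_launch(env={"KERNEL_WORKING_DIR": demo_dir})
)
print(demo_kwargs["cwd"] == demo_dir)
```

A real subclass would still need to be registered under the jupyter_client.kernel_provisioners entry point group and named in the kernelspec's metadata (kernel_provisioner.provisioner_name) before the gateway would use it.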