jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/

user impersonation not setting $HOME or ~ location #1160

Closed blair-anson closed 1 year ago

blair-anson commented 2 years ago

Not sure if this is a bug, or a lack of understanding on my part.

I have user impersonation setup, so these commands all work as expected in the notebook...

!whoami
user1
!pwd
/home/user1
!id
uid=1001(user1) gid=1001(user1)

However these commands still show jovyan as being the user instead of user1...

!echo $HOME
/home/jovyan
!ls ~
group  passwd  work    <<--- this is an ls of /home/jovyan
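The difference between the two sets of commands can be reproduced from Python. This is only a sketch for a POSIX system: `whoami` and `id` are answered from the process uid, while `$HOME` and `~` come from the environment, which impersonation does not necessarily update. The helper name `home_views` is made up for illustration.

```python
import os
import pwd


def home_views():
    """Compare the home directory from the passwd database with $HOME."""
    try:
        # What the uid says the home directory should be.
        passwd_home = pwd.getpwuid(os.getuid()).pw_dir
    except KeyError:
        # The uid has no passwd entry (possible in some containers).
        passwd_home = None
    return {
        "passwd": passwd_home,
        "env": os.environ.get("HOME"),
        # expanduser prefers $HOME when it is set, like the shell's ~.
        "expanduser": os.path.expanduser("~"),
    }
```

If `passwd` and `env` disagree inside the kernel, the uid was switched but `HOME` was inherited from the image.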

I also configured a custom PYTHONPATH env variable, but it is not visible in the kernel.


Configuration

values.yaml

mirrorWorkingDirs: true

(so EG_MIRROR_WORKING_DIRS is set)

deployment.yaml I also set a whitelist for a custom python path in spec.template.spec.containers.env

        - name: EG_ENV_WHITELIST
          value: "PYTHONPATH"
        - name: KG_ENV_WHITELIST
          value: "PYTHONPATH"

kernel.json This may not be necessary but I put a placeholder for the whitelisted env variable in the kernel spec

"env": {
    "PYTHONPATH": ""
  }

JupyterLab Spawner env variables

env['KERNEL_USERNAME'] = self.user.name
env['JUPYTER_GATEWAY_URL'] = "https://xxxxxxxxx:8888"
env['JUPYTER_GATEWAY_REQUEST_TIMEOUT'] = '120'
env['KERNEL_WORKING_DIR'] = f'/home/{self.user.name}'
env['KERNEL_USER_HOME'] = f'/home/{self.user.name}'
user = pwd.getpwnam(self.user.name)
env['KERNEL_UID'] = str(user.pw_uid)
env['KERNEL_GID'] = str(user.pw_gid)
env['USER'] = self.user.name
env['HOME'] = f'/home/{self.user.name}'
env['KG_HTTP_USER'] = self.user.name
env['PYTHONPATH'] = f'/home/{self.user.name}/lib'

kevin-bates commented 2 years ago

Hi @blair-anson. We don't currently flow the user's home directory, only the working directory - which all appears to be working. I don't think we can assume that just because KERNEL_WORKING_DIR has a value, that value is the user's HOME directory, so I think we'd need another vehicle to convey this information, and I notice that you created KERNEL_USER_HOME.

Although this would only apply to containerized kernels (docker, k8s), I wonder if we could honor KERNEL_USER_HOME such that the container launcher scripts do something like env['HOME'] = os.getenv('KERNEL_USER_HOME', '/home/jovyan')?

To address this in 2.6, you could modify either launch_kubernetes.py or kernel-pod.yaml.j2 to set HOME accordingly for your scenario.

Once you have that working, a contribution would be appreciated. Seems like the semantics could also be that if KERNEL_USER_HOME is set, that implies a KERNEL_WORKING_DIR value (if not set), whereas we can't say the same in the other direction. And, in both cases, I believe we'd need to say that EG_MIRROR_WORKING_DIRS is required.
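A minimal sketch of the semantics proposed above, assuming they were implemented as described. The function name and the `/home/jovyan` default constant are illustrative only; this is not code from launch_kubernetes.py.

```python
DEFAULT_HOME = "/home/jovyan"  # default home mentioned in this thread


def resolve_home_and_working_dir(env):
    """Return (home, working_dir) following the proposal in this comment."""
    # Both values are only honored when working-dir mirroring is enabled.
    mirror = env.get("EG_MIRROR_WORKING_DIRS", "false").lower() == "true"
    home = env.get("KERNEL_USER_HOME") if mirror else None
    working_dir = env.get("KERNEL_WORKING_DIR") if mirror else None
    if home and not working_dir:
        # KERNEL_USER_HOME implies a working directory when none is given.
        working_dir = home
    # The reverse does not hold: a working dir says nothing about HOME.
    return home or DEFAULT_HOME, working_dir
```

With mirroring enabled and only KERNEL_USER_HOME set, both values resolve to the user's home; with only KERNEL_WORKING_DIR set, HOME stays at the default.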

Regarding the ENV_WHITELIST setting, this is really strange and I don't know where that would be coming from - it's not coming from anything in github - so it must be configured within your EG deployment scripts. The env-whitelist (renamed in 3.0 to EG_CLIENT_ENVS) specifies the list of environment variable names that are allowed to flow from the client (Jupyter Lab). By also setting it the way you have in the kernelspec, the kernel will be launched with an empty PYTHONPATH, which is probably not what you intended. If you want the PYTHONPATH value to flow from the client (which, based on the spawner variables, is the intention), you'd either want to remove that entry (EG_ENV_WHITELIST will ensure it flows, assuming the client is actually setting it in the first place) or update your entry to:

"env": {
    "PYTHONPATH": "${PYTHONPATH}"
  }

Also note that to ensure this env flows from the Lab client, the lab configuration should set JUPYTER_GATEWAY_ENV_WHITELIST, otherwise the gateway integration won't include PYTHONPATH in the kernel start request payload. (Note that this env/configuration option has been renamed to JUPYTER_GATEWAY_ALLOWED_ENVS in Jupyter Server 2.0.) What version of Lab are you using, and do you know if it's using Notebook or Jupyter Server as its web server?
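As a rough illustration of the allow-list behaviour described in this comment: variables flow from the client only if they are KERNEL_-prefixed or explicitly named. The function name and the comma-separated list format are assumptions for the sketch, not the gateway's actual implementation.

```python
def envs_allowed_to_flow(client_env, allowed_names):
    """Keep KERNEL_-prefixed variables plus any explicitly allowed names."""
    # Assumption: the allow-list is a comma-separated string of names.
    allowed = {name.strip() for name in allowed_names.split(",") if name.strip()}
    return {
        name: value
        for name, value in client_env.items()
        if name.startswith("KERNEL_") or name in allowed
    }
```

The same filter has to pass on both sides: the Lab client decides what goes into the kernel start request, and the gateway decides what it accepts from that request.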

blair-anson commented 2 years ago

Hi @kevin-bates thank you for the comprehensive and very helpful response.

We don't currently flow the user's home directory, only the working directory

Ah that explains why I wasn't getting anywhere. No worries though, I can customise JEG as per your suggestions. Passing the environment variables was blocking me from progressing with the customisation but with your answer I should be able to progress. I'll write another comment here with my progress.

Regarding the ENV_WHITELIST setting, this is really strange and I don't know where that would be coming from - it's not coming from anything in github - so it must be configured within your EG deployment scripts

Apologies, that came from me Googling for how to set custom env variables that don't begin with KERNEL_. EG_ENV_WHITELIST was in various github issues and some older documentation, so I thought I'd try it as the other settings weren't working. KG_ENV_WHITELIST was from here: https://jupyter-enterprise-gateway.readthedocs.io/en/v2.6.0/config-options.html

Also note that to ensure this env flows from the Lab client, the lab configuration should set JUPYTER_GATEWAY_ENV_WHITELIST, otherwise the gateway integration won't include PYTHONPATH in the kernel start request payload. (Note that this env/configuration option has been renamed to JUPYTER_GATEWAY_ALLOWED_ENVS in Jupyter Server 2.0.) What version of Lab are you using, and do you know if it's using Notebook or Jupyter Server as its web server?

I am currently using JupyterLab v3.0.16 which I believe specifies these versions.

jupyterlab_server~=2.3
jupyter_server~=1.4

I actually run JupyterHub on the server with the JupyterLab version I specified above (hence why I created my own spawner), but to simplify things when trying out different configurations for JEG, I run JupyterLab locally using a command like this. Again, this is JupyterLab v3.0.16, but I have also been experimenting with v3.4. Thanks for the warning about the change of env whitelist name, I will keep it in mind when I upgrade JupyterLab in the future.

KERNEL_UID=1001 \
KERNEL_GID=1001 \
KERNEL_USERNAME=user1 \
KERNEL_WORKING_DIR=/home/user1 \
KERNEL_VOLUMES="[{name: 'nfs-volume', nfs: {server: 'fs-xxxxx.efs.us-west-2.amazonaws.com', path: '/user1'}}]" \
KERNEL_VOLUME_MOUNTS="[{name: 'nfs-volume', mountPath: '/home/user1'}]" \
jupyter lab --gateway-url=https://xxxxxxx:8888 --GatewayClient.http_user=guest --GatewayClient.http_pwd=guest-password

blair-anson commented 2 years ago

I started trying to pass a custom env variable to the kernel, and I am still stuck. Instead of PYTHONPATH I defined BLAIRENV as a test, as that won't have any impact on Python and allows me to test just the environment variable passing.

I removed the env variable from the kernel spec kernel.json

...
"env": {
}
...

In JEG I set these env variables. I originally tried just EG_ENV_WHITELIST, but when that did not work I also added EG_CLIENT_ENVS in case I was mistaken about the Jupyter Server version.

deployment.yaml

...
        - name: EG_CLIENT_ENVS
          value: "BLAIRENV"
        - name: EG_ENV_WHITELIST  # renamed to EG_CLIENT_ENVS in JEG 3.0
          value: "BLAIRENV"
...

In the JupyterLab spawner on JupyterHub I have these env variables set

env['BLAIRENV'] = "catsndogs"
env['JUPYTER_GATEWAY_ENV_WHITELIST'] = "BLAIRENV"
env['JUPYTER_GATEWAY_ALLOWED_ENVS'] = "BLAIRENV"

However, after all that I still don't see BLAIRENV in the kernel. Any envs prefixed with KERNEL_ do get passed through, but BLAIRENV does not. Is there some other configuration I should try?

kevin-bates commented 2 years ago

Yeah, I just tried this and see the same thing (using KEVINENV=42). I can see it has flowed to the EG and is available to the kernel launch...

[D 2022-09-24 15:43:47.800 EnterpriseGatewayApp] BaseProcessProxy.launch_process() env: {'SHELL': '/bin/bash', 'KUBERNETES_SERVICE_PORT_HTTPS': '443', 'EG_MIRROR_WORKING_DIRS': 'False', 'KUBERNETES_SERVICE_PORT': '443', 'ENTERPRISE_GATEWAY_PORT_8877_TCP': 'tcp://10.43.139.10:8877', 'EG_NAMESPACE': 'enterprise-gateway', 'ENTERPRISE_GATEWAY_SERVICE_PORT_HTTP': '8888', 'HOSTNAME': 'enterprise-gateway-6c8749c669-qrdtt', 'LANGUAGE': 'en_US.UTF-8', 'EG_SHARED_NAMESPACE': 'False', 'EG_PORT': '8888', 'EG_LOG_LEVEL': 'DEBUG', 'JAVA_HOME': '/usr/lib/jvm/java-8-openjdk-amd64', 'ENTERPRISE_GATEWAY_SERVICE_PORT_RESPONSE': '8877', 'EG_KERNEL_WHITELIST': '"r_kubernetes","python_kubernetes","python_tf_kubernetes","python_tf_gpu_kubernetes","scala_kubernetes","spark_r_kubernetes","spark_python_kubernetes","spark_scala_kubernetes","spark_python_operator"', 'NB_UID': '1000', 'ENTERPRISE_GATEWAY_SERVICE_HOST': '10.43.139.10', 'PWD': '/usr/local/bin', 'ENTERPRISE_GATEWAY_PORT_8877_TCP_PROTO': 'tcp', 'EG_CULL_IDLE_TIMEOUT': '3600', 'EG_DEFAULT_KERNEL_NAME': 'python_kubernetes', 'ENTERPRISE_GATEWAY_PORT_8888_TCP_PORT': '8888', 'EG_ENABLE_TUNNELING': 'False', 'ENTERPRISE_GATEWAY_PORT_8888_TCP_ADDR': '10.43.139.10', 'EG_KERNEL_LAUNCH_TIMEOUT': '60', 'HOME': '/home/jovyan', 'LANG': 'en_US.UTF-8', 'KUBERNETES_PORT_443_TCP': 'tcp://10.43.0.1:443', 'ENTERPRISE_GATEWAY_PORT_8877_TCP_PORT': '8877', 'EG_LIST_KERNELS': 'True', 'EG_SSH_PORT': '2122', 'NB_GID': '100', 'EG_RESPONSE_PORT': '8877', 'ENTERPRISE_GATEWAY_PORT_8888_TCP': 'tcp://10.43.139.10:8888', 'KG_PORT': '8888', 'EG_CULL_CONNECTED': 'False', 'EG_PORT_RETRIES': '0', 'KG_IP': '0.0.0.0', 'ENTERPRISE_GATEWAY_PORT_8877_TCP_ADDR': '10.43.139.10', 'EG_CULL_INTERVAL': '60', 'EG_IP': '0.0.0.0', 'SHLVL': '0', 'CONDA_DIR': '/opt/conda', 'ENTERPRISE_GATEWAY_SERVICE_PORT': '8888', 'SPARK_HOME': '/opt/spark', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KG_PORT_RETRIES': '0', 'KUBERNETES_PORT_443_TCP_ADDR': '10.43.0.1', 'SPARK_VER': '2.4.6', 
'ENTERPRISE_GATEWAY_PORT': 'tcp://10.43.139.10:8888', 'NB_USER': 'jovyan', 'KUBERNETES_SERVICE_HOST': '10.43.0.1', 'ENTERPRISE_GATEWAY_PORT_8888_TCP_PROTO': 'tcp', 'LC_ALL': 'en_US.UTF-8', 'KUBERNETES_PORT': 'tcp://10.43.0.1:443', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'PATH': '/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'EG_ENV_WHITELIST': 'KEVINENV', 'EG_KERNEL_CLUSTER_ROLE': 'kernel-controller', 'DEBIAN_FRONTEND': 'noninteractive', 'KEVINENV': '42', 'KERNEL_LAUNCH_TIMEOUT': '40', 'KERNEL_USERNAME': 'jovyan', 'KERNEL_GATEWAY': '1', 'KERNEL_POD_NAME': 'jovyan-76aed5ed-02e7-468f-a3ee-84c154d444c9', 'KERNEL_SERVICE_ACCOUNT_NAME': 'default', 'KERNEL_NAMESPACE': 'jovyan-76aed5ed-02e7-468f-a3ee-84c154d444c9', 'KERNEL_IMAGE': 'elyra/kernel-py:2.6.0', 'KERNEL_EXECUTOR_IMAGE': 'elyra/kernel-py:2.6.0', 'KERNEL_UID': '1000', 'KERNEL_GID': '100', 'EG_MIN_PORT_RANGE_SIZE': '1000', 'EG_MAX_PORT_RANGE_RETRIES': '5', 'KERNEL_ID': '76aed5ed-02e7-468f-a3ee-84c154d444c9', 'KERNEL_LANGUAGE': 'python', 'EG_IMPERSONATION_ENABLED': 'False'}

but the launcher script and jinja template are only setting a fixed set of envs, and only based on the keyword set - which doesn't include anything that is not KERNEL_-prefixed.

I think the kubernetes launcher needs to post-process the env stanza of the generated k8s pod yaml and set the remaining envs - which in the k8s launch is lots of meaningless stuff (some of which I'm not sure wouldn't side-effect things).

This seems like an issue we should try to fix for 3.0 GA. It only applies to Kubernetes since the other process-proxies (including docker) don't go through a template.

Sorry for the inconvenience. If you have a fixed set of envs you'd like to flow, I suppose you could extend the keywords in the launcher script and add entries for each env name in the jinja template - but that's a bit of a pain, and it might be easier to just extend the env stanza following the yaml's generation and implement the correct solution. Just remember to recognize any envs that might already be present.
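The "extend the env stanza after generation" idea could look roughly like the sketch below. It assumes the generated pod yaml has already been parsed into a plain dict; the function name and the `skip` list are hypothetical, and this is not the change that was eventually merged for this issue.

```python
def extend_env_stanza(pod, launch_env, skip=("PATH", "HOME", "HOSTNAME")):
    """Append launch_env entries to the first container's env list, in place."""
    container = pod["spec"]["containers"][0]
    env_list = container.setdefault("env", [])
    # Names the template already rendered must not be overridden.
    present = {entry["name"] for entry in env_list}
    for name, value in sorted(launch_env.items()):
        if name in present or name in skip:
            continue
        # Kubernetes expects env values to be strings.
        env_list.append({"name": name, "value": str(value)})
    return pod
```

The `skip` tuple stands in for the gateway-side variables that should probably not leak into the kernel pod; which names belong there is an open question in this thread.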

Is this something you'd like to contribute to our 3.0 release?

blair-anson commented 2 years ago

OK, thank you for confirming it. I will take a look and see if I can get it working on my fork. If I do, I will see if I can contribute it to 3.0.

kevin-bates commented 2 years ago

Hi @blair-anson. Given that we're close to our 3.0 GA release, I've started looking into this. I hope that's okay with you.

I want to make sure that this code, now that the env will truly find its way into the kernel pod, doesn't have any side-effects.

kevin-bates commented 2 years ago

I should have a PR soon - currently looking at CI issues that appear unrelated. I suspect the current failures are due to a third-party dependency update since it doesn't reproduce in my current env. Here's the branch if you're interested.

blair-anson commented 2 years ago

Given that we're close to our 3.0 GA release, I've started looking into this. I hope that's okay with you.

That's perfectly understandable. Although I did implement a fix in my codebase I have not had time to test it properly, let alone look at the 3.0 codebase. Thank you for being so proactive.

kevin-bates commented 1 year ago

Closed via #1164