jupyterhub / batchspawner

Custom Spawner for Jupyterhub to start servers in batch scheduled systems

Pass environment variables through ssh #123

Open · zonca opened this issue 6 years ago

zonca commented 6 years ago

I deployed batchspawner again on Comet with a deployment quite similar to my old setup: https://github.com/jupyterhub/jupyterhub-deploy-hpc/tree/master/batchspawner-xsedeoauth-sshtunnel-sdsccomet Here I ssh into a Comet login node to submit jobs, so I need the JupyterHub environment variables to be passed through the SSH session so that they then make it into the SLURM job. I am not sure how this could have worked in my old deployment, so it is possible I am missing something. In my newer deployment I have to explicitly call ssh passing all the variables, see: https://gist.github.com/zonca/55f7949983e56088186e99db53548ded#file-spawner-py-L42
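The shape of it is to wrap the submit command in ssh and list every variable by hand, along these lines (login host and the full variable list omitted here):

ssh <comet-login-node> env JUPYTERHUB_API_TOKEN=$JUPYTERHUB_API_TOKEN JUPYTERHUB_API_URL=$JUPYTERHUB_API_URL [...] sbatch

with each $VAR expanded by the local shell from the environment JupyterHub sets up, before ssh runs the rest remotely.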

Everything works fine, but there must be a better way, any suggestions?

I'll contribute this as another example for https://github.com/jupyterhub/jupyterhub-deploy-hpc

rkdarst commented 5 years ago

This doesn't look too bad to me. There is now an exec_prefix option for which this sort of thing makes sense, but really it doesn't matter and I'd leave well enough alone. I don't know how it would have worked through ssh without something like this. Overall I'd say you are operating in the normal unixy range here, but someone smarter than me may have a better idea.

One idea would be to set these variables in the batch script itself, but I don't think we have the plumbing to get the variables and values there easily.

rkdarst commented 5 years ago

Looking at this again...

The SSH server has to accept the variables via AcceptEnv in sshd_config (the default is to accept nothing). Perhaps this was configured on the server before but isn't anymore?
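For example (untested), the client side needs matching SendEnv options on the ssh command that batchspawner runs, and the login node's sshd_config needs an AcceptEnv line for the same names, roughly:

# Sketch only: forward selected hub variables over the SSH connection itself.
# Hostname and variable list are examples, not a tested configuration.
c.SlurmSpawner.exec_prefix = (
    "ssh "
    "-o SendEnv=JUPYTERHUB_API_TOKEN "
    "-o SendEnv=JUPYTERHUB_API_URL "
    "login.example.org"
)
# and on the login node, in sshd_config:
#   AcceptEnv JUPYTERHUB_API_TOKEN JUPYTERHUB_API_URL
# otherwise sshd drops the variables silently.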

Do you think this issue can be closed now or should we try to do more?

jcwomack commented 5 months ago

Apologies for reviving an old issue! I thought it worth sharing the solution I recently arrived at for passing environment variables through SSH.

We are running the hub and proxy components of JupyterHub in a container which connects to a login node over SSH to run workload manager commands. This means we needed a solution for passing the various environment variables needed by spawned processes from the hub through to the workload manager submit command.

I came up with a solution inspired by @zonca's Gist, but which uses Jinja2 templating to extract the variable names from the keepvars template variable and explicitly set them in the environment of the workload manager submit command (sbatch in my case):

# ssh invocation prepended to every workload manager command
# (here the SSH server runs on the container host, hence localhost)
SSH_CMD = [
    "ssh",
    "-i", get_ssh_key_file(),
    "-oStrictHostKeyChecking=no",
    "-oUserKnownHostsFile=/dev/null",
    "localhost",
]
c.SlurmSpawner.exec_prefix = " ".join(SSH_CMD)

# Re-create each variable listed in keepvars on the remote command line:
# the local shell expands ${VAR}, the surviving single quotes protect the
# value on the remote side
c.SlurmSpawner.batch_submit_cmd = " ".join(
    [
        "env",
        "{% for var in keepvars.split(',') %}{{var}}=\"'${{'{'}}{{var}}{{'}'}}'\" {% endfor %}",
        "sbatch --parsable",
    ]
)

This produces a spawner submit command line that looks like

ssh -i /path/to/keyfile -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null localhost env VAR1="'${VAR1}'" VAR2="'${VAR2}'" [...] sbatch --parsable

with the "'${VAR}'" parameters being expanded in the local shell before the env [variable assignment] <cmd> is run remotely by ssh. In this case the SSH server is running on the same host as the container, so we SSH to localhost. get_ssh_key_file() is a function in the JupyterHub config that returns the path to an SSH key file to use.
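After local expansion, the shell on the SSH server side therefore receives and runs something like (values are placeholders)

env VAR1='value-of-VAR1' VAR2='value-of-VAR2' [...] sbatch --parsable

with the single quotes now protecting the already-expanded values from a second round of expansion on the remote side.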

[!NOTE] Great care must be taken with quote removal/expansion here. The command line ssh ... env [variable assignment] <cmd> undergoes expansion and quote removal in the local shell before env [variable assignment] <cmd> is passed to ssh to be run remotely. In general this seems to mean that any part of the command that would normally need quoting when running <cmd> locally needs to be double-quoted, so that the inner quotes are protected from quote removal in the local shell.

This impacts SlurmSpawner's default batch_query_cmd, which uses single quotes to wrap the format string passed to squeue:

    batch_query_cmd = Unicode("squeue -h -j {job_id} -o '%T %B'").tag(config=True)

To ensure this is interpreted correctly on the remote side, the format string must be wrapped in double quotes. In my JupyterHub configuration I use:

c.SlurmSpawner.batch_query_cmd = "squeue -h -j {job_id} -o \"'%T %B'\""
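With the ssh exec_prefix above, this expands locally so that the remote shell receives roughly (job id is a placeholder)

squeue -h -j 12345 -o '%T %B'

i.e. the single-quoted format string reaches squeue intact. With the single-quoted default, the quotes would already be removed locally and the remote shell would then split %T %B into two separate arguments.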