Open rkdarst opened 5 years ago
Right. I think we ended up getting these mental models muddled together when adding req_keepvars
. What is actually does is very limited: for batch systems that support this, it names environment variables that should be propagated from the submission environment into the execution environment. Without additional code, that's only useful for getting the JHub API key into the server session. But there seem to be various people hacking in similar things to pass through other kinds of access tokens.
I think your mental model outlined makes sense, but can we get some contributors to throw in concrete use cases so we're not just talking abstractly? @cmd-ntrf ? @zonca ? @jupyterhub/designers ?
I think this is a good idea. I always had trouble getting the right environment into Comet, last time I deployed I worked around this explicitly propagating the environment via ssh, see https://gist.github.com/zonca/55f7949983e56088186e99db53548ded#file-spawner-py-L55 (from https://zonca.github.io/2018/11/jupyterhub-supercomputer.html)
I implemented this in resurrected PR #119
It's a bit annoying that this changes semantics yet again, but I think it's better to get it done before the release.
Next up: we could add all_envs
to the subvars which is the raw dictionary. Then, one can do something in the batch script like this (but we should not recommend this, storing the API token in the batch script is not secure with some batch systems!):
{% for key, value in all_env.items() %}
export {{key}}={{value}}
{% endfor %}
... maybe for security this just shouldn't be added, since it will just lead to people doing the above - or something similar like leaking tokens by /proc/*/cmdline...
I think we need to rethink environment handling:
Currently there is
req_keepvars
which sets{{keepvars}}
in the batch script to a comma separated list, but does not add them to the environment itself. The environment which is gotten from.get_env()
which comes from JupyterHub, and JupyterHub gives only what is needed.Problems:
req_keepvars
doesn't add new variables to the env dict which is passed to spawnersI am working on rewrite (started from #119 long ago) to improve things. But it's become clear that we need a new mental model of how the variables even work.
Here is what I propose:
.get_env()
(get_env
is a JH concept)req_keepvars
, which is only the extra variables which should be added, not the defalut.req_keepvars_default
which take the place of oldreq_keepvars
. Or this can just be left out (my first implementation will have it to avoid too many problems..get_env()
admin_environment
which takes the same form asreq_keepvars
.run_command()
(which executes the batch commands, poll commands, cancel commands, and so on) always gets the admin environment{{keepvars}}
to limit what the user has access to.What do you think?