jupyterhub / batchspawner

Custom Spawner for Jupyterhub to start servers in batch scheduled systems
BSD 3-Clause "New" or "Revised" License
190 stars 134 forks source link

Non-configurable req_keepvars_extra #281

Closed NPotvin closed 8 months ago

NPotvin commented 1 year ago

Bug description

While configuring a JupyterHub server on a cluster, I encountered several issues related to environment variables not being properly exported on the compute nodes by the batchspawner.

Expected behaviour

In the default batch script, a line is responsible for exporting the variables (e.g. #SBATCH --export={{keepvars}} for slurm). The keepvars variable is basically a concatenation of two coma separated lists of variables req_keepvars and req_keepvars_extra. The former is not configurable and already contains some required variables. The latter should be configurable and is expected to provide the user with a convenient way to pass additional environment variables to the batch script.

Actual behaviour

The attribute BatchSpawnerBase.req_keepvars_extra does not seem to be used when producing the submitted batch script. The extra variables are not present in the list of exported environment variables.

Workaround

Overriding the default batch script in the config by hardcoding the variables to export does the trick but I am not comfortable with that approach. In the future, the idea would be to load the shell script from a file that should be agnostic of those variables. (In other words, I want to define a list of additional environment variables in the config, and keep {keepvars} at the top of a .sh file read in the jupyterhub config)

Proposed fix

I managed to get the expected behaviour by changing this segment of code:

#batchspawner.py: line 169

    req_keepvars_extra = Unicode(
        help="Extra environment variables which should be configured, "
        "added to the defaults in keepvars, "
        "comma separated list.",
    )

and adding .tag(config=True):

    req_keepvars_extra = Unicode(
        help="Extra environment variables which should be configured, "
        "added to the defaults in keepvars, "
        "comma separated list.",
    ).tag(config=True)

How to reproduce

  1. Install a jupyterhub server on a cluster
  2. Configure it to use a batch spawner that inherits from BatchSpawnerBase
  3. Try to export additional environment variables using c.BatchSpawnerBase.req_keepvars_extra = "YOUR,VARS,HERE"
  4. Start the server (in my case using systemctl) and see it fail when attempting to spawn a new batch job (checking the produced script reveals that the environment variables do not contain the extra variables)
welcome[bot] commented 1 year ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively. welcome You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

mark-tomich commented 10 months ago

It seems I can't tag the PR in development section, but here it is: https://github.com/jupyterhub/batchspawner/pull/295