here is the wrapspawner issue I filed: https://github.com/jupyterhub/wrapspawner/issues/24
PS, I should have added: I have a (Slurm) BatchSpawner wrapped in a ProfileSpawner, for a typical academic compute cluster use case -- users can either start their notebook processes on the head node, or on a compute node via Slurm.
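For anyone setting up something similar, here is a minimal sketch of that kind of jupyterhub_config.py, following the profiles format from the wrapspawner README; the profile names and Slurm request options below are placeholders, not the actual site configuration:

```python
# jupyterhub_config.py -- rough sketch of the setup described above.
# Each profile is (display name, key, Spawner class, config overrides).
# The Slurm request values are placeholders.
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'

c.ProfilesSpawner.profiles = [
    # Run the notebook server directly on the head node
    ('Head node', 'local', 'jupyterhub.spawner.LocalProcessSpawner', {}),
    # Submit the notebook server to a compute node via Slurm
    ('Compute node (Slurm)', 'slurm', 'batchspawner.SlurmSpawner',
     dict(req_partition='general', req_runtime='8:00:00', req_memory='4gb')),
]
```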
Thanks @dstndstn for the feedback. In case you weren't aware, the functionality you are testing was merged into master only a couple of weeks ago, so this testing is very valuable to us! That said, I don't have an immediate answer for why you are seeing this interaction or the issue you reported in #126 . In this case, I will need to look at whether this is something WrapSpawner should be propagating, or if there is another mechanism failing. Thanks in advance for your patience, and do note that if you don't need the latest features, the current release on PyPI is well tested.
I think that this can be closed now, because a fix has been added. There may be more remaining issues, but they can be dealt with later.
Hi all, I’m currently in the process of installing JupyterHub with the goal of better utilising Jupyter on a Slurm cluster, but I’m encountering the same problem documented both here and in https://github.com/jupyterhub/wrapspawner/issues/24. For context, I am also using batchspawner with ProfilesSpawner.
The port sent by the single-user server when spawned is being received by the batchspawner API endpoint correctly, but the value is being set on user.spawner (an instance of ProfilesSpawner) rather than on spawner.child_spawner (an instance of BatchSpawner). This leaves batchspawner waiting for a port that it will never receive and causes the launch to time out.
By implementing @dstndstn's patch in the API endpoint, which sets the port attribute of spawner.child_spawner rather than spawner itself, the single-user session spawns correctly and the user is redirected to the lab interface as expected.
I’ve attached two config files: one is a basic configuration without the patch that reproduces the problem, and the other is the same configuration with the patched API endpoint that solves it. I've also attached the logs generated when JupyterHub is run with the patch (JupyterHub log, single-user log) and without it (JupyterHub log, single-user log), plus an export of my Conda environment for reference.
Can anyone suggest how this can be solved without a temporary patch to the API handler? Could this simply be a misconfiguration on my part, or would it require more in-depth modifications to batchspawner itself?
Also, with the temporary patch in place and the server spawning correctly, the error ‘RuntimeError: No child spawner yet exists - can not get progress yet’ is still present in the JupyterHub output immediately after submitting a spawn job. Is it safe to assume that this is harmless, or should this be addressed too?
Thanks in advance.
Hi,
I'm not sure if this should be a batchspawner issue or a profilespawner issue. I'll file an issue on profilespawner also.
I was finding that the batchspawner API handler was getting called by the slurm-launched worker, but the batchspawner.poll() process was not noticing it.
Digging into the API handler, I saw that the user.spawner object is a ProfilesSpawner, and therefore setting its .current_port member does not put the port number in the right place for batchspawner.poll() to notice it. I added the following, here https://github.com/jupyterhub/batchspawner/blob/master/batchspawner/api.py#L12 , and it works (but is ugly):
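Roughly, the idea of the change (a sketch of it, not the exact diff; the user lookup and body parsing in the real handler may differ) is to drop the reported port onto the wrapped child spawner instead of the top-level wrapper:

```python
# Sketch of the workaround, inside the batchspawner API handler's post():
# read the port reported by the single-user server and set it on the
# wrapped child spawner so that batchspawner.poll() can see it.
user = self.get_current_user()
port = self.get_json_body().get('port', 0)
spawner = user.spawner
# ProfilesSpawner keeps the real BatchSpawner in child_spawner; that is
# the object whose current_port batchspawner actually waits on.
if getattr(spawner, 'child_spawner', None) is not None:
    spawner = spawner.child_spawner
spawner.current_port = port
```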
Now, it seems like a cleaner solution might be for WrapSpawner to propagate some setattr requests to its child_spawner?
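For example (purely hypothetical, nothing like this exists in wrapspawner today), the wrapper could forward a small whitelist of attributes down to its child once the child has been constructed:

```python
from wrapspawner import WrapSpawner

# Hypothetical illustration only, not actual wrapspawner code: forward a
# whitelist of attributes set on the wrapper to the child spawner, if any.
class ForwardingWrapSpawner(WrapSpawner):
    _forwarded_attrs = ('current_port',)

    def __setattr__(self, name, value):
        super().__setattr__(name, value)
        if name in self._forwarded_attrs:
            child = getattr(self, 'child_spawner', None)
            if child is not None:
                setattr(child, name, value)
```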
Open to suggestions for a clean fix. Thanks!