tlvu opened this issue 9 months ago
> For each existing Jupyter user, `/data/user_workspaces/$USER` has to be manually created. Otherwise this error appears in `docker logs jupyterhub`:
>
> `[E 2024-01-16 15:30:36.478 JupyterHub user:884] Unhandled error starting lvu's server: The user lvu's workspace doesn't exist in the workspace directory, but should have been created by Cowbird already.`
This looks like the volume mounted as `/data/user_workspaces` could be owned by root or some other user, such that the internal Jupyter spawner user cannot get sufficient permissions to create the user-specific workspace. Alternatively, `/data/user_workspaces/$USER` already exists but is owned by root (or a more privileged user), such that Jupyter cannot do the `chown` command, and Cowbird will then fail every following step since it uses the same UID:GID. The same applies to `/data/jupyterhub_user_data/` and `/data/jupyterhub_user_data/$USER`.
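A quick way to check that hypothesis from inside the `jupyterhub` container is a small diagnostic snippet (this is an illustrative helper, not part of the stack) that reports ownership and writability of the mount points:

```python
import os
import stat

def diagnose(path):
    """Report ownership, mode, and write access for a mount point,
    to help spot directories auto-created as root by docker-compose."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),
        "writable_by_me": os.access(path, os.W_OK),
    }

# Example, using the paths from this issue:
# for p in ("/data/user_workspaces", "/data/jupyterhub_user_data"):
#     print(p, diagnose(p))
```

If `uid` is 0 and `writable_by_me` is `False` for the spawner's effective user, that would confirm the root-ownership theory.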
Just a wild guess: the order in which the volumes are created could be the source of the root ownership. Since there is a step for Jupyter persistence volume creation, it might not play nicely with a docker-compose configuration that auto-creates volume mount locations (as root) when they do not exist.
The creation is performed by this hook:
https://github.com/bird-house/birdhouse-deploy/blob/13645f324c1bcef3decd91ba8a5462862b1e8d5a/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L147-L152
https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L173
Note that care should be taken with overrides if they play with similar properties: https://github.com/bird-house/birdhouse-deploy/blob/13645f324c1bcef3decd91ba8a5462862b1e8d5a/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L259
This is the same issue as #392
`{notebook_dir}/public` gets created in read-only mode, so `{notebook_dir}/public/wps_outputs` can't be created later because the containing folder is read-only. I don't know why the DockerSpawner decides to create them in that order, but that's how it's done consistently.
> I don't know why the DockerSpawner decides to create them in that order, but that's how it's done consistently.
I am happy it is consistent. The worst kind of problems are intermittent ones.

But I think the sequence is appropriate. `{notebook_dir}/public` is the parent dir, so it is volume-mounted first. Then the `{notebook_dir}/public/wps_outputs` volume mount follows because it is the child dir. But since the parent dir is read-only, the volume mount of the child dir errors out because it cannot create the mount point. This makes sense.
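The failure mode can be reproduced without Docker using plain filesystem permissions (a small sketch; note that a root user bypasses permission checks, so this only demonstrates the error when run unprivileged):

```python
import os
import stat
import tempfile

# Simulate the parent/child mount situation: the "parent" is read-only,
# so creating the child "mount point" inside it fails, analogous to
# Docker's mkdir for {notebook_dir}/public/wps_outputs.
parent = tempfile.mkdtemp()
os.chmod(parent, stat.S_IRUSR | stat.S_IXUSR)  # r-x: listable but not writable

failed = False
try:
    os.mkdir(os.path.join(parent, "wps_outputs"))
except PermissionError:
    failed = True  # expected for a non-root user

os.chmod(parent, stat.S_IRWXU)  # restore permissions so cleanup can proceed
```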
> For each existing Jupyter user, `/data/user_workspaces/$USER` has to be manually created. Otherwise this error in `docker logs jupyterhub`: `[E 2024-01-16 15:30:36.478 JupyterHub user:884] Unhandled error starting lvu's server: The user lvu's workspace doesn't exist in the workspace directory, but should have been created by Cowbird already.`
> This looks like the volume mounted as `/data/user_workspaces` could be owned by root or some other user, such that the internal Jupyter spawner user cannot get sufficient permissions to create the user-specific workspace,
This is a reasonable hint, but it should not happen, since the `jupyterhub` container runs as root, so it can `mkdir` and `chown` all the paths it needs before spawning the JupyterLab server container.
> or that `/data/user_workspaces/$USER` already exists
No, the error happens only when that dir does not exist yet. If I manually create it before spawning the Jupyter server (which is my documented workaround), the error is gone and we can spawn the Jupyter server successfully.
> The order by which the volumes are created could be the source of the root owner. Since there is a step for jupyter persistence volume creation.
No, the JupyterHub persistence data-volume is for the session tokens only. User data are not in a data-volume but in a direct volume mount from disk.
> For each existing Jupyter user, `/data/user_workspaces/$USER` has to be manually created
Isn't this just because the webhook action that creates the directory is only triggered when the user is created? The user is already created, so the webhook isn't triggered (see: https://pavics-magpie.readthedocs.io/en/latest/configuration.html#webhook-user-create).
This code was added to consider the situation where the user already exists, and no webhook would be triggered. https://github.com/bird-house/birdhouse-deploy/blob/67c6ca1d22c47d9bdf6f6e239f808ef3ec9af0bb/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L151-L155
I'm not sure why it doesn't resolve the same way as when the directory is manually created.
Could it be that `jupyterhub` tries to mount the volumes before `c.Spawner.pre_spawn_hook` gets called? Somewhat counter-intuitive name, if so.
https://github.com/bird-house/birdhouse-deploy/blob/67c6ca1d22c47d9bdf6f6e239f808ef3ec9af0bb/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L173
Does adding a `mkdir` here fix it instead of raising?
https://github.com/bird-house/birdhouse-deploy/blob/67c6ca1d22c47d9bdf6f6e239f808ef3ec9af0bb/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L161-L163
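As a rough sketch of that suggestion (the function name and signature here are hypothetical; the real check lives in the template linked above), the pre-spawn check could create the missing workspace on the fly instead of raising:

```python
import os

def get_or_create_workspace(workspace_dir, username, uid, gid):
    """Instead of raising when Cowbird has not created the workspace
    (e.g. for users that predate Cowbird), create and chown it here.

    Running as root in the jupyterhub container, both calls should succeed.
    """
    user_workspace = os.path.join(workspace_dir, username)
    if not os.path.isdir(user_workspace):
        # The current template raises here:
        # "The user ...'s workspace doesn't exist in the workspace directory"
        os.makedirs(user_workspace)
        os.chown(user_workspace, uid, gid)
    return user_workspace
```

The call is idempotent: if the workspace already exists (created by Cowbird or by hand), it is simply returned untouched.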
> This code was added to consider the situation where the user already exists, and no webhook would be triggered.
This code (mkdir + chown) was already there before Cowbird was added to the stack, and I can confirm it works fine on `/data/jupyterhub_user_data/`. It is really odd that, after switching to `/data/user_workspaces/`, it does not work anymore.

Below is the old code with the existing mkdir + chown: https://github.com/bird-house/birdhouse-deploy/blob/775c3b392813872cb8045be473d6e4b091d52d88/birdhouse/config/jupyterhub/jupyterhub_config.py.template#L53-L60
Is it possible Cowbird volume-mounts `/data/user_workspaces/` read-only, which makes JupyterHub unable to write to it? This is still weird, since JupyterHub has root access; it should be able to write to any path it sees.
> Does adding a `mkdir` here fix it instead of raising?
Or maybe adding a symlink instead, see this comment?
> For each existing Jupyter user, `/data/user_workspaces/$USER` has to be manually created
>
> Isn't this just because the webhook action that creates the directory is only triggered when the user is created? And the user is already created, so the webhook isn't triggered (see: https://pavics-magpie.readthedocs.io/en/latest/configuration.html#webhook-user-create)
Oh interesting. How does this hook know whether to create a new dir or a symlink to an existing `/data/jupyterhub_user_data/$USER` dir?
The Magpie webhook registered to occur on `create_user` is sent to Cowbird's `/webhooks/users` endpoint with event `created` when the action happens (see https://pavics-magpie.readthedocs.io/en/latest/configuration.html#config-webhook-actions for all available Magpie webhooks and when they trigger). Each active Cowbird `handler` in https://github.com/bird-house/birdhouse-deploy/blob/13645f324c1bcef3decd91ba8a5462862b1e8d5a/birdhouse/components/cowbird/config/cowbird/config.yml.template that implements `user_created` is then called. For the user workspace, that happens here: https://github.com/Ouranosinc/cowbird/blob/e2aa5337e32cd87efb5600f3fe62882d8d4d8b1f/cowbird/handlers/impl/filesystem.py#L118
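The webhook-to-handler flow described above can be sketched roughly as follows (class and function names here are illustrative only, not Cowbird's actual API):

```python
# Illustrative sketch of the dispatch: Magpie POSTs a "created" user event,
# and every active handler implementing user_created is invoked in turn.
class FileSystemHandler:
    """Stand-in for Cowbird's filesystem handler, which creates the
    user workspace in its user_created hook."""

    def __init__(self):
        self.created = []

    def user_created(self, user_name):
        self.created.append(user_name)  # real handler would mkdir/symlink here

def dispatch_user_event(handlers, event, user_name):
    """Call the matching hook on every active handler that implements it."""
    hook_name = {"created": "user_created", "deleted": "user_deleted"}[event]
    for handler in handlers:
        hook = getattr(handler, hook_name, None)
        if hook is not None:
            hook(user_name)
```

This also makes the failure mode visible: if the user already existed before Cowbird was enabled, no `created` event is ever dispatched, so no handler runs.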
> Does adding a mkdir here fix it instead of raising?
Yes, that should solve the problem (when old users were created before Cowbird was enabled).
We can solve the issue of having read-only volumes mounted on top of each other by changing the location of one or the other. I would recommend changing this line:

to:

`#public_read_in_container = join(notebook_dir, 'public-shared')`

Or similar.
I also think it would be a good idea to move this code out of env.local.example and into an optional component.
> We can solve the issue of having read-only volumes mounted on top of each other by changing the location of one or the other. I would recommend changing this line: to: `#public_read_in_container = join(notebook_dir, 'public-shared')` Or similar.
Yes, or `export PUBLIC_WORKSPACE_WPS_OUTPUTS_SUBDIR=somethingelse` works, and it can default to something other than `public`. Note `PUBLIC_WORKSPACE_WPS_OUTPUTS_SUBDIR` might already work properly; I just did not have time to confirm.

Same idea: both sharing solutions get their own public folder, so they do not step on each other's feet.
> I also think it would be a good idea to move this code out of env.local.example and into an optional component.
Yes! At the beginning, I thought about using this as a live example of how `env.local` can be used to extend the JupyterHub config. Retrospectively, it should have been an optional component: it has been very useful for us and could benefit others.
> Does adding a mkdir here fix it instead of raising?
>
> Yes that should solve the problem (when old users were created before cowbird was enabled)
Should it be creating the dir or the symlink? See the comment in the code: https://github.com/bird-house/birdhouse-deploy/blob/67c6ca1d22c47d9bdf6f6e239f808ef3ec9af0bb/birdhouse/components/jupyterhub/jupyterhub_config.py.template#L119-L120
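One possible shape for that decision, as a hedged sketch (the function name, signature, and paths are hypothetical, not the template's actual code): prefer a symlink to the user's pre-Cowbird data dir when it exists, so old content reappears, and fall back to a fresh dir otherwise.

```python
import os

def ensure_workspace(workspace_root, legacy_root, username):
    """Hypothetical sketch: link to the pre-Cowbird data dir if present,
    otherwise create a fresh, empty workspace."""
    target = os.path.join(workspace_root, username)
    if os.path.lexists(target):
        return target  # Cowbird (or an admin) already set it up
    legacy = os.path.join(legacy_root, username)
    if os.path.isdir(legacy):
        os.symlink(legacy, target)  # existing notebooks stay visible
    else:
        os.makedirs(target)
    return target
```

The symlink branch would also address the "writable-workspace content seems to have disappeared" problem below, since the new mount would resolve to the old `/data/jupyterhub_user_data/$USER` content.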
## Summary
Activating Cowbird with existing Jupyter users has many roadblocks. This is in contrast with the usual "just enable the new component in `env.local` and it should play nice with all existing components" message we are trying to convey in the stack. A migration guide for systems with existing Jupyter users would have been helpful.

Below are the various problems I have faced so far and any workaround I was able to find. I will add more to this list as I try out Cowbird.
## Details
**For each existing Jupyter user, `/data/user_workspaces/$USER` has to be manually created.** Otherwise this error appears in `docker logs jupyterhub`:

`[E 2024-01-16 15:30:36.478 JupyterHub user:884] Unhandled error starting lvu's server: The user lvu's workspace doesn't exist in the workspace directory, but should have been created by Cowbird already.`
**Conflict with the existing poor man's public share.**
If the poor man's public share in https://github.com/bird-house/birdhouse-deploy/blob/13645f324c1bcef3decd91ba8a5462862b1e8d5a/birdhouse/env.local.example#L377-L425 is enabled, then we have to set `PUBLIC_WORKSPACE_WPS_OUTPUTS_SUBDIR` in `env.local` to a different value than `public`. Otherwise this error occurs when spawning a new JupyterLab server:

`Spawn failed: 500 Server Error for http+docker://localhost/v1.43/containers/2239816099ea7b8bf440b76fc0a1d4a43248bb1e5073fc043ef1c1062cdd3cff/start: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/data/user_workspaces/public/wps_outputs" to rootfs at "/notebook_dir/public/wps_outputs": mkdir /pvcs1/var-lib/docker/overlay2/ec7672b5d034e55d21465dd1e41c0333e0c5db2adb2dcec9f0f2a37bb968fe10/merged/notebook_dir/public/wps_outputs: read-only file system: unknown")`

See https://github.com/bird-house/birdhouse-deploy/issues/392#issuecomment-1950252054
**Content of `/notebook_dir/writable-workspace` for all existing Jupyter users seems to have disappeared.** This is because, without Cowbird enabled, `/notebook_dir/writable-workspace` is bound to `/data/jupyterhub_user_data/$USER`. But with Cowbird enabled, `/notebook_dir/writable-workspace` is bound to `/data/user_workspaces/$USER`, which is a new, empty dir. No workaround found so far.
## To Reproduce

Steps to reproduce the behavior:

1. Start from a version before `2.0.0`, with existing Jupyter users.
2. Enable the poor man's public share in `env.local` by uncommenting this section: https://github.com/bird-house/birdhouse-deploy/blob/13645f324c1bcef3decd91ba8a5462862b1e8d5a/birdhouse/env.local.example#L377-L425, and have existing user content under `writable-workspace`.
3. Upgrade to `2.0.0`, where Cowbird is enabled by default.
4. Enable JupyterHub in `env.local`, ex: `./components/jupyterhub`.
## Environment
## Concerned Organizations
@fmigneault @ChaamC @Nazim-crim @mishaschwartz @eyvorchuk