bird-house / birdhouse-deploy

Scripts and configurations to deploy the various birds and servers required for a full-fledged production platform
https://birdhouse-deploy.readthedocs.io/en/latest/
Apache License 2.0
4 stars 7 forks source link

:bug: [BUG]: magpie allows special characters in usernames that mess up jupyterhub #393

Closed mishaschwartz closed 1 year ago

mishaschwartz commented 1 year ago

Summary

Jupyterhub creates jupyterlab docker containers named jupyter-{username} where {username} is the user name of the logged in user.

Magpie permits usernames with . in them but jupyterhub will substitute the . with -2e when creating the jupyterlab container. This means that a user with the name example.user will have a container named jupyter-example-2euser and the username according to jupyterlab will be example-2euser.

This causes problems since we assume that these names will align.

Details

When spawning a new jupyterlab container, it fails with the following traceback:

Traceback (most recent call last):
  File "/opt/conda/envs/birdy/bin/jupyterhub-singleuser", line 10, in <module>
    sys.exit(main())
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyterhub/singleuser/app.py", line 68, in main
    return SingleUserNotebookApp.launch_instance()
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyter_core/application.py", line 277, in launch_instance
    return super().launch_instance(argv=argv, **kwargs)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/traitlets/config/application.py", line 1040, in launch_instance
    app.initialize(argv)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 971, in initialize
    result = super().initialize(*args, **kwargs)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyterhub/singleuser/mixins.py", line 642, in initialize
    super().initialize(argv)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/traitlets/config/application.py", line 113, in inner
    return method(app, *args, **kwargs)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 2461, in initialize
    self.init_configurables()
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 1864, in init_configurables
    connection_dir=self.runtime_dir,
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/traitlets/traitlets.py", line 703, in __get__
    return self.get(obj, cls)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/traitlets/traitlets.py", line 659, in get
    default = obj.trait_defaults(self.name)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/traitlets/traitlets.py", line 1871, in trait_defaults
    return self._get_trait_default_generator(names[0])(self)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyter_core/application.py", line 107, in _runtime_dir_default
    ensure_dir_exists(rd, mode=0o700)
  File "/opt/conda/envs/birdy/lib/python3.9/site-packages/jupyter_core/utils/__init__.py", line 25, in ensure_dir_exists
    os.makedirs(path, mode=mode)
  File "/opt/conda/envs/birdy/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/envs/birdy/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/envs/birdy/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 1 more time]
  File "/opt/conda/envs/birdy/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/notebook_dir/writable-workspace/.home'

I'm not entirely sure why the issue is a permission denied error but the cause is obviously due to an unexpected mismatch between the expected username and the username according to the container.

One possible solution would be to further restrict the valid usernames in magpie, either permanently or only when jupyterhub is enabled.

To Reproduce

  1. start birdhouse deploy with the magpie and jupyterhub components enabled
  2. create a user with a . in their name in magpie (eg: user.name)
  3. log in to jupyterhub as that user and spawn a new jupyterlab server
  4. inspect the jupyterhub logs for the error message: docker logs -f jupyter-user.name

Environment

Information Value
Server/Platform URL daccs.cs.toronto.edu
Version Tag/Commit 1.35.0
Related issues/PR
Related components jupyterhub, magpie
Custom configuration
docker version Docker version 24.0.2, build cb74dfc

Concerned Organizations

tlvu commented 1 year ago

Dupe of https://github.com/Ouranosinc/Magpie/issues/497. I have had this problem before so the current work-around is to manually remember to only use alphanumeric char in a username.

Being able to supply a username validation regex to Magpie would enforce this.

mishaschwartz commented 1 year ago

Dupe of https://github.com/Ouranosinc/Magpie/issues/497. I have had this problem before so the current work-around is to manually remember to only use alphanumeric char in a username.

Cool! Thanks I must have missed that one. I don't think that manually remembering to only use alphanum characters is a sustainable solution, especially once other organizations start to deploy the stack.

Being able to supply a username validation regex to Magpie would enforce this.

This is a good idea. I can work on this unless someone at CRIM has the capacity right now?

mishaschwartz commented 1 year ago

I'm going to suggest we don't close this (even though its a duplicate of the Magpie issue) since the solution will end up being a change to Magpie as well as a configuration change to the ini files in this repository as well.

fmigneault commented 1 year ago

I think the safest approach would be to use the user_id rather than user_name. This way, we never have to worry about either of Jupyter's or Magpie's allowed characters.

mishaschwartz commented 1 year ago

@fmigneault

Can you please clarify what you mean.

Where are we using user_id instead of user_name (which component)? What configuration files/variables do we need to change to do that?

tlvu commented 1 year ago

Where are we using user_id instead of user_name (which component)? What configuration files/variables do we need to change to do that? @mishaschwartz

I think @fmigneault means using the user_id in the container name (jupyter-123 instead of jupyter-NAMEHERE) and the folder on disk (/data/jupyterhub_user_data/123 instead of /data/jupyter_user_data/NAMEHERE).

For me this is very user unfriendly and error prone. Often we have to chase down which user is using too much resources and currently it's very easy, the user name is directly in the container name.

Also we have to keep an eye on the disk space usage for each user so again having the user name directly in the folder path make this much easier, since we have a lot of users.

When deleting the user, it's also easy to find and wipe the corresponding persisted data on disk since the user name is in the path name to wipe. With user id, there is a risk we could wipe the wrong data on disk.

mishaschwartz commented 1 year ago

Thanks @tlvu that makes sense.

I've added a proposed change to Magpie here (https://github.com/Ouranosinc/Magpie/pull/592) that should allow us to optionally restrict the user names.

If that looks good and gets pulled in, I can make a PR here to update the magpie.ini file.

fmigneault commented 1 year ago

For me this is very user unfriendly and error prone. Often we have to chase down which user is using too much resources and currently it's very easy, the user name is directly in the container name.

Also we have to keep an eye on the disk space usage for each user so again having the user name directly in the folder path make this much easier, since we have a lot of users.

I got an idea regarding that. Would it be possible to have system link in the form /data/jupyterhub_user_data/{user_id} -> /data/jupyterhub_user_data/{user_name}?

This way, a custom renaming/regex logic could be applied to obtain redirects for convenience, but real references would use the generic user IDs? You would have a extra step to check those links to figure out the mapping, but it would still be available somewhere.

The issue I forsee with custom regexes such as in https://github.com/Ouranosinc/Magpie/pull/592 is that the similar logic will have to be replicated everywhere that needs to resolve the directories. This means, all Weaver WPS output hooks, all Cowbird handlers, and so on. I think it would be wiser to abstract away that logic using integers.

I think https://github.com/Ouranosinc/Magpie/pull/592 can still be valid on its own. As an admin, you could ask to restrict certain user names, or even enforce some name conventions. However, I am not in favor of using this as a side-effect that assumes it would resolve all issues elsewhere. It feels more like a hack than a solution.

mishaschwartz commented 1 year ago

The issue I forsee with custom regexes such as in https://github.com/Ouranosinc/Magpie/pull/592 is that the similar logic will have to be replicated everywhere that needs to resolve the directories. This means, all Weaver WPS output hooks, all Cowbird handlers, and so on. I think it would be wiser to abstract away that logic using integers.

I'm not sure I understand this.

Right now, weaver WPS output hooks, cowbird handlers, etc. all use the username that is defined in Magpie to create directories. All we're doing is restricting the naming convention for those usernames. I don't understand how this affects weaver or cowbird.

fmigneault commented 1 year ago

I guess you're right. I was overthinking it. Usernames would be correct from upstream validation in Magpie. Just need to add a note in the changes that manual retrofit of all existing user names that do not conform to new naming convention will have to be done.

tlvu commented 1 year ago

I got an idea regarding that. Would it be possible to have system link in the form /data/jupyterhub_user_data/{user_id} -> /data/jupyterhub_user_data/{user_name}?

This means at deletion time, we will still need to match the user_id to user_name to delete both of them, still error-prone to delete the wrong not-matching pair.

Also the user_name is created in Magpie. So if the restriction is done at Magpie level, everything else that use the user_name will be automatically in sync. There are no logic about user_name validation to replicate.

add a note in the changes that manual retrofit of all existing user names that do not conform to new naming convention

I understand that the new optional validation regex only take effect for new user creation. I won't affect any existing users. So what "manual retrofit" are you referring to? Do you mean a DB migration of existing users? That should not happen because we will be changing their login-id and they won't be able to login anymore to Jupyter.

In fact, if they are already able to login to Jupyter, that means their naming convention is already matching the good one. Going forward we are simply able to properly enforce this naming convention.

fmigneault commented 1 year ago

Do you mean a DB migration of existing users?

Exactly. If the new regex is applied, and the function used to validate user properties is called, previously valid usernames would now cause API/UI errors. If their usernames were already patched to make them work Jupyter, then there shouldn't be any issue.

mishaschwartz commented 1 year ago

If the new regex is applied, and the function used to validate user properties is called, previously valid usernames would now cause API/UI errors.

The change was made to the check_user_info function which is only called in Magpie when a user is being created or updated. Magpie will still work as before for old users with invalid usernames, they just won't be able to log in to jupyterhub (but they couldn't do that before).

I don't think there will be any old users with invalid names to migrate. If there were old users with usernames that broke jupyterhub you would know already because they would be complaining that they can't use jupyterhub.

tlvu commented 1 year ago

For legacy reasons, we have Magpie users before the time of Jupyter so these users do not use Jupyter so no warranty that their usernames conform to the naming convention.

Is there a way to make this additional regex check apply only for new user creation but not when updating existing user (ex: reset their password because too long ago they forgot it or updating their email). Like the admin password 12 char length enforcement, it only trigger if we actually set a new password. If we set again but to the same existing password, the enforcement is bypassed.

Else can you document the DB migration steps we can run manually only for a very small list of problematic users?

fmigneault commented 1 year ago

Is there a way to make this additional regex check apply only for new user creation but not when updating existing user

It is the same function that performs the check in both cases. https://github.com/Ouranosinc/Magpie/blob/92ff2d2fa1e62b69a3a5edf22ef65d61238eaba9/magpie/api/management/user/user_utils.py#L124 https://github.com/Ouranosinc/Magpie/blob/92ff2d2fa1e62b69a3a5edf22ef65d61238eaba9/magpie/api/management/user/user_utils.py#L241

However, unless the user name is updated, the email/password should be checked on their own without revalidation of the user name.

tlvu commented 1 year ago

However, unless the user name is updated, the email/password should be checked on their own without revalidation of the user name.

Super, so no need to DB migrate the existing users with "bad" username format.

Users sometime forget their password so we simply set a new one. We never had to change a username before.

mishaschwartz commented 1 year ago

@fmigneault Thanks for merging in https://github.com/Ouranosinc/Magpie/pull/592

Unfortunately, while testing the magpie/jupyterhub interface further it looks like the fact that jupyterhub also converts all uppercase characters to lowercase is a problem.

Magpie currently only does case insensitive regex comparisons so there's no way to restrict usernames to lowercase only. In order to allow us to enforce this, I've added another PR to Magpie here (https://github.com/Ouranosinc/Magpie/pull/594). Please have a look when you have a minute