pangeo-data / pangeo-cloud-federation

Deployment automation for Pangeo JupyterHubs on AWS, Google, and Azure
https://pangeo.io/cloud.html

GitHub users with a hyphen (-) in their names get duplicate home directories on AWS hubs #405

Open scottyhq opened 4 years ago

scottyhq commented 4 years ago

We noticed at geohackweek that GitHub users with a hyphen in their username are getting two home directories behind the scenes on our AWS EFS drive. I don't have time to dig into exactly where this occurs in our NFS setup, but we noticed it because launching a dask cluster produces the expected home directory name, while a second directory appears with 2d inserted after the hyphen in the username.

For example, GitHub user abar-bg gets both nasa.pangeo.io/home/abar-bg and nasa.pangeo.io/home/abar-2dbg on the EFS drive.

It seems that JupyterHub and dask map /home/jovyan to nasa.pangeo.io/home/${JUPYTERHUB_USER} differently.

https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/deployments/nasa/image/binder/dask_config.yaml#L41

https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/deployments/nasa/config/common.yaml#L42
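A minimal sketch of how the two directory names could diverge. It assumes one side substitutes the raw username into the path while the other first escapes every character outside `a-z0-9` as `-` plus the byte's hex code. The function `escape_username` is a hand-rolled, hypothetical approximation for illustration, not KubeSpawner's actual code:

```python
import string

# Assumption: only lowercase ASCII letters and digits pass through unescaped.
SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def escape_username(name: str, escape_char: str = "-") -> str:
    """Hypothetical escaper: replace each unsafe character with
    escape_char followed by the two-digit hex code of each UTF-8 byte."""
    out = []
    for ch in name:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            out.extend(f"{escape_char}{b:02x}" for b in ch.encode("utf-8"))
    return "".join(out)


raw = "abar-bg"
escaped = escape_username(raw)

# One component uses the raw name, the other the escaped one,
# so the same user ends up with two directories.
print(f"nasa.pangeo.io/home/{raw}")      # nasa.pangeo.io/home/abar-bg
print(f"nasa.pangeo.io/home/{escaped}")  # nasa.pangeo.io/home/abar-2dbg
```

The hyphen is byte 0x2d, which would explain the `2d` that shows up after the hyphen. Usernames made only of lowercase letters and digits come out unchanged under this scheme, so they would not be affected.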

Also related, and maybe not yet documented elsewhere: we've noticed that GitHub repos with hyphens in their names fail to launch on AWS binder.

Maybe related: https://github.com/jupyterhub/kubespawner/issues/324

jhamman commented 4 years ago

@yuvipanda may know best here but I think this is actually a feature of KubeSpawner. I say that because escaping the hyphen is actually part of the test suite: https://github.com/jupyterhub/kubespawner/blob/8a6d66e04768565c0fc56c790a5fc42bfee634ec/tests/test_spawner.py#L292

The issue you linked to above does suggest removing this behavior, so perhaps we should chime in there.

yuvipanda commented 4 years ago

https://github.com/jupyterhub/kubespawner/pull/329#discussion_r305745331 explains why we do it in kubespawner, and probably why dask should too!
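As general background, and not a summary of the linked comment: one property of this style of escaping is that it is reversible, so distinct usernames cannot collapse onto the same name the way they can when unsafe characters are simply stripped. A rough sketch, where `escape_username`, `unescape_username`, and `strip_unsafe` are all hypothetical helpers written for this illustration:

```python
import string

SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def escape_username(name: str, escape_char: str = "-") -> str:
    """Hypothetical escaper: unsafe chars become escape_char + hex byte."""
    return "".join(
        ch if ch in SAFE_CHARS
        else "".join(f"{escape_char}{b:02x}" for b in ch.encode("utf-8"))
        for ch in name
    )


def unescape_username(escaped: str, escape_char: str = "-") -> str:
    """Invert escape_username by decoding each escape_char + two hex digits."""
    buf = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == escape_char:
            buf.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            buf.extend(escaped[i].encode("ascii"))
            i += 1
    return buf.decode("utf-8")


def strip_unsafe(name: str) -> str:
    """Naive alternative: drop every unsafe character."""
    return "".join(ch for ch in name if ch in SAFE_CHARS)


# Stripping merges two different users into one name...
print(strip_unsafe("abar-bg") == strip_unsafe("abarbg"))        # True
# ...while escaping keeps them apart and can be undone.
print(escape_username("abar-bg") == escape_username("abarbg"))  # False
print(unescape_username(escape_username("abar-bg")))            # abar-bg
```

If dask applied the same escaping when building the home directory path, both components would agree on a single directory per user.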

scottyhq commented 4 years ago

Thanks @jhamman and @yuvipanda for the pointers! I'm looping in @TomAugspurger and @jacobtomlinson b/c this is relevant to dask-kubernetes