det-lab / jupyterhub-deploy-kubernetes-jetstream

CDMS JupyterHub deployment on XSEDE Jetstream

Setup authentication #18

Closed zonca closed 4 years ago

zonca commented 4 years ago

After #17 is done, let's set up authentication. @pibion, what are your plans for that?

Do we want to use GitHub accounts, or does CDMS use a third-party authenticator?

pibion commented 4 years ago

@zonca would GitLab be possible? If not, then GitHub could also work.

Our basic requirements are

  1. Anyone needs to be able to authenticate (so that when we run workshops non-CDMS members can use the resource)
  2. But we need a way to identify CDMS members and give them special privileges when using the data tools

We can add CDMS members to a GitHub or GitLab organization when they join. (We already do this to a limited extent.)

As far as I know we don't use any third-party authenticator.

zonca commented 4 years ago

Yes, OAuthenticator supports GitLab. What organization do you plan to use? Can you add zonca to it?

Then we need to make a more detailed plan about permissions.

Login

First, it seems too broad to allow anyone with a GitLab account to log in. My proposal would be:

Special permissions

At the JupyterHub level we only have 2 types of users, admins and general users. Maybe we can make CDMS users admins, but first I need more details: what exactly do you mean by accessing the data tools? Is it read/write permissions on some shared disk, or software?

pibion commented 4 years ago

We're currently using the SuperCDMS organization - I've sent an invite to you.

Your suggestion for a workshop_participants group should be manageable. I'll get something set up and report back.
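The two-tier policy discussed so far (anyone in an approved group can log in, CDMS members get extra privileges) could be sketched roughly as below. This is only an illustration of the decision logic, not the JupyterHub or OAuthenticator API: the function and class names are hypothetical, and the group names are placeholders taken from this thread that may not match the real GitLab group paths.

```python
from dataclasses import dataclass

# Placeholder group names from this thread; the real GitLab paths may differ.
MEMBER_GROUP = "SuperCDMS"
WORKSHOP_GROUP = "workshop_participants"


@dataclass(frozen=True)
class Access:
    """Result of the policy check for one user (hypothetical type)."""
    can_login: bool
    is_member: bool


def classify(user_groups):
    """Map a user's GitLab group memberships to the two access levels."""
    groups = set(user_groups)
    is_member = MEMBER_GROUP in groups
    # Members can always log in; workshop participants can log in
    # but do not get member privileges.
    can_login = is_member or WORKSHOP_GROUP in groups
    return Access(can_login=can_login, is_member=is_member)


if __name__ == "__main__":
    for groups in (["SuperCDMS"], ["workshop_participants"], []):
        print(groups, classify(groups))
```

Keeping the check as a pure function of the group list makes it easy to reuse wherever the group memberships end up being available.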

We have a Python module (@bloer do you know its name?) that queries a "data catalog" database and fetches data. Right now this requires no authentication and we rely on "it's hard to find" to keep our data from becoming public.

We'd like to make tutorial material that uses the same tools, but that means we'll need to change this tool to require some kind of credentials. We can set permissions on the directories at SLAC so that some are public (like data for tutorials) but others require group membership. Tina is the expert on what options we have to give different parts of our data set different permissions, I've asked her to weigh in.

bloer commented 4 years ago

The package is CDMSDataCatalog. Authentication is handled by means of a "secret" key in a config file that's shipped with the package.

If we wanted to be more secure, we could remove that file from the package and tell CDMS people they have to download the file to their home directories before they can use the catalog. I'm not a huge fan of adding extra steps to make everything work though.

If there is an easy way to identify non-cdms tutorial users, the data catalog client could inspect the user's group or something like that and select a different config file with more restricted permissions (and maybe even a different download path). Alternatively we could encrypt the "secure" config file, have the CDMSDataCatalog take an optional password in the constructor, and otherwise pick the default insecure config. Then we just have to make sure no one posts a notebook with the password written in it...
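The first option above (pick a config file based on the user's group) might look something like this minimal sketch. It is not CDMSDataCatalog code: the function name, the config file names, and the way the group list is obtained are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed group name; see the organization mentioned earlier in the thread.
MEMBER_GROUP = "SuperCDMS"

# Hypothetical file names, not the ones shipped with CDMSDataCatalog.
FULL_CONFIG = "catalog_full.cfg"
PUBLIC_CONFIG = "catalog_public.cfg"


def select_config(user_groups, config_dir=Path(".")):
    """Return the config path to load for a user with the given groups.

    Members get the full-access config; everyone else falls back to the
    restricted config that only exposes tutorial data.
    """
    if MEMBER_GROUP in set(user_groups):
        return Path(config_dir) / FULL_CONFIG
    return Path(config_dir) / PUBLIC_CONFIG


if __name__ == "__main__":
    print(select_config(["SuperCDMS"]))
    print(select_config(["workshop_participants"]))
```

Defaulting to the restricted config means a user whose groups cannot be determined never gets more access than a tutorial participant. Note that this only selects a file; the full-access config still has to be unreadable to non-members for the scheme to protect anything.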

I'm not a huge fan of any of these approaches, so other suggestions are welcome.

bloer commented 4 years ago

Hmm...could we use ORCID for authentication? They support OAuth2. And everyone in CDMS should get an ORCID if they don't have one already

pibion commented 4 years ago

@bloer does ORCID support any kind of "group" association? If they do then I agree, requiring everyone to get an ORCID ID isn't "extra" work.

bloer commented 4 years ago

@pibion It doesn't look like there's an easy way to do that. So we'd have to keep a whitelist of member IDs
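For reference, a hand-maintained allowlist of ORCID iDs could be checked along these lines. This is a sketch under assumptions: the function names are hypothetical, and the single iD in the example is ORCID's well-known fictitious sample iD, not a CDMS member. The checksum step uses the ISO 7064 MOD 11-2 check digit that ORCID iDs carry, which helps catch typos in a manually edited list.

```python
ORCID_PREFIX = "https://orcid.org/"


def normalize_orcid(value):
    """Strip whitespace and an optional URL prefix, and uppercase a trailing x."""
    value = value.strip()
    if value.lower().startswith(ORCID_PREFIX):
        value = value[len(ORCID_PREFIX):]
    return value.upper()


def has_valid_checksum(orcid):
    """Verify the ISO 7064 MOD 11-2 check digit of a normalized ORCID iD."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    total = 0
    for ch in chars[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[15] == expected


def is_member(orcid, allowlist):
    """True if the iD is well-formed and present in the allowlist set."""
    orcid = normalize_orcid(orcid)
    return has_valid_checksum(orcid) and orcid in allowlist


if __name__ == "__main__":
    members = {"0000-0002-1825-0097"}  # placeholder: ORCID's sample iD
    print(is_member("https://orcid.org/0000-0002-1825-0097", members))
```

The list itself would still need to be updated whenever someone joins or leaves, which is the maintenance cost raised in this thread.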

pibion commented 4 years ago

@bloer I'd like to avoid adding list maintenance unless it's absolutely necessary.

I'd recommend using GitLab. People who aren't analyzing won't need to create this account. And people who are analyzing need a GitHub/GitLab account anyway, for their thesis if nothing else.

zonca commented 4 years ago

OK, I am working on setting up GitLab. Unfortunately, we are affected by this issue:

https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/1551

So we need to request "Write" API access just to authenticate people via GitLab. Not ideal...

Anyway, for now I'll try to see if I can get it working.

zonca commented 4 years ago

It works with the new release of zero-to-jupyterhub, see #20. We still have the issue of requesting more permissions than we need. I will investigate further.

Current permissions in the GitLab app:

[image: GitLab app permissions]

zonca commented 4 years ago

Authentication is now implemented using GitLab. No advanced features for members are implemented yet; we may need to find another method for that, since handling 2 levels of permissions via JupyterHub looks complicated.