nextpyp / next

Web-interface for nextPYP
https://nextpyp.app/
BSD 3-Clause "New" or "Revised" License

[FEATURE REQUEST]: The ability to use an external identity provider instead of the MongoDB database #1

Open jpellman opened 10 months ago

jpellman commented 10 months ago

Hi nextPYP Team,

I'm a systems administrator at an institution (NYSBC/SEMC) that is interested in using nextPYP with some regularity. We recently performed a test installation of nextPYP and came across a few architectural decisions that make it particularly unwieldy from an operational perspective.

One particular issue that we have with the software design is that SLURM jobs run as a service account instead of as the user who actually submitted the job. The main problem with this setup is that SLURM comes with an accounting system that we can use to gather statistics about usage. This accounting system isn't just used for reporting purposes, however; the database where job metrics are kept can also influence scheduling decisions (via fairshare) and resource limits. When all users run jobs as the same service account, this effectively blurs usage patterns from all users together and makes it impossible to use SLURM's accounting functionality in any meaningful way.

At a small institution such as NYSBC, not being able to fully leverage the accounting functionality would not be too big of a deal, but at larger institutions (universities, the NIH, etc.) this would likely be a bit of a showstopper. For some individual labs within a larger university, a potential workaround could be to use an individual grad student's or postdoc's account as the service account, but this would come with the caveat that the individual's fairshare factor would be affected and their jobs would be unjustly deprioritized by the activities of their colleagues.

Ideally, instead of using a service account, researchers would be able to log in as themselves using the same identity provider as the SLURM cluster. In this way, the identity provider and authentication functionality (which currently seems to be handled by MongoDB entries) would be decoupled from authorization. One could, for instance, use LDAP as an identity provider and store authorization information (i.e., which identities can access which projects) in the MongoDB database.
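To make the split concrete, here is a minimal Python sketch of authentication and authorization living in separate components. Everything in it is hypothetical and not taken from nextPYP: the `Authenticator` type, the `ProjectAccess` class, `open_project`, and `fake_directory` are illustrative names, the in-memory dict merely stands in for the MongoDB database, and `fake_directory` stands in for a real LDAP or PAM lookup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# Hypothetical type: any callable that checks credentials against an external
# identity provider (LDAP, PAM, ...) and returns True on success.
Authenticator = Callable[[str, str], bool]


@dataclass
class ProjectAccess:
    """Authorization records: which user IDs may open which projects.

    An in-memory dict stands in for the application database here, purely to
    keep the sketch self-contained.
    """
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, userid: str, project: str) -> None:
        self.grants.setdefault(project, set()).add(userid)

    def can_access(self, userid: str, project: str) -> bool:
        return userid in self.grants.get(project, set())


def open_project(authenticate: Authenticator, access: ProjectAccess,
                 userid: str, password: str, project: str) -> str:
    # Step 1: identity is verified by the external provider only.
    if not authenticate(userid, password):
        return "denied: authentication failed"
    # Step 2: permissions are looked up in the application's own store.
    if not access.can_access(userid, project):
        return "denied: not authorized for project"
    return "ok"


def fake_directory(userid: str, password: str) -> bool:
    # Stand-in for a real identity provider; never hard-code credentials.
    return {"alice": "s3cret"}.get(userid) == password


access = ProjectAccess()
access.allow("alice", "ribosome")
print(open_project(fake_directory, access, "alice", "s3cret", "ribosome"))
print(open_project(fake_directory, access, "alice", "s3cret", "spike"))
```

Because `open_project` only depends on the `Authenticator` signature, swapping the identity provider would not touch the stored project permissions.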

A concrete example of this can be seen in how JupyterHub currently handles authentication (see here), where there are multiple authenticators that can be used with different identity providers. Multiple authenticator classes are probably overkill for nextPYP, but at the same time I feel like it would be useful to at least be able to offload authentication onto PAM.

I don't know how difficult what I described would be to implement on your end, but would be happy to chat some more if you find this feedback useful (or alternatively, if you would like some clarification about what I'm describing / asking for).

--John

cdienem commented 7 months ago

Hi nextPYP Team,

I want to second John's feature request for better identity management. We would run into the exact same issues with SLURM accounting as he described.

Additionally, the current implementation, in which a single service account runs all processing, requires that the nextPYP service account have access to all files on the file systems used for storing data and results. This is incompatible with user/owner permission rules that prevent others from accidentally deleting data. Loosening these rules is highly undesirable in our case.

I hope you find a solution, because this is a real problem once you want to use the software beyond a single workstation.

Best and thanks, Chris

cuchaz commented 3 months ago

Hi there. Thanks for the feedback! Sorry it took so long for us to get back to you, but for some reason, issue notifications weren't turned on for this repo, so we're just now seeing this issue.

We've heard feedback like this from lots of different sources now (including paper reviewers!) so hopefully you'll be happy to hear we've been working on a solution to it for a while now.

One part of the solution has actually been implemented for a long time, which is getting user accounts from some external authority. The website component of nextPYP supports receiving account information from HTTP headers included by a reverse proxy. We use this internally at Duke to support our SSO system. Other organizations could use this authentication mechanism too, but we haven't documented it much yet, so it's probably not very well known.

Implementing SSO for your organization should be fairly simple though. You just need an HTTP reverse proxy that implements your authentication method of choice and then sends an extra nextpyp-userid HTTP header to the nextPYP website along with every request. If other people have interest in this approach, we can write up more official documentation and help people with integrations, since the feature is already available today.
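As a rough illustration of the proxy side, the Python sketch below shows only the header-handling step; it is not official nextPYP documentation. The `nextpyp-userid` header name comes from the description above, while the `upstream_headers` function and its behavior (dropping any client-supplied copy of the header so a user cannot impersonate someone else) are assumptions about how a careful proxy would likely be written.

```python
from typing import Dict, Mapping

USERID_HEADER = "nextpyp-userid"  # header name as described in this comment


def upstream_headers(client_headers: Mapping[str, str],
                     authenticated_user: str) -> Dict[str, str]:
    """Build the headers a proxy forwards to the website (hypothetical helper).

    `authenticated_user` must come from the proxy's own authentication step,
    never from anything the client sent.
    """
    if not authenticated_user or any(c in authenticated_user for c in "\r\n"):
        raise ValueError("invalid user id")

    # HTTP header names are case-insensitive, so compare in lowercase and
    # drop any user id header supplied by the client.
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != USERID_HEADER
    }
    forwarded[USERID_HEADER] = authenticated_user
    return forwarded


incoming = {"Accept": "text/html", "NextPYP-UserID": "someone-else"}
print(upstream_headers(incoming, "alice"))
```

With a setup like this, the website should only be reachable through the proxy; otherwise a client could connect directly and set the header itself.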

The second part of our solution is still under development, but we're finally starting to see some successes with it internally. We've devised a way for our website service account to read/write files and submit SLURM jobs on behalf of different users, securely. Our security design doesn't have the website process running as a privileged account, so this is much trickier for us to achieve than the usual techniques that involve privileged accounts and tools like sudo, but we're hoping our system will be very secure and hard to misuse this way.

There's a small bit of setup involved for each user, but the end result is that SLURM jobs appear as if they were submitted by a real user account rather than the service account. And the files produced by each job are owned by the real user account rather than the service account. This should help with storage quota systems and file access isolation as well as SLURM accounting.

We're planning on putting this feature into the next release, so you should hopefully be able to use this new system soon.