Cluster Users - Githubissues

Background

Through v1.7.8 or so, Pachyderm has had a very relaxed model of users: there is no master list of a cluster's users, and any GitHub account is able to log in to any Pachyderm cluster (though arbitrary GitHub accounts aren't able to access anything).

There are several long-requested Pachyderm features for which we need to expand our auth model with respect to users (see list below).

Objectives

This list is still speculative and will be culled as we do more work on this issue.

[ ] Display a Pachyderm cluster's list of users
- Why: Clients pay per user, so they should be able to see how they're using their budget.
[ ] Display how "active" a user is
- Why: If an account isn't consistently using a Pachyderm cluster, then it shouldn't count toward the client's user count.
[ ] Revoke a user's access to an entire cluster/all resources in a cluster
- Why: For clusters using GitHub-based auth, an admin who stops trusting a user currently has to remove that user manually from every resource to which they have access. This feature would make that situation easier to handle.
[ ] Auto-complete in the ACL modification UI
- Why: I think this is a particularly valuable stretch goal. It's very easy to mistype a user's name and fail to grant them access to some protected resource (or worse, grant the wrong person access). By giving users auto-complete, we'll avoid a lot of confusion and complaints stemming from invalid ACLs.
[ ] Display a profile page for Pachyderm users when a user icon is clicked (rather than linking to a GitHub profile, which may not exist for SAML users or robot users). Ideally this will show the resources to which a user has access and the user's group memberships.
- Why: For auditing, primarily.
[ ] In the case of GitHub clusters, admins should be able to prevent untrusted accounts from signing in to their cluster
- Why: If a cluster's DAG itself is sensitive, then this feature will allow cluster admins to use GitHub authentication without taking the risk that untrusted users will see their DAG. Note that this is similar to the third objective above (revoking access): if a user becomes "untrusted", then they should lose the ability to see a cluster's DAG, in addition to losing access to the cluster's data.

Design

We'll maintain a new collection of users. When a user signs in to a cluster successfully, they'll be added to the collection. When they perform an action, we'll record it, which meets the second objective above (knowing how active a user is); this may also converge with https://github.com/pachyderm/pachyderm/issues/3069 / https://github.com/pachyderm/pachyderm/issues/2468. To support auto-complete, we'll expose an API for listing the users in a cluster. For both revocation and the user profile page, we'll need some kind of reverse index mapping each user to the set of cluster resources to which they have access.

pachyderm / pachyderm

Cluster Users #3157
