pachyderm / pachyderm

Data-Centric Pipelines and Data Versioning
https://www.pachyderm.com/
Apache License 2.0

Cluster Users #3157

Open msteffen opened 6 years ago

msteffen commented 6 years ago

Background

Through v1.7.8 or so, Pachyderm has had a very relaxed model of users: there is no master list of a cluster's users, and any GitHub account is able to log in to any Pachyderm cluster (though arbitrary GitHub accounts aren't able to access anything).

There are several long-requested Pachyderm features for which we need to expand our auth model with respect to users (see list below).

Objectives

This list is still speculative and will be culled as we do more work on this issue.

Design

We'll maintain a new collection of users. When a user signs in to a cluster successfully, they'll be added to the collection. When they perform an action, we'll record it (to meet goal #2 above, of knowing how active a user is; this may also converge with https://github.com/pachyderm/pachyderm/issues/3069 / https://github.com/pachyderm/pachyderm/issues/2468). To meet the autocomplete goal, we'll have an API for listing the users in a cluster. For both revocation and the user page, we'll need some kind of reverse index mapping each user to the set of cluster resources to which they have access.

TKCen commented 5 years ago

Did you consider shifting authz decisions to OPA? This would allow users to define access models that fit their environment and policies. As OPA itself is document-based and integrates nicely with k8s, there are further integration possibilities, e.g.

This could also simplify the code overall, as authz decisions would be made in a sidecar service.