fhennig closed this issue 2 weeks ago
Some notes:
It's a good idea to spike the policy stuff in pure Rego. We can use policy testing to test authorizer input.
For the multi-tenancy issue, I had the idea that we could use an `allow` rule in a cluster-specific Rego package that then defers to a more generic package. The cluster-specific package can then attach some context information about which cluster the request came from. For example:
```rego
package myDruid

import rego.v1

import data.druid

allow if {
    druid.allow with input as {
        "product": "druid",
        "cluster": { # the name and labels are taken from the Kubernetes metadata
            "name": "my-druid",
            "labels": {
                "env": "dev"
            }
        },
        "user": input.user,
        "action": {
            "resource": {
                "type": concat("", ["druid-", lower(input.resource.type)]),
                "name": input.resource.name
            },
            "operation": lower(input.action)
        }
    }
}
```
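Such rules can be exercised with `opa test`. A sketch of a unit test, assuming a hypothetical user, resource, and generic `druid` rules that would grant this request:

```rego
package myDruid_test

import rego.v1

import data.myDruid

# Hypothetical test of the cluster-specific allow rule with a mocked
# authorizer input; whether it passes depends on the generic druid rules.
test_admin_can_read_datasource if {
    myDruid.allow with input as {
        "user": {"username": "alice", "groups": ["admins"]},
        "resource": {"type": "DATASOURCE", "name": "wikipedia"},
        "action": "READ"
    }
}
```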
The DruidCluster `my-druid` can then reference the `myDruid` OPA package, and there is also a generic `druid` package that handles auth requests from all Druid clusters.
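A minimal sketch of what that generic `druid` package could look like; the group name, label, and the rules themselves are assumptions, not shipped policies:

```rego
package druid

import rego.v1

default allow := false

# Assumed rule: members of the admins group may do anything on any cluster.
allow if {
    "admins" in input.user.groups
}

# Assumed rule: anyone may read on clusters labelled as dev environments,
# using the cluster labels attached by the cluster-specific package.
allow if {
    input.cluster.labels.env == "dev"
    input.action.operation == "read"
}
```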
Related ticket: https://github.com/stackabletech/opa-operator/issues/494
We now have production-ready Rego rules for Trino and HDFS, closing this :rocket:
As a user of the SDP, I want to be able to manage my authorization policies in a fairly simple, maintainable and flexible way.
Current state
Currently we offer the UserInfoFetcher as well as OPA authorizers for a few products, but we do not have any guidance on how to actually write policies.
Expected outcome
Outcomes could be RegoRule templates that we recommend to users as a starting point for their own rules, or a framework or library of RegoRules that we ship with the platform. We should also have a demo that showcases this. As we work on this, we should also gain more knowledge about how to write sensible rules for the products, and find out what common policy definitions might look like.
Step 1: Spikes, gather knowledge - plain OPA (no k8s)
We do not yet know enough about the products and their authorization models. We first want to spike some policies for each product to get a better understanding of how they all work, and afterwards see what we can abstract away. For now, we are starting with HDFS and Trino. We also want a demo scenario that we can use as a reference when thinking about authorization and what we need to model.
What should the Rego data structures look like? We want to go in with few preconceptions and think about what works best for each product. For example, for Trino we found it useful to let the user specify a data structure similar to Trino's file-based access control. The policies should support assigning access to individual users and groups, so users can model their organization in groups.
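As an illustration, policy data for Trino could be shaped like the rule lists of file-based access control; this is only a sketch, and the group, schema, and privilege values here are made up:

```rego
package trino

import rego.v1

# Hypothetical policy data modelled after Trino's file-based access control:
# an ordered list of table rules, matched top to bottom.
table_rules := [
    {
        "group": "analysts",
        "schema": "reporting",
        "table": ".*",
        "privileges": ["SELECT"]
    },
    {
        "user": "admin",
        "schema": ".*",
        "table": ".*",
        "privileges": ["SELECT", "INSERT", "DELETE", "OWNERSHIP"]
    }
]
```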
Questions that we should answer for each product:
For each product there is an OPA authorizer, and we know the input that we get from it. Policy definitions should simply be tested in pure Rego.
UserInfoFetcher - we do not want to use the UIF yet; we can simply mock the UIF API.
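Mocking the UIF API in pure Rego could use the `with` keyword to override `http.send` in a test. A sketch, where the response shape, the `data.policies` package, and the input fields are assumptions:

```rego
package userinfo_test

import rego.v1

# Hypothetical canned UserInfoFetcher response; the real call via
# http.send is replaced with this mock in the test below.
mock_user_info(_) := {"status_code": 200, "body": {"groups": ["/admins"]}}

test_allow_with_mocked_groups if {
    data.policies.allow
        with input as {"username": "alice", "action": "read"}
        with http.send as mock_user_info
}
```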
Intermediate Acceptance criteria
Step 2: Build a demo to showcase the rules (and other context: Kerberos, OpenID, UserInfoFetcher)
Step 3: Deployment on the customer side
For now, since we only have two rule sets and no abstraction layer, we want to keep the rules as something users can deploy by themselves, and not automate the deployment. We can come back to automated deployment once we build an abstraction layer.
However, the rules are still great starting points for customers, so we should publish them for users. We want to keep the source of truth in the kuttl tests and link to them from the documentation. There should be some explanatory documentation around the rules as well.
Follow-up work