gravitational / predicate-lang

Predicate - Access Control System
Apache License 2.0
3 stars 2 forks source link

Extending predicate's DSL #29

Open vitorenesduarte opened 1 year ago

vitorenesduarte commented 1 year ago

I'm opening this issue to try to define a set of rules/guidelines that should be followed when extending predicate. The goal is that in the end we have'll a tool that feels consistent, predictable, without many "gotchas" (ideally none).

This discussion could be about different aspects of predicate, but in what follows I'll be focusing solely on the syntax/DSL.

As an example, both #24 and #26 have proposals to extend predicate's DSL. Since there are multiple design options, without some guidelines we could very easily not make the most appropriate choice.


What's written below assumes, for simplicity, that policies are not attached to users, i.e. that they apply to all users.

Let me start by introducing some terms that will be used later throughout the text.

Can identity I do action A on resource R?

Given a set of allow & deny rules we should be able to answer the above question.

So allow & deny rules define implicitly or explicitly the following three:

predicate's DSL

Let's now look at how we're specifying these allow & deny rules currently.

For example, we would write the following in order to allow users in the admin team have access to env=prod nodes:

# allow/deny users in the admin team to access env=prod nodes
Node(
    (Node.labels["env"] == "prod") &
    (User.traits["team"].contains("admin"))
)

I believe that this can be confusing because there are two Nodes in this predicate, and they mean different things:

To be less confusing, we could instead write the following:

# allow/deny users in the admin team to access env=prod nodes
AccessNode(
    (User.traits["team"].contains("admin")) &
    (Node.labels["env"] == "prod")
)

Now we can clearly see:

Implicit identity and/or resource

Consider the following two examples.

Example 1: To allow all users to have access to env=prod nodes, we would write:

# allow/deny all users to access env=prod nodes
AccessNode(
    (Node.labels["env"] == "prod")
)

Example 2: To allow all nodes to be accessed by users in the admin team, we would write:

# allow/deny users in the admin team to access all nodes
AccessNode(
    (User.traits["team"].contains("admin"))
)

Example 3: To allow all users to have access to all nodes, we would write:

# allow/deny all users to access all nodes
AccessNode(
)

In Example 1, the identity is not specified. In Example 2, the resource is not specified. In Example 3, the identity & the resource are not specified.

This is okay because the action AccessNode implies both the identity & the resource.

This is not the case for actions like Read, List or Write. In these cases we have to make the identity & the resource explicit.

Explicit identity and/or resource: a generic Rule

To have explicit identities & resources, we could add a generic Rule:

# allow/deny all users to read all nodes
Rule(
    (Rule.identity == User) &
    (Rule.action == Read) &
    (Rule.resource == Node)
)

As in Teleport the identity is always a Teleport user (I believe), we could make that the default, so that we would only have to write:

# allow/deny all users to read all nodes
Rule(
    (Rule.action == Read) &
    (Rule.resource == Node)
)

(The user would have to overwrite this default in case Teleport adds new identities in the future (e.g. bots).)

To allow/deny users to read only env=staging nodes, we would write:

# allow/deny all users to read env=staging nodes
Rule(
    (Rule.action == Read) &
    (Node.labels["env"] == "staging")
)

Because there's a restriction on Node, we can infer that the Rule.resource is Node, and thus it doesn't have to be specified explicitly.

I leave below some more examples:

Example 4:

# allow/deny all users to read/list all nodes
Rule(
    ((Rule.action == Read) | (Rule.action == List)) &
    (Rule.resource == Node)
)

Example 5:

# allow/deny all users to list all nodes and all k8s
Rule(
    (Rule.action == List) &
    ((Rule.resource == Node) | (Rule.resource == Kubernetes))
)

Example 6:

# allow/deny all users to read/list env=staging nodes and env=staging k8s
Rule(
    ((Rule.action == Read) | (Rule.action == List) &
    ((Node.labels["env"] == "staging") | (Kubernetes.labels["env"] == "staging"))
)

Irreducible rules

(This section is just a note, and not so relevant to the overall discussion.)

In the last 3 examples, we have rules that specify multiple actions & resources. This is because, in fact, each of these rules is reducible, i.e. it can be decomposed in multiple simple/irreducible rules (that contain a single action & identity & resource).

For example, Example 5 can be decomposed in the following 2 irreducible rules:

# allow/deny all users to list all nodes
Rule(
    (Rule.action == List) &
    (Rule.resource == Node)
)

# allow/deny all users to list all k8s
Rule(
    (Rule.action == List) &
    (Rule.resource == Kubernetes)
)

This can already happen in the current predicate's DSL, and seems to occur when:

Rules with implicit values (e.g. AccessNode) are syntax sugar for Rule

Below we take some of the AccessNode predicates from above and show how they could all be written using Rule.

# allow/deny users in the admin team to access env=prod nodes
AccessNode(
    (Node.labels["env"] == "prod") &
    (User.traits["team"].contains("admin"))
)

# or:
Rule(
    (Rule.action == AccessNode) &
    (Node.labels["env"] == "prod") &
    (User.traits["team"].contains("admin"))
)
# allow/deny users in the admin team to access all nodes
AccessNode(
    (User.traits["team"].contains("admin"))
)

# or:
# (note that `Rule.resource` is not specified as it can be inferred from `Rule.action`)
Rule(
    (Rule.action == AccessNode) &
    (User.traits["team"].contains("admin"))
)
# allow/deny all users to access all nodes
AccessNode(
)

# or:
Rule(
    (Rule.action == AccessNode)
)

Comparison with #24

24 introduces the following Resource object (which has similarities with the Rule object proposed above):

class Resource(ast.Predicate):
    namespace = ast.String("resource.namespace")
    kind = ast.String("resource.kind")
    labels = ast.StringMap("resource.labels")
    verb = ast.String("resource.verb")

This object allow us to write the following:

# allow/deny users to read nodes in the default namespace
Resource(
    (Resource.verb == "read") &
    (Resource.kind == "node") &
    (Resource.namespace == "default")
)

I believe this has some disadvantages when compared to Rule:

Comparison with #26

26 introduces the following JoinSession object:

class JoinSession(ast.Predicate):
    mode = ast.String("join_session.mode")

This object allow us to write the following:

# allow/deny users in the dev team to join a session as peer/observer on a env=dev node
JoinSession(
    (User.traits["team"].contains("dev")) &
    (Node.labels["env"] == "dev") &
    ((JoinSession.mode == "peer") | (JoinSession.mode == "observer"))
),

This JoinSession predicate contains a Node object, but shouldn't given the above reasoning. This is because the resource in a JoinSession action should be a Session, not a Node.

Given this, we should probably have a Session object containing a Node (i.e. the session target). This goes in line with what Joel suggested here, where it's also suggested that:

UPDATE: #26 has been merged with something similar to what's suggested above:

class Session:
    owner = User("session.owner")
    participants = ast.StringList("session.participants")
JoinSession(
    (User.traits["team"].contains("dev")) &
    ((JoinSession.mode == "observer") | (JoinSession.mode == "peer")) &
    ((Session.owner.traits["team"].contains("dev")) | (Session.owner.traits["team"].contains("intern")))
)

Actions can have properties

To allow/deny access to env=prod nodes using the "root" login, currently we write:

# allow/deny all users to access env=prod nodes using the root login
AccessNode(
    (Node.login != "root") &
    (Node.labels["env"] == "prod")
)

But login is not a property of Node, but of the AccessNode action. This is because login is a property that only makes sense when talking about accessing nodes, not e.g. when reading/listing nodes.

I believe that we should write instead:

# allow/deny all users to access env=prod nodes using the root login
AccessNode(
    (AccessNode.login != "root") &
    (Node.labels["env"] == "prod")
)

These action properties feel similar to IAM conditions.

We already have them with JoinSession.mode in the JoinSession action being introduced in #26.

UPDATE: This suggestion has been implemented in #44.

Final thoughts

The above reasoning is based on my limited understanding of Teleport RBAC functionality. It's very possible that some parts of Teleport RBAC not yet explored could invalidate the ideas above.

In any of this makes, it should probably become a live document that is updated as we keep adding support for more Teleport RBAC functionality.

I also have some ideas about Options and how they fit in the above, but I didn't have time to write them yet.

I should also say that I didn't explore how technically feasible are some of the above proposals.

xacrimon commented 1 year ago

@vitorenesduarte As far as I know, all resources do have a namespace. It's stored in the Metadata proto message.

xacrimon commented 1 year ago

Another note, in the PolicyMap resource prototyped by @klizhentas here we assume that we can somehow arrive at a set of polices that should apply to a given user (along with any explicitly set ones). I believe this is the right approach and that keeping policies totally global makes them rather difficult and expensive to evaluate, especially in larger clusters. All work I've done so far has been under this assumption. WDYT?

vitorenesduarte commented 1 year ago

@vitorenesduarte As far as I know, all resources do have a namespace. It's stored in the Metadata proto message.

Thanks for letting me know. I've corrected the issue description.

Another note, in the PolicyMap resource prototyped by @klizhentas here we assume that we can somehow arrive at a set of polices that should apply to a given user (along with any explicitly set ones). I believe this is the right approach and that keeping policies totally global makes them rather difficult and expensive to evaluate, especially in larger clusters. All work I've done so far has been under this assumption. WDYT?

I don't have any strong opinion on that. The assumption I made in the issue (i.e. "that policies are not attached to users, i.e. that they apply to all users.") was only to simplify the writing, but the ideas should apply even with policy mapping.

xacrimon commented 1 year ago

👍 One more note, the issue refers to JoinSession being added in #24 in multiple places but I believe its in #26

vitorenesduarte commented 1 year ago

👍 One more note, the issue refers to JoinSession being added in #24 in multiple places but I believe its in #26

Thank you. Just fixed them.

klizhentas commented 1 year ago

@vitorenesduarte I will review later this week, but at a quick glance AccessNode() should by default provide access to nothing and deny access to nothing. If someone makes a mistakes and types unspecified (empty) rule set, it should not grant access to anything. Thinking more, it should be error to have empty constructor for this action.

In example 1, predicate checker should return error, because user failed to specify principals in the set. We don't want to assume blanket access if the principal is missing. If users want to provide full access they should add regexp wildcard matcher explicitly. This will err on the side of security.

@vitorenesduarte in short, at a quick, 3 min, glance, in your examples, enforce integrity and err on the side of security. Let's never assume insecure defaults.

klizhentas commented 1 year ago

I like using explicit AccessNode and JoinSessions because those are specific constructors that can check that only adequate expressions are passed to them and contain relevant properties. However, introducing AccessNode does not really help with this:

AccessNode(
    (Node.labels["env"] == "prod") &
    (User.traits["team"].contains("admin"))
)

Labels is part of the Node, not AccessNode, similar to Node.logins. So really it will look like this:

AccessNode(
    (Node.login == "root") # this is a node login, not the verb
   & (Node.labels["env"] == "prod")
)

So we end up: Resources are for defining simple generic CRUD actions on resources. verbs - AccessNode, JoinSession, ReviewSession are domain specific verbs with special meaning for their domain. Let's see how it will work out.

vitorenesduarte commented 1 year ago

@vitorenesduarte haven't we come the full circle here from

# allow/deny users in the admin team to access env=prod nodes
Node(
    (Node.labels["env"] == "prod") &
    (User.traits["team"].contains("admin"))
)

To

# allow/deny all users to access nodes using the root login
AccessNode(
    (AccessNode.login == "root")
)

I don't think so. AccessNode is the action, and it's only in the context of that action that login makes sense (I believe). If we were to e.g. write a rule that allows listing nodes, I don't think it makes sense to restrict it based on Node.login.


To restrict AccessNode based on login and labels we would write:

AccessNode(
    (AccessNode.login != "root") &
    (Node.labels["env"] == "prod")
)

This is similar to what we have now for JoinSession (with #26), where mode is about the action, not the resource, and owner is about the resource, not the action:

JoinSession(
    (JoinSession.mode == "observer") &
    (Session.owner.traits["team"].contains("dev"))
),
klizhentas commented 1 year ago
AccessNode(
    (AccessNode.login != "root") &
    (Node.labels["env"] == "prod")
)

looks good to me

vitorenesduarte commented 1 year ago
AccessNode(
    (AccessNode.login != "root") &
    (Node.labels["env"] == "prod")
)

looks good to me

Great. Opened #44 with this change and also updated the issue description to be more clear.