cedar-policy / rfcs

Apache License 2.0
10 stars 8 forks source link

Explicit Unspecified Entities #55

Closed khieta closed 5 months ago

khieta commented 8 months ago

Rendered

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

andrewmwells-amazon commented 8 months ago

We currently have no breaking changes slated for 4.0 that would break existing policies or schemas (only breaks to our API surface); it would be very cool if on the 4.0 release we could say "All of your 3.x policies / schemas will continue to work exactly the same, only some APIs have been changed". This RFC would spoil that by introducing a breaking change to the schema format for 4.0.

If this is a concern, I think we could do alternative A and provide a script to automatically migrate schemas.

aaronjeline commented 8 months ago

I personally like alt A, but I agree w/ craig about getting some feedback from users.

nakedible-p commented 8 months ago

I am in favor of this proposal. But there's some things to be said.

First of all, originally, I thought it was a good idea that unspecified entities is a special case. The downside of explicity unspecified entities is that it is easy to write a policy like:

permit(principal == Unauthenticated::"unspecified", action == Action::"SomeAction", resource == Res::"SomeResource")

The problem with this is that it's easy to write a policy that gives less access if you are authenticated than it gives if you are authenticated. Where as if the only way to match an unspecified entity is leaving it unspecified in the policy, then there's no risk of giving less access by authentication:

permit(principal, action == Action::"SomeAction", resource == Res::"SomeResource")

The other upside is for people coming from RDF background. An unspecified entity is very much like an RDF "blank node". But it is also a downside, as it isn't exactly like a blank node (a free variable).

However, I have since changed my opinion on the who matter. This is a very slight distinction, and it's not reasonable to expect people will think about this in enough detail to catch the distinction. Having unspecified entities behave exactly as other entities will simplify the brain process, simplify the API and force people to be more explicit about what is the same entity and what is not.

If "Alternative A" is chosen, possibly for just transitional use that will be deprecated, I would suggest __cedar::None. In my opinion, it's better than Null. An additional possibility would be __cedar::Blank, but I think that should be left for the possibility that Cedar at some point implements real blank node support.

And finally – I have been bitten by this problem myself. But it wouldn't have been any problem as long as Cedar would've documented the best practices for dealing with unspecified entities. In the end, the most important piece to ensure that you give guidance on the way to solve such issues.

khieta commented 8 months ago

Thanks for the comments @nakedible-p! I'll add None to the list of candidates for Alternative A. Also thanks for the call out about documentation -- that's definitely something we need regardless of whether we accept this RFC or not. Hopefully coming in an upcoming blog post!

mwhicks1 commented 7 months ago

I'm strongly in favor of Alternative A. Here is my reasoning:

I would add that I'd prefer something other than Ghost. I'd propose either None or Null.

One other thing: I'm not convinced that Application::"TinyTodo" is the same thing as the unspecified entity. Per Alternative A, the unspecified entity should never appear in policies. But Application::"TinyTodo" explicitly appears, for a reason. The unspecified entity is useful when principal, resource, etc. are irrelevant, and never need to be "looked at" in a policy.

khieta commented 7 months ago

Thanks for the feedback! After reading the replies from @nakedible-p and @mwhicks1, I'm starting to see why not being able to explicitly reference unspecified entities might be a good thing, since it encourages principal rather than principal == ... for a "made-up" entity type.

I also appreciate the comment from @cdisselkoen on the weight of a language-breaking change.

Based on the comments, I will be revising this RFC to propose Alternative A. Hopefully I'll have up a revised version later today!

khieta commented 7 months ago

⚠️ Major revision posted ⚠️

This RFC has been updated to propose what was originally Alternative A -- i.e., it no longer proposes removing unspecified entities from the Cedar language.


While revising this RFC, I took some time to think carefully about what bothers me about unspecified entities, and the list I came up with was: 1) unclear terminology, 2) confusion with "unknown" entities when constructing a Request, and 3) awkward syntax in the schema.

I then reworked the proposal with these criteria in mind. Here’s the TLDR of the new proposal, numbered based on which concern is addressed:

  1. Update the terminology surrounding unspecified entities, instead calling them default entities. Default entities will have type __cedar::Default.
  2. Define a new RequestEntity type to use as input to the Request constructor. The enum has variants EntityUid, Unknown, and Default.
  3. In order to specify that an action applies to a default entity, the schema appliesTo list must contain __cedar::Default.
nakedible-p commented 7 months ago

I think there's one distinction that is missed - and that's that I don't think in general __cedar::Default::"principal" == __cedar::Default::"principal" should be true. And unspecified principal should refer to some principal that isn't given in, but which should be distinct from all other principals, including other unspecified principals. If something stores the current principal as the owner of some object, and the current principal is unspecified, as is common in unauthenticated requests, then that doesn't mean that every unspecified request should own the object. Like mentioned before, there are equivalences to blank nodes in RDF. An unspecified principal should be a "free variable", like a blank node is – not something that is necessarily the same instance. But, of course, in some cases it does make sense to have have a distinct principal that is just one instance - so usages will differ.

Personally I will still use an explicit custom principal type for "unauthenticated" requests, and I will have the principal ID generated as ULID, so that it will never be equal to any previously generated principal. That means that personally for me it will not really matter how the semantics of the default principal will be set in Cedar. I would still suggest that Cedar will recommend people to always explicitly create their "default" principal types by themselves, and not use the __cedar::Default types except for compatibility with older stuff.

khieta commented 7 months ago

Re @nakedible-p:

I don't think in general __cedar::Default::"principal" == __cedar::Default::"principal" should be true. An unspecified principal should ... be distinct from all other principals, including other unspecified principals.

That's a good point. I agree that unspecified principals in different requests should refer to conceptually different entities.

Nevertheless, I think it is ok that all unspecified principals will end up being __cedar::Default::"principal" (under the current design) because there is no way for a principal from one request to interact with a principal from another request. We won't support storing anything with type __cedar::Default in the entity store or context, so it will not be possible to, for example, store an unspecified principal as the owner of another entity. Unspecified entities can only exist in a Request's principal/action/resource.

nakedible-p commented 7 months ago

If it's limited to only requests, then I also see no problems in having a single value like __cedar::Default::"principal". As a user, I'd expect anything starting with __cedar to be a bit special anyway. Thanks for thinking about this.

max2me commented 7 months ago

I will be a dissenting voice on this one as I strongly prefer Alternative A - getting rid of the unspecified entity type altogether in favor of enumerated entity types introduced in RFC53.

Both the main proposal and Alternative A are breaking changes, so I'd rather go a step forward and require customers to update schema, code, and policies once for the benefit of a brighter future (the main proposal would require customers to update just schema and code, yet all the same testing will have to occur).

My concerns with the main proposal:

In my opinion, occasional duplication of policies and the need to define your own default type is a small cost to pay for the overall simplicity and cohesiveness of the system.

shaobo-he-aws commented 7 months ago

But retaining the semantics of unspecified entities is disappointing, because it's confusing: A default entity is not truly ignored, as we might expect/hope, it is just a special, but second-class, kind of entity. Exposing this truth will further confuse users who are already confused: Why am I not allowed to write the default entity in policies? Why can't I create my own entities with that type? This relates to the lack of symmetry Max was talking about.

I agree. I think whenever we decide in the end, we should record the rationales in a FAQ or best practice doc.

khieta commented 6 months ago

Based on in person discussion, I have revised the RFC (again 🙂) to make dropping unspecified entities the main proposal. The general consensus in discussion was that the proposal to rebrand unspecified entities as “default” entities felt like a bandaid rather than a true fix, and we would eventually want to remove unspecified/default entities anyways.

However, to support backwards compatibility, for a period of time we will provide APIs (behind a flag) that maintain the prior behavior. See the new text for more details!

nakedible-p commented 6 months ago

I'm slightly happier with this proposal now than the one with __cedar::Default, as it was originally. Like said earlier, we will anyway define our own entity types for unspecified entities, so this just makes it explicit and simplifies the API. So :shipit:

khieta commented 5 months ago

The final comment period (FCP) for this RFC is starting now, with intent to accept 🎉

The FCP will end 2024-05-28 at noon PT / 3pm ET / 7pm UTC. Please add comments, and especially any objections, if you have any, before then. For more on the RFC process, see https://github.com/cedar-policy/rfcs.