authzed / spicedb

Open Source, Google Zanzibar-inspired permissions database to enable fine-grained authorization for customer applications
https://authzed.com/docs
Apache License 2.0

Add contextual relationships on CheckRequest #1398

Open claraisrael opened 1 year ago

claraisrael commented 1 year ago

Hello, I'm after the contextual tuples functionality expressed in Zanzibar that allows you to perform check requests with additional (context) tuples that aren't written into the system but rather only exist within the context of that particular request.

Similar to the Contextual Tuple offering from OpenFGA.

Note that this is different from caveats. Sometimes I'd like to add imaginary relationships for some scenarios, and I want to avoid storing those relationships in the DB.

If this feature already exists, please let me know about it! Any leads will be highly appreciated.

josephschorr commented 1 year ago

Hi @claraisrael,

Can you expand on which scenarios you'd find contextual relationships useful?

The problem with injecting contextual relationships is that it breaks caching: any cached entry that relies upon them is no longer valid, and that can significantly hurt performance and scalability.

ecordell commented 1 year ago

Can you describe what you'd like to use them for?

Contextual tuples seem interesting in the context of development / testing, but most other use-cases we've looked at have a much more cache-friendly implementation with caveats. Subproblems that use contextual tuples can't be cached (or they can be, but it's unlikely they'll ever be used again). A concrete example of what you want to do would be really helpful.

For what it's worth, contextual tuples aren't mentioned in the Zanzibar paper.

claraisrael commented 1 year ago

Yes, sure let me give you a more detailed explanation:

I am using the schema/authz model on (quite granular) data computed at runtime rather than being persisted and stored in a DB. So the data computed each time can change.

Say the data has resource and admin columns (the only ones that matter for adding relationships). I'm allowed to view a resource if I'm a member of the team that is admin of that resource. (The simple schema is below.)

To give you an example, I first get this data at runtime; this could be tens of thousands of rows:

Resource  Admin
1         Team A
2         Team B
3         Team C
4         Team B
5         Team B

And suppose that user Clara is a member of Team B.

I would need to write the following relationships into my model to check whether user:Clara, as a member of Team B, can view resources 1, 2, 3, 4, and 5:

resource:1#admin@team:A
resource:2#admin@team:B
resource:3#admin@team:C
resource:4#admin@team:B
resource:5#admin@team:B 

This would work if the data (table above) were static and persisted in a DB, but on a second run I can get different results:

Resource  Admin
1         Team B
2         Team C
3         Team C
4         Team A
5         Team B
6         Team B
7         Team C

The relationships present in SpiceDB would no longer be accurate, which means I would have to delete, touch, or create the relationships again before each check of "can user:Clara view resource X". Instead, I would like to simulate adding the relationships

resource:3#admin@team:C
resource:4#admin@team:B
resource:5#admin@team:B 
....

just to perform the check, without storing them in SpiceDB.

My suggestion is that the check request looks like this:

&pb.CheckPermissionRequest{
    Resource: resource 1,
    Permission: "can_view",
    Subject: user clara,
    Contextual Tuples: {
        // Add here all the computed tuples from the table above
        resource:3#admin@team:C
        resource:4#admin@team:B
        resource:5#admin@team:B, etc.
    }
})

Schema

definition user {}

definition team {
    relation member: user

    permission view = member
}

definition resource {
    relation admin: team

    permission can_view = admin->view
}
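
To make the requested semantics concrete, here is a minimal in-memory sketch in Go. It is not SpiceDB code and does not reflect any existing SpiceDB API: the `Tuple` type and the `canView` function are hypothetical names introduced only for illustration. It evaluates `can_view = admin->view` (with `view = member`) over the stored tuples plus a set of request-scoped tuples that are never persisted.

```go
package main

// Tuple is a simplified relationship of the form resource#relation@subject.
// It is a hypothetical type for this sketch, not a SpiceDB type.
type Tuple struct {
	Resource, Relation, Subject string
}

// canView evaluates resource.can_view = admin->view, where team.view = member,
// over the stored tuples plus the contextual tuples supplied with the request.
// The contextual tuples are only read during this call and never stored.
func canView(stored, contextual []Tuple, resource, user string) bool {
	// Copy into a fresh slice so neither input is modified.
	all := append(append([]Tuple{}, stored...), contextual...)
	for _, admin := range all {
		if admin.Resource != resource || admin.Relation != "admin" {
			continue
		}
		// admin.Subject is the admin team; look for the user among its members.
		for _, m := range all {
			if m.Resource == admin.Subject && m.Relation == "member" && m.Subject == user {
				return true
			}
		}
	}
	return false
}
```

With `team:B#member@user:clara` stored and `resource:4#admin@team:B` passed as a contextual tuple, `canView` grants access to `resource:4`; the same call without the contextual tuple denies it.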

Let me know if this is helpful, happy to jump on a call for more details.

vroldanbet commented 1 year ago

@claraisrael thanks for the information, out of curiosity:

claraisrael commented 1 year ago

Thank you for your replies and the active engagement ! @vroldanbet

vroldanbet commented 1 year ago

Happy to help!

> It was decided not to store the resources table in the DB (exec decision based on my explanation above) - so due to that, it is not trivial to propagate the changes to SpiceDB. But I may be missing something, and would love to hear your suggestions.

Is the data derived from something at runtime? Or is it input from your application, let's say a webapp with a form where you enter all of this data? How big do you expect the set of contextual tuples you'd send to be?

The key here is to better understand how the data is derived, to see if it's something that could be implemented with a caveat instead of contextual tuples, given the scalability implications contextual tuples have. If information is derived from user input, that user input could also be modeled as caveats. But the way you described it, it sounds like the actual "granting of permissions over a resource" is what's derived at runtime, which is interesting but also puzzling 😅

> Yes, that was a simple model, but in the real one, I have the concept of data sharing among teams, so I would store "friend/partner" relationships among teams.

Alright, so you have an application with a hierarchy of entities like teams with members, but the bit that is not stored anywhere is who has access to a resource, and that is somehow computed at runtime, is that correct? So the idea of using SpiceDB as an almost stateless library wouldn't work here.

> Can you elaborate more/send me a link on how to get started with serve-testing, please? How can I generate this token? Is it an arbitrary one?

We have some docs about serve-testing here. Please note this is not meant to be used in production, but based on my initial hypothesis that a stateless SpiceDB would work for you, I thought serve-testing could allow you to experiment. Also note that it's not entirely stateless: if you reuse the same token, previously written tuples will still be there.

ecordell commented 1 year ago

I want to work backwards from your proposed check request:

&pb.CheckPermissionRequest{
    Resource: resource:1,
    Permission: "can_view",
    Subject: user:clara,
    Contextual Tuples: {
      // Add all the computed tuples from the table 
      resource:1#admin@team:A
      resource:3#admin@team:C
      resource:4#admin@team:B
      resource:5#admin@team:B
      // ...
  }
})

To make this request, you already know what team is "admin" for the resource you're asking about (it's right there in the contextual tuple list: resource:1#admin@team:A)

So all you really need to do is find out if user:clara is in the admin team for resource:1, which would be a simple check:

&pb.CheckPermissionRequest{
    Resource: team:A,
    Permission: "member",
    Subject: user:clara,
})

Assuming a schema that looks like:

definition user {}

definition team {
  relation member: user
}

If you don't need to compute anything over the resources, there's no need to even pretend they're stored in SpiceDB via ContextualTuples.

Does that work?
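
As a rough illustration of this simplification, the Go sketch below resolves the admin team from the runtime-computed table in application code and then asks only the membership question. It assumes a hypothetical `MembershipChecker` interface that a real application would back with a SpiceDB check of `team:<id>#member`; `staticMembers` is an in-memory stand-in so the sketch runs without a server. None of these names come from SpiceDB or its client libraries.

```go
package main

// MembershipChecker is a hypothetical abstraction over a check of
// team:<teamID>#member@user:<userID>. In a real application it would
// wrap a SpiceDB CheckPermission call.
type MembershipChecker interface {
	IsMember(teamID, userID string) (bool, error)
}

// staticMembers is an in-memory stand-in used only for illustration.
// It maps team ID -> set of user IDs.
type staticMembers map[string]map[string]bool

// IsMember reports whether userID is recorded as a member of teamID.
// Unknown teams yield false because indexing a nil map returns the zero value.
func (m staticMembers) IsMember(teamID, userID string) (bool, error) {
	return m[teamID][userID], nil
}

// CanView looks up the admin team for the resource in the table computed
// at runtime, then delegates to the membership check. No resource
// relationships are written to, or read from, the permissions store.
func CanView(adminByResource map[string]string, members MembershipChecker, resourceID, userID string) (bool, error) {
	teamID, ok := adminByResource[resourceID]
	if !ok {
		// The resource is not present in this run's data.
		return false, nil
	}
	return members.IsMember(teamID, userID)
}
```

The design point is that the resource-to-team mapping never leaves the application, so only team membership needs to live in SpiceDB.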



This was my original example, before I realized you didn't want to store any references to the resources at all, so it's not a solution to the question posed. I thought each resource would belong to a specific team, but which team was "admin" would change over time. Leaving it here as I think it's an interesting example anyway.

I took a stab at translating the problem into caveats:

definition user {}

caveat team_is_admin(self string, admin_teams list<string>) {
  self in admin_teams
}

definition team {
  // this is a self-relation that is only active if `team_is_admin` is true, 
  // which will only happen if the team is marked as an admin in the caveat context on the request
  relation admin: team with team_is_admin
  relation member: user

  permission view = admin->member
}

definition resource {
    relation team: team

    permission can_view = team->view
}

Example tuples:

// team definition - you can think of this as "team A is allowed to be marked as admin via a request"
team:A#admin@team:A[team_is_admin:{"self":"A"}]
team:B#admin@team:B[team_is_admin:{"self":"B"}]
team:C#admin@team:C[team_is_admin:{"self":"C"}]

// team members 
team:A#member@user:clara

// resources
resource:1#team@team:B
resource:2#team@team:C
resource:3#team@team:C
resource:4#team@team:A
resource:5#team@team:B
resource:6#team@team:B
resource:7#team@team:C

Then requests that set the admin team to match the user will succeed:

resource:4#can_view@user:clara with {"admin_teams": ["A"]}

and requests for users with non-admin teams will fail:

// resource is in team A, but team A is not admin
resource:4#can_view@user:clara with {"admin_teams": ["B"]}

// resource is in team B, which is admin, but user:clara is not in team B
resource:1#can_view@user:clara with {"admin_teams": ["B"]}

This example would be simpler if you could refer back to the relationship in the caveat, but currently that's not possible. That's why you have to write the self context when writing a team. This could be a good reason to consider adding support for directly referencing self in a caveat!

Here's a playground link if you want to play with it: https://play.authzed.com/s/KH0mgSgLPn96
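
For readers who want to trace the caveat logic without the playground, the following Go sketch re-implements the evaluation in memory. It is only a simulation under simplifying assumptions, not SpiceDB's evaluator: the `store` type, `teamIsAdmin`, `canView`, and `exampleStore` are hypothetical names, and the caveated self-relation is reduced to a map from team to its stored `self` value, since each team's `admin` relation points back to the same team.

```go
package main

// teamIsAdmin mirrors the caveat expression `self in admin_teams`.
func teamIsAdmin(self string, adminTeams []string) bool {
	for _, t := range adminTeams {
		if t == self {
			return true
		}
	}
	return false
}

// store is a hypothetical in-memory stand-in for the example tuples.
type store struct {
	resourceTeam map[string]string          // resource:<id>#team@team:<id>
	adminSelf    map[string]string          // team:<id>#admin@team:<id> with stored "self"
	members      map[string]map[string]bool // team:<id>#member@user:<id>
}

// canView walks resource.can_view = team->view and team.view = admin->member.
// Because every admin relation here is a self-relation, admin->member reduces
// to the team's own members, gated by the caveat.
func (s store) canView(resourceID, userID string, adminTeams []string) bool {
	teamID, ok := s.resourceTeam[resourceID]
	if !ok {
		return false
	}
	self, ok := s.adminSelf[teamID]
	if !ok || !teamIsAdmin(self, adminTeams) {
		return false // caveat not satisfied: the team is not admin for this request
	}
	return s.members[teamID][userID]
}

// exampleStore loads the example tuples listed above.
func exampleStore() store {
	return store{
		resourceTeam: map[string]string{
			"1": "B", "2": "C", "3": "C", "4": "A", "5": "B", "6": "B", "7": "C",
		},
		adminSelf: map[string]string{"A": "A", "B": "B", "C": "C"},
		members:   map[string]map[string]bool{"A": {"clara": true}},
	}
}
```

Calling `exampleStore().canView("4", "clara", []string{"A"})` returns true, while the two failing requests above (`"4"` with `["B"]` and `"1"` with `["B"]`) return false.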

claraisrael commented 1 year ago

> Is the data derived from something at runtime? Or is it input from your application, let's say a webapp with a form where you enter all of this data? How big do you expect the set of contextual tuples you'd send to be?

There are two parts to this question: the data related to the resources and the data related to friendships. For the first one, the data is computed at runtime (call it ABC), and we want to perform the check on this computed data at runtime too. This data ABC can be different in a subsequent run. So indeed, the information for the relationships is derived from this ABC set.

For the second one, this data is derived from input to my application through a webapp form.

> Alright, so you have an application with a hierarchy of entities like teams with members, but the bit that is not stored anywhere is who has access to a resource, and that is somehow computed at runtime, is that correct? So the idea of using SpiceDB as an almost stateless library wouldn't work here.

Exactly. I would also prefer not to store every single member of each team; ideally I would avoid storing those tuples and only add them at runtime (depending on who's requesting access at the time).

Thanks for the docs and for your responses!

claraisrael commented 1 year ago

> thought each resource would belong to a specific team, but which team was "admin" would change over time. Leaving it here as I think it's an interesting example

Thanks so much! This was really helpful. This assumption is true, btw. Your solution of just checking whether user:clara is a member of the specific team might be a good way to simplify the problem! Can you envision a solution that does not require storing the members of each team, though? That can be quite a long list (but maybe it's the way to go). Do you think adding a caveat here for is_team_member is preferable? I think I can know the team for the user without having to store it in SpiceDB.

@ecordell with the concept of sharing among teams, would it be possible with your proposed solution to impose rules on specific attributes of this resource (e.g. if the name of the resource matches X) for sharing purposes?

xwrs commented 9 months ago

Hello, I am on this thread because I am trying to solve a slightly different issue/use case with contextual tuples. Maybe it's worth creating a separate issue, but I will put some thoughts here anyway.

We have a few teams which are building isolated subsystems. Each subsystem has its own authorization model and will have its own set of tuples in the DB of the FGA system (SpiceDB may be used in only some of the subsystems).

Even though subsystem-level FGA modules will have independent models, they may reference outer/global authorization models. Let's say users can be added directly to some role in some department, but the department hierarchy lives separately. If we define one uber-model and an uber-database of tuples, it will be too complex and will have to cover all scenarios and edge cases for all subsystems in one schema.

The idea was to simulate composite authorization modules (schemas and tuples) by having separate modules for subsystems and a few global modules which cover different aspects of access control and extend the access defined in each subsystem. Relations from these global modules could then theoretically be passed to the subsystem authz models as contextual tuples.

Maybe it makes no sense from the design perspective of FGA as provided by SpiceDB, but it is what we ultimately want to achieve.

josephschorr commented 9 months ago

@xwrs Have you seen the proposal for composed schema? https://github.com/authzed/spicedb/issues/1437. I believe this may help you handle complexity, while still allowing for a unified set of data. In particular, we'd recommend having relationships that are per-subsystem be namespaced, so they are easier to track.

The fundamental issues with contextual relationships as you proposed are safety and performance: how do you efficiently (and correctly) make sure to provide the full set of relationships to SpiceDB?

xwrs commented 9 months ago

@josephschorr thank you for the feedback and for challenging my plan :) The efficiency issues are obvious and yes, the use of contextual relations in our case is definitely a workaround.

I have reviewed the proposed schema, and this functionality, once implemented, will definitely help to solve our task. I'm not sure whether it offers independence of schema segments, but I don't see it preventing such use.

For now we are building an API wrapper for the FGA system and relations store, so that we can change the underlying implementation. The implementation itself will take some time, and if the feature proposed in #1437 gets implemented in the meantime, we will integrate it into our solution.