Zed --explain flag uses cached results

winstaan74 commented 2 weeks ago

What platforms are affected?

macos, others

What architectures are affected?

others

What SpiceDB version are you using?

v1.35.3

Steps to Reproduce

In Zed, the --explain flag makes use of cached results, even with the --consistency-full flag set. This makes it hard to get a full trace of how a permission decision was calculated. For example, running the same permission check twice gives differed answers each time -

❯ zed --insecure permission check --explain timesheet:1 read_timesheet user:123 --consistency-full                                              
4:00PM INF debugging requested on check
true
✓ timesheet:1 read_timesheet (2.347208ms)
└── ✓ engagement:3 read_timesheet (1.806125ms)
    ├── ⨉ engagement:3 supplier_for_attribute (994.125µs)
    ├── ⨉ engagement:3 manages_attribute (1.536667ms)
    └── ✓ engagement:3 self_attribute (1.711417ms)
        └── ✓ person:2 user (1.196667ms)
            └── user:123 

❯ zed --insecure permission check --explain timesheet:1 read_timesheet user:123 --consistency-full                                              
4:00PM INF debugging requested on check
true
✓ timesheet:1 read_timesheet (cached)
└── user:123

I see the same behaviour when using the grpc api from a java client with the 'debug' flag set. My desired behaviour is an explanation of the permission checking path that can be displayed to a user.

Expected Result

A full permissions check trace each time.

Actual Result

The explanation for the second permission check is minimal.

josephschorr commented 2 weeks ago

--consistency-full indicates that the most recent revision must be used. If you make two requests and there have been no writes to SpiceDB, then using cache is correct and --explain always returns the explanation of the real work performed by SpiceDB.

This is working as intended.

Can you expand on what you're trying to do?

winstaan74 commented 2 weeks ago

This is for a spicedb running locally, so I can confirm there's no writes in-between one zed --explain and the next.

On our production system, we'd like to be able to understand why a particular permissioning decision was made - and I think this is probably a common usecase. In some cases, we'd like to be able to display the explanation to an expert end user.

If a permissioning decision is 'incorrect' for us, it's usually the case that a user has something setup incorrectly - and we'd like to be able to identify the issue by looking at how spicedb followed relationships to make the permissioning decision.

If an explain request doesn't evaluate the whole graph, but makes use of previously cached results, then it becomes harder to understand the cause of a permissioning decision.

The current behaviour of --explain is great for understanding where time is spent in an actual permission check - but in some cases it would be helpful to force a full evaluation without using any caches.

josephschorr commented 1 week ago

If an explain request doesn't evaluate the whole graph

Explain won't evaluate the whole graph even if caching is not used; if a permission is granted via two paths, you may only get one or the other back, depending on which was found first.

It sounds like you want more of an "audit" ability, but that has significant performance implications since it would have to both bypass the cache and bypass short circuiting.

How often would you expect this feature to be used?

authzed / spicedb