Metadata to help cache individual authorizations

kjetilk commented 5 years ago

I'd like to submit my long-standing patch to WAC for your consideration: https://github.com/solid/web-access-control-spec/pull/37

Since I think cacheability is one of the key success criteria here, I believe granular metadata that enables it is really important. That is what this patch seeks to provide.

acoburn commented 5 years ago

I'd like to give a big 👍 to this proposal. This would be really useful metadata for cache implementations because it would provide a strong hint for TTL values.

dmitrizagidulin commented 5 years ago

While I totally see how this would be really useful for caching functionality, the bit I’m having trouble picturing is the UI/UX aspect. What does that look like, to the user?

“You’re sharing this document with Alice. Please don’t change your mind for the next 1day/1hr/5mins because we want to be able to cache it.”?

dmitrizagidulin commented 5 years ago

Would the cacheability not be better addressed via ETags / If-None-Match, instead?

acoburn commented 5 years ago

I can describe my own use case a little more extensively, which may address @dmitrizagidulin's questions.

I would like a multi-tier, low-latency ACL cache that effectively pre-computes access controls across an entire server. The question then becomes one of cache invalidation, which can be driven through some sort of back-end notification system. At present, that's exactly how invalidation happens: when an ACL resource changes, the associated cached permissions are re-computed for affected nodes. The cache itself is lazy-populated, and currently the only way to do cache invalidation is via ACL document changes. This proposal would allow me to identify a specific TTL value for cached nodes, independent of resource changes.

I would see this additional metadata as strictly optional.
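To make the use case above concrete, here is a minimal sketch of a lazily populated permission cache that supports both invalidation on ACL change and an optional per-entry expiry. It is an illustration only, not the actual implementation: the names `AclCache`, `CachedPermissions`, `compute` and `invalidate_acl` are hypothetical, and the sketch simply drops stale entries so that they are recomputed on the next lookup.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CachedPermissions:
    modes: frozenset              # e.g. frozenset({"Read", "Write"})
    acl_url: str                  # ACL document the entry was computed from
    expires_at: Optional[float]   # None when the ACL carries no validity metadata


class AclCache:
    """Lazily populated permission cache (hypothetical, for illustration)."""

    def __init__(self, compute: Callable[[str], CachedPermissions],
                 clock: Callable[[], float] = time.time):
        self._compute = compute   # computes permissions for a resource URL
        self._clock = clock
        self._entries: Dict[str, CachedPermissions] = {}

    def get(self, resource_url: str) -> CachedPermissions:
        entry = self._entries.get(resource_url)
        # The expiry is optional: entries without one never time out
        # and are only removed by invalidate_acl().
        expired = (entry is not None and entry.expires_at is not None
                   and self._clock() >= entry.expires_at)
        if entry is None or expired:
            entry = self._compute(resource_url)
            self._entries[resource_url] = entry
        return entry

    def invalidate_acl(self, acl_url: str) -> None:
        """Notification hook: drop every entry derived from acl_url."""
        stale = [url for url, e in self._entries.items() if e.acl_url == acl_url]
        for url in stale:
            del self._entries[url]
```

Because `expires_at` may be `None`, a server that ignores the proposed metadata behaves exactly as it does today, which matches the idea that the metadata is strictly optional.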

kjetilk commented 5 years ago

“You’re sharing this document with Alice. Please don’t change your mind for the next 1day/1hr/5mins because we want to be able to cache it.”?

No, this is about allowing for many different situations. Basically, there are three ways to manage a cache in current HTTP:

1. Get a commitment from the user for cacheability into the future. This is RFC 7234, and it is reflected with the dct:valid predicate.
2. Have a notification protocol that notifies you of any changes, so that the cache can be explicitly invalidated.
3. Use things like modification times and ETags to ask the server only whether the resource has changed, and use the cached copy if it hasn't. That's RFC 7232, and the headers you mention.

There's no "instead" as I see it; they all have their use cases and utility. I would argue, though, that the list is in roughly decreasing order of utility: in the first case you don't even need to contact the server to reuse a cached copy, which ensures the lowest possible latency.
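The ordering of the three approaches can be sketched as a small decision function. This is only an illustration under stated assumptions: `CacheEntry`, `Action` and `next_action` are hypothetical names, `valid_until` stands in for whatever validity period the ACL metadata would express, and `subscribed` assumes some notification channel exists.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    REUSE = "reuse the cached copy without contacting the server"
    REVALIDATE = "send a conditional request (e.g. If-None-Match)"
    FETCH = "fetch the resource again"


@dataclass
class CacheEntry:
    valid_until: Optional[float] = None  # owner's commitment, epoch seconds
    subscribed: bool = False             # change notifications are active
    invalidated: bool = False            # a notification reported a change
    etag: Optional[str] = None           # validator for conditional requests


def next_action(entry: Optional[CacheEntry], now: float) -> Action:
    if entry is None or entry.invalidated:
        return Action.FETCH
    # 1. Commitment still in force: no server contact at all.
    if entry.valid_until is not None and now < entry.valid_until:
        return Action.REUSE
    # 2. Notifications are active and none has reported a change.
    if entry.subscribed:
        return Action.REUSE
    # 3. Fall back to a conditional request if a validator is available.
    if entry.etag is not None:
        return Action.REVALIDATE
    return Action.FETCH
```

Only the third branch costs a round trip on every use, which is why it ranks last even though it is always safe.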

There are many cases where I could very well commit to a very long-term ACL, e.g. an ACL that applies to my wife and kids. I'm committed to that not changing for as long as I live :-) But it could also be that I can only commit to a TTL of one minute. Even then, if I had a million requests in that minute, the gain would be huge.
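As a back-of-the-envelope check of that claim, the following sketch counts how many times the authorization would have to be evaluated at the source. It assumes requests are spread evenly over the window and that one evaluation is reused until its TTL runs out; the function name `origin_checks` is hypothetical.

```python
import math


def origin_checks(total_requests: int, window_seconds: float,
                  ttl_seconds: float) -> int:
    """Evaluations needed at the source for evenly spread requests."""
    if ttl_seconds <= 0:
        # No commitment at all: every request is evaluated.
        return total_requests
    refreshes = math.ceil(window_seconds / ttl_seconds)
    return min(total_requests, max(1, refreshes))


print(origin_checks(1_000_000, 60, 0))   # 1000000
print(origin_checks(1_000_000, 60, 60))  # 1
```

Under these assumptions, a one-minute commitment turns a million evaluations into a single one.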

So, the UX could be "Would you be able to commit to not changing this for a period of n?" And if not, the cache could rely on other things, such as conditional requests.

Also, I don't think we should think about this only in HTTP terms, for reasons I explain in the original post. We need more granularity, and WAC elegantly provides us with a natural way to get it.

In @acoburn's case, the implementation could rely on the notification protocol directly, but also on the first and third cases indirectly.

solid / authorization-panel

Metadata to help cache individual authorizations #42