Explain the ACL inheritance algorithm for clients

dmitrizagidulin commented 5 years ago

Why not have the server specify the inherited ACL in a header? Figuring out the parent acl is something the server does on almost every request, it’s a core capability. Why have the client make all these http requests?

RubenVerborgh commented 5 years ago

Why not have the server specify the inherited ACL in a header?

That could be a very nice SHOULD; I'd support that.

kjetilk commented 5 years ago

I agree, we should have a Link to that as well. Note, however, that this issue is about the client's use of the inheritance mechanism, and the client should only need to find that on writes, otherwise it is the server's job to enforce this. This was an insight @timbl gave us yesterday.

zenomt commented 5 years ago

other than specially-privileged permission management apps that have acl:Control, what clients need to know a resource's ACL URI?

for a parent/inherited ACL, it might be of interest to know its URI, but you'd presumably also want to know the URI of the parent container it goes with (it could be several levels up), so that if you were a permission management app, the user could make an informed decision as to how to appropriately adjust that container's permissions.

i hope and expect it'll be rare for random apps/clients to have acl:Control permission for any resources, particularly because it'll be Hard for random apps to understand and apply the user's potentially complex access policy intentions.

csarven commented 4 years ago

[Revisiting this proposal even though I was in the original meeting for it.]

We currently have consensus on the following:

Server associates an ACL to a resource. There is no URI template for an ACL that a server needs to apply. The only implied requirement is due to the hierarchical nature of the resources ie. if a resource doesn't have its own ACL, its parent container's ACL is applied (repeat going up the hierarchy until an ACL is found, at the latest the root container/Storage's ACL). So eg. /foo/bar 's ACL can be /foo/bar.acl , /foo/.acl , /.acl , but not /baz/qux.acl or /baz/.acl.
Control access privilege on a resource is required in order to discover its ACL.

Hence, it is currently possible for a server to return the effective/inherited ACL of a resource. No algorithm required for a client other than to check rel=acl in resource's Link header.

Aside: If I'm not mistaken, there is a historical reason why the inheritance algorithm exists... It is partly due to the way NSS worked - it happened to automatically associate an ACL to each resource - whether that ACL actually existed as an independent resource or one determined based on an inheritance algorithm was another matter. From that point, we can see why the inheritance algorithm would be required.

Proposal: Unless there are security implications we should consider, servers MUST only return the effective ACL for a given resource ie. Link rel=acl as advertised. No new requirements needed for clients.

RubenVerborgh commented 4 years ago

servers MUST only return the effective ACL for a given resource

How can a client then know how/where to PUT the specific ACL?

acoburn commented 4 years ago

servers MUST only return the effective ACL for a given resource

How can a client then know how/where to PUT the specific ACL?

In fact, it is worse. If a client treats URIs as opaque, believing that the effective ACL applies only to a particular resource (when in fact it applies to an entire hierarchy of resources), doing a PUT or PATCH on that (effective) ACL resource will affect the ACLs of every other resource that currently inherits from that ACL. Imagine if the effective ACL is the root resource and a client really only wanted to change the ACL of a single child resource.

This is fine if that is the intention of a client, but if there is no way of knowing whether such an operation will affect one resource or many resources then the security implications of that change will be terrible.
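To make the scope of such a change concrete, here is a minimal sketch (not from the thread) that lists which resources currently inherit from a given ACL. It assumes path-style identifiers and models the server state as a set of paths that have their own ACL; the function names `ancestors`, `effective_owner` and `inheriting_from` are hypothetical.

```python
from typing import Optional


def ancestors(path: str) -> list[str]:
    """Container paths above `path`, nearest first, ending at "/"."""
    segments = [s for s in path.split("/") if s]
    return [
        "/" + "".join(s + "/" for s in segments[:i])
        for i in range(len(segments) - 1, -1, -1)
    ]


def effective_owner(path: str, has_own_acl: set[str]) -> Optional[str]:
    """The path whose ACL is in effect for `path`: itself, or the nearest ancestor with an ACL."""
    for candidate in [path, *ancestors(path)]:
        if candidate in has_own_acl:
            return candidate
    return None


def inheriting_from(owner: str, paths: list[str], has_own_acl: set[str]) -> list[str]:
    """Every path whose effective ACL is the one attached to `owner`."""
    return [p for p in paths if effective_owner(p, has_own_acl) == owner]


# Hypothetical storage: only the root and /foo/baz have their own ACL.
paths = ["/", "/foo/", "/foo/bar", "/foo/baz", "/qux"]
has_own_acl = {"/", "/foo/baz"}
affected = inheriting_from("/", paths, has_own_acl)
```

In this model, editing the root ACL touches every path in `affected`, which is the situation a client cannot detect if it only sees a single rel=acl target.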

csarven commented 4 years ago

As mentioned, shared slash semantics and hierarchical paths limit the possibilities eg. "/foo/bar 's ACL can be /foo/bar.acl , /foo/.acl , /.acl , but not /baz/qux.acl or /baz/.acl."

If a client wants to PUT a specific ACL for /foo/bar, it'll have to be under /foo/ or / (but not any other branch).
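The branch constraint can be sketched as a simple path check. This is only an illustration of the hierarchy argument, assuming path-based identifiers and ignoring any query string; `container_of` and `on_same_branch` are hypothetical helper names, and nothing here implies a naming convention for ACL resources.

```python
from urllib.parse import urlsplit


def container_of(path: str) -> str:
    """The container a path sits in, e.g. "/foo/bar.acl" -> "/foo/"."""
    return path[: path.rfind("/") + 1]


def on_same_branch(acl_url: str, resource_path: str) -> bool:
    """True if the ACL lives in a container on the resource's own branch."""
    acl_path = urlsplit(acl_url).path  # drop any "?acl"-style query
    return resource_path.startswith(container_of(acl_path))


# The examples from the comment above:
allowed = [on_same_branch(a, "/foo/bar") for a in ("/foo/bar.acl", "/foo/.acl", "/.acl")]
rejected = [on_same_branch(a, "/foo/bar") for a in ("/baz/qux.acl", "/baz/.acl")]
```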

acoburn commented 4 years ago

shared slash semantics and hierarchical paths limit the possibilities

You are building a security model on a priori knowledge of this by every current and future client.

csarven commented 4 years ago

That knowledge is already required by the spec. Figured that what's possible stems from that.

Rethinking this.. the issue is not so much of PUTing an ACL but that of associating an ACL with a resource. So, I'm okay to put a pause on that proposal :) It was just another way of not requiring clients to hunt down an ACL or having a server expose yet-another location for an ACL (the "actual" and the "effective")

Edit: associating an ACL to a resource can be done with the LINK method provided that the target location is allowed.

csarven commented 4 years ago

If a client treats URIs as opaque,

That conflicts with the requirement for clients and servers sharing the slash semantics in URI path. Can't expect clients to conform to that as well as decide to treat URIs as opaque entirely. The latter is currently out of spec behaviour, unless of course exceptions to the requirement are introduced.

believing that the effective ACL applies only to a particular resource (when in fact it applies to an entire hierarchy of resources), doing a PUT or PATCH on that (effective) ACL resource will affect the ACLs of every other resource that currently inherits from that ACL. Imagine if the effective ACL is the root resource and a client really only wanted to change the ACL of a single child resource.

This is fine if that is the intention of a client, but if there is no way of knowing whether such an operation will affect one resource or many resources then the security implications of that change will be terrible.

True, and those are valid concerns. Note, however, that updating an ACL policy containing acl:accessTo <./> and acl:default <./> by definition affects the resources relying on it. A client must be able to understand accessTo and default, among other things, in order to do anything with them in the first place. This holds true for all clients updating ACLs.

I do, however, see another issue with "servers MUST only return the effective ACL for a given resource":

What's on the table:

Clients use the inheritance algorithm to discover a resource's inherited ACL.
Servers advertise a resource's inherited ACL independently of its specific ACL.

What's attractive about the client figuring out the inherited/effective ACL is that once a container with an ACL is found, the relation between the container resource and its ACL is clear in conjunction with the policy descriptions in container's ACL.

That clarity is lost if "servers MUST only return the effective ACL for a given resource" ie. Resource rel=acl ACL, but when clients read that ACL's policies, relative paths for the values of accessTo and default can potentially mislead the client. Hence, a different link relation (eg using acl:inheritedACL) is needed to make the distinction from rel=acl with a different understanding on handling the inherited ACL.

The two approaches are not mutually exclusive (provided that they yield the same ACL). Clients are generally capable of using the inheritance algorithm given the base requirements (Aside: note also the similarity to traversing the URI path to find the controller of a storage as proposed in https://github.com/solid/specification/issues/153#issuecomment-624630022 and there are some ways to optimise the checks). Servers using two different relations for ACLs seems to be a bit of a bloat but I have no strong objections.

In context of server using two ACL relations (rel=acl, rel=acl:inheritedACL) with fixed URLs (as per resource lifecycle and hierarchical containment), updates are a non-issue. It becomes an issue if the target of the link relation is not persistent, and where clients need to associate an ACL to a resource. While that's currently not possible, it can be worked out by adopting LINK and UNLINK methods.

RubenVerborgh commented 4 years ago

OK so it seems that I was thrown off by terminology.

Am I correct that:

"Effective ACL" ≠ "ACL currently in effect" (which might be the inherited one)?

If so, can we call this something like "resource-specific ACL"?

csarven commented 4 years ago

No. At least I was using "effective ACL" as equivalent to '"ACL currently in effect" (which might be the inherited one)?'

HEAD /resource

200
Link rel=acl /resource.acl

/resource.acl is always the specific ACL for /resource .

If HEAD /resource.acl returns 200, it is the effective ACL.

If HEAD /resource.acl returns 404, and client finds /.acl, that'd be the effective - the inherited - ACL.
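The client-side walk described here can be sketched as follows. This is an illustration under stated assumptions, not the specified algorithm: the `head` callable is a hypothetical stand-in for an HTTP HEAD request that returns the status code and the rel=acl target (or None when no such link is advertised), and paths are used instead of full URLs.

```python
from typing import Callable, Optional

# Hypothetical transport: head(path) -> (status code, rel=acl target or None).
Head = Callable[[str], tuple[int, Optional[str]]]


def parent_container(path: str) -> str:
    """Strip a path back to the closest slash, e.g. "/foo/bar" -> "/foo/"."""
    trimmed = path.rstrip("/")
    return trimmed[: trimmed.rfind("/") + 1]


def find_effective_acl(resource: str, head: Head) -> Optional[str]:
    """Follow rel=acl; if its target does not exist, retry on the parent container."""
    current = resource
    while True:
        _, acl_url = head(current)
        if acl_url is not None:
            acl_status, _ = head(acl_url)
            if acl_status == 200:
                return acl_url
        if current in ("", "/"):
            return None  # reached the top without finding an ACL
        current = parent_container(current)


# In-memory stand-in for a server: /resource.acl is advertised but does not exist.
links = {"/resource": "/resource.acl", "/": "/.acl"}
existing = {"/resource", "/", "/.acl"}


def fake_head(path: str) -> tuple[int, Optional[str]]:
    return (200 if path in existing else 404, links.get(path))


found = find_effective_acl("/resource", fake_head)
```

With the stand-in above, the walk costs two lookups per level: one for the resource or container and one for its advertised ACL.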

RubenVerborgh commented 4 years ago

Ok in that case I am horribly confused, because the initial proposal was

servers MUST only return the effective ACL

And let's use a deep example /foo/bar/resource.

So here is what I now think we are saying:

acl Link relation always points to most specific possible ACL
- e.g., always /foo/bar/resource.acl or whatever it chooses as a naming convention
When a client accesses any resource or ACL, the server should either only return that specific thing, or 404.
- e.g., when I GET /foo/bar/resource.acl, it can only return /foo/bar/resource.acl, never /foo/.acl

If that understanding is correct, then I would suggest a different phrasing for the proposal (I can help).

csarven commented 4 years ago

servers MUST only return the effective ACL

didn't distinguish between specific and inherited. It is literally the only effective ACL using rel=acl. It is "specific" within the context of what's effective. If server determined that /resource.acl exists, that'd be the effective and so it would return that in Link of /resource. If /resource.acl didn't exist, and the server determined that /.acl is inherited (and so effective), then it would only return /.acl in Link of /resource

That approach is different than rel=acl strictly being used for resource specific ACL ie. having only /resource.acl target (but never /.acl as you've said above). Links of /resource would be rel=acl /resource.acl and rel=inheritedacl /.acl (if /resource.acl didn't exist).
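The difference between the two approaches can be sketched from the server's side. This is a hedged illustration only: the `path + ".acl"` naming is a placeholder (the thread is explicit that no naming convention exists), `inheritedacl` is the relation name floated in this discussion rather than a registered one, and all function names are hypothetical.

```python
def acl_url(path: str) -> str:
    """Placeholder naming for illustration only; servers may use any scheme."""
    return path + ".acl"


def ancestors(path: str) -> list[str]:
    """Container paths above `path`, nearest first, ending at "/"."""
    segments = [s for s in path.split("/") if s]
    return [
        "/" + "".join(s + "/" for s in segments[:i])
        for i in range(len(segments) - 1, -1, -1)
    ]


def effective_owner(path: str, has_own_acl: set[str]) -> str:
    """The path whose ACL is in effect; the root is expected to have one."""
    for candidate in [path, *ancestors(path)]:
        if candidate in has_own_acl:
            return candidate
    raise LookupError("no ACL found up to the root")


def link_effective_only(path: str, has_own_acl: set[str]) -> str:
    """Approach 1: a single rel=acl that points at whichever ACL is in effect."""
    return f'<{acl_url(effective_owner(path, has_own_acl))}>; rel="acl"'


def link_two_relations(path: str, has_own_acl: set[str]) -> str:
    """Approach 2: rel=acl is always resource specific; the inherited ACL gets its own relation."""
    links = [f'<{acl_url(path)}>; rel="acl"']
    if path not in has_own_acl:
        links.append(f'<{acl_url(effective_owner(path, has_own_acl))}>; rel="inheritedacl"')
    return ", ".join(links)
```

Under approach 1 the rel=acl target changes when a specific ACL is created; under approach 2 it stays fixed for the lifetime of the resource.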

Does that clarify:

What's on the table:

Clients use the inheritance algorithm to discover a resource's inherited ACL.

Servers advertise a resource's inherited ACL independently of its specific ACL.

RubenVerborgh commented 4 years ago

didn't distinguish between specific and inherited. It is literally the only effective ACL using rel=acl. It is "specific" within the context of what's effective. If server determined that /resource.acl exists, that'd be the effective and so it would return that in Link of /resource. If /resource.acl didn't exist, and the server determined that /.acl is inherited (and so effective), then it would only return /.acl in Link of /resource

Understood.

Then my PUT problem from above remains.

Steps:

Have an ACL at /foo/.acl
Have a resource at /foo/bar/resource
GET /foo/bar/resource has Link rel=acl with /foo/.acl according to the proposal.
I want to create an ACL specifically for /foo/bar/resource, but I don't know where to PUT it. I can't rely on the .acl convention.

Unless you tell me that .acl is spec. Then the role of rel=acl is indeed to link to the applicable ACL.

csarven commented 4 years ago

I want to create an ACL specifically for /foo/bar/resource, but I don't know where to PUT it. I can't rely on the .acl convention.

PUT is not an issue in and of itself because of:

As mentioned, shared slash semantics and hierarchical paths limit the possibilities eg. "/foo/bar 's ACL can be /foo/bar.acl , /foo/.acl , /.acl , but not /baz/qux.acl or /baz/.acl."

So, yes, you can create /foo/bar/resource.acl (or whatever naming) as long as the hierarchy holds. What's not possible is actually:

associating an ACL with a resource

because we don't have a way to connect / update the relation with the new ACL that's created. It can be remedied with

by adopting LINK and UNLINK methods

if that direction is so desired.

".acl" convention is not part of the spec.

RubenVerborgh commented 4 years ago

What's not possible is actually:

associating an ACL with a resource

because we don't have a way to connect / update the relation with the new ACL that's created. It can be remedied with

by adopting LINK and UNLINK methods

if that direction is so desired.

…or just by rel=acl always linking to the most specific ACL? Then nothing breaks, and hierarchy traversal works.

It's so simple and doesn't involve methods that are not really used elsewhere. What are the drawbacks?

acoburn commented 4 years ago

Hierarchy is not relevant here. The examples all presume that acl URIs can be created with (resource URL + ".acl"), which does not hold universally.

Using @RubenVerborgh's example, if a client wants to create an ACL for /foo/bar/resource, the client would need to understand the server's specific naming convention in order to know where to PUT it. Which URL would it use? And how can it be certain?

/foo/bar/resource.acl
/foo/bar/resource.webacl
/foo/bar/resource.webac
/foo/bar/resource?acl
/foo/bar/resource?webac
/foo/bar/resource?authz=acl
/foo/bar/resource?acl=fZF2O3xsm

Or some other server-specific mechanism. The possibilities are endless.

The unambiguous way to inform a client where to PUT an acl for the current resource is the mechanism that is currently in use: a Link header. It would be fine to add an additional header to inform a client of the location of the ACL that currently controls access for the resource, but it should be a separate link relation.

RubenVerborgh commented 4 years ago

The examples all presume that acl URIs can be created with (resource URL + ".acl"), which does not hold universally.

Was not my assumption actually. I would a) link to the most specific ACL, always and b) traverse up and ask every resource for their ACL.

acoburn commented 4 years ago

The examples all presume that acl URIs can be created with (resource URL + ".acl"), which does not hold universally.

Was not my assumption actually.

I was unclear. Some of the examples make that assumption.

I would a) link to the most specific ACL, always and b) traverse up and ask every resource for their ACL.

Exactly, and that's what I would expect, too (though I like the idea of adding a separate link header to make this traversal easier for clients)

csarven commented 4 years ago

I'm going to skip further discussions around what the examples use because virtually any example will not hold universally. We have already established that there is no naming convention. ".acl" was just a simple way to denote the ACL. Could've used /foo/resourceACLFINAL11eleven?acl . No one needs to care about the ".acl" or "?acl" suffix, because what matters is finding the ACL resource through rel=acl. Anyway, I think we are on the same page on this. I hope! :)

The slash semantics is the only convention that can be expected because hierarchy is relevant. There wouldn't be an inheritance algorithm for clients if clients treat URIs as opaque.

How to PUT the ACL is not the issue as long as it either matches a predetermined location (server exposing Link rel=acl ACL) or the client creates the association at allowed locations.

if a client wants to create an ACL for /foo/bar/resource, the client would need to understand the server's specific naming convention in order to know where to PUT it. Which URL would it use? And how can it be certain?

No. It will simply check /foo/bar/resource's Link rel=acl and PUT to target. If rel=acl is not available, as long as the ACL URL is under /foo/bar/ or /foo/ or / (but no sub-directory other than /foo/ or /foo/bar/) they can create that in whatever naming convention. That's required for the inheritance algorithm to work. Can't PUT the ACL at random locations because it wouldn't be associated with a resource. It requires the client to take an additional step to make that association or possibly have the server figure it out by inspecting the payload of ACL policy's accessTo target and doing it automagically - not completely ridiculous I think. If rel=acl is available, we already know where to PUT - naming is irrelevant.

Was not my assumption actually. I would a) link to the most specific ACL, always and b) traverse up and ask every resource for their ACL.

Exactly, and that's what I would expect, too (though I like the idea of adding a separate link header to make this traversal easier for clients)

As I understand it, that's the original proposal for this issue. It is based on NSS's history (as mentioned earlier) in that it associates an ACL to each resource when it is created. Traversal is required because the associated ACL may not actually have a representation.

The important bit here is that server has a predetermined and fixed rel=acl with its target ACL when the primary resource is created.

The "drawback" was whether a client needs to "hunt down an ACL". That's two calls per parent container (1. container resource 2. its acl) in the hierarchy. I'm not making a judgement. The intention was to contrast with the other options.

As said in https://github.com/solid/specification/issues/106#issuecomment-630769483

I have no strong objection to an additional header but don't consider it to be a great approach either.

I think the confusion around PUT was because of having one rel=acl referring to the "effective" ACL (where the target is either the specific ACL or the inherited).. and that if it so happens to be the inherited, how would one create the specific. Again, creating the ACL is not the issue because the server shouldn't care (we follow rel=acl remember?) - unless of course it wants to stick to a particular URI Template for ACL resources. Association is the challenge - with two possible solutions: 1) LINK method, 2) server inspects payload and creates the association (as mentioned earlier).

RubenVerborgh commented 4 years ago

I don't understand.

Could you please solve this case for me?

1. Resource at /foo/bar/resource.
2. ACL at /foo?aclz holding for all of /foo.
3. No other resources or ACLs exist.
4. I want to set an ACL on /foo/bar/resource and only that resource.

I'd like the steps for 4 with your proposal. It's a common case in Databrowser.

csarven commented 4 years ago

Resource at /foo/bar/resource.

I assume that /foo/bar/resource doesn't have a Link header with rel=acl .

I want to set an ACL on /foo/bar/resource and only that resource.

I assume that ACLs must be valid and so servers must validate ACL payloads. For example, a policy's acl:accessTo must be /foo/bar/resource .

Examples with different request targets using Link header with rev=acl (reverse relationship):

PUT /resourceacl
PUT /dubnobasswithmyheadman?acl
PUT /foo/resourceDOTacl
...

PUT /foo/bar/resourceacl
Link: </foo/bar/resource>; rev="acl"

201
Location: /foo/bar/resourceacl

HEAD /foo/bar/resource

200
Link: </foo/bar/resourceacl>; rel="acl"

The anchor attribute may be another way to associate /foo/bar/resource and /foo/bar/resourceacl .
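A client-side sketch of the PUT with Link rev="acl" shown above. The request is only constructed, not sent. Whether a server honours rev="acl" on PUT is the idea explored in this comment, not established behaviour; the base URL, the `text/turtle` content type and the function name `build_acl_put` are assumptions made for illustration.

```python
import urllib.request


def build_acl_put(base: str, acl_path: str, resource_path: str, body: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that creates an ACL and names the resource it is for."""
    return urllib.request.Request(
        url=base + acl_path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            # Assumed serialization of the ACL document.
            "Content-Type": "text/turtle",
            # Reverse relation: "the request target is the acl of /foo/bar/resource".
            "Link": f'<{resource_path}>; rev="acl"',
        },
    )


request = build_acl_put(
    "https://pod.example",  # hypothetical server
    "/foo/bar/resourceacl",
    "/foo/bar/resource",
    "# ACL policies would go here\n",
)
```

Passing `request` to `urllib.request.urlopen` would send it; a server following this idea would then be expected to advertise /foo/bar/resourceacl as rel="acl" on /foo/bar/resource.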

I don't think the LINK method is ideal for ACLs because PUT needs to happen before. PUT without headers indicating that the payload is an ACL will append a resource to container.

RubenVerborgh commented 4 years ago

In your example, it seems to be the client choosing the ACL URL. Whereas I thought we wanted this to be the server.

Because the consequence of letting the client do it, is that the server now needs bookkeeping. Whereas otherwise, it could just maintain a simple logic or association.

csarven commented 4 years ago

Right. Like I said, PUT is not the issue but the association required from client.

I wanted to show how it could work with a single relation and assuming that server didn't have a resource specific ACL associated. If server had advertised rel=acl, then updates can happen as usual (still requiring client association). Even in the case of advertising the inherited Link: </foo?aclz>; rel="acl" , PUT with Link rev could work. Server will then return /foo/bar/resourceacl (instead of the inherited /foo?aclz) in Link.

We have:

Clients use the inheritance algorithm to discover a resource's inherited ACL.

Servers advertise a resource's inherited ACL independently of its specific ACL.

eg. both rel=acl (always resource specific) and rel=inheritedacl (always inherited but only need to be revealed if resource specific doesn't have a representation)

RubenVerborgh commented 4 years ago

So as it stands, I would still support rel=acl always linking the most specific ACL, 404 or not.

That way, we can do everything we want without too much trickery. Some of it will be not highly efficient, but I think that should be alright over HTTP/2, and I think the operation is sufficiently rare anyway.

If it's not, then we can always add inheritedacl in the future.

csarven commented 4 years ago

I agree. This is where feedback from multiple independently built implementations will help to determine whether doing without inheritedacl is sufficient, or whether it is useful to have as a MAY to encourage use.

acoburn commented 4 years ago

feedback from multiple independently built implementations

I just added support for this in Trellis. Lacking clarity around the value of rel="X" for this case, I created a term in the Trellis vocabulary: trellis:effectiveAcl, but if this will be included in the spec, it would be great to define and publish an IRI or use an IANA-based term for this.

In other words, for a resource without its own ACL, the response headers would have this structure:

Link: <https://example.com/container/resource?acl>; rel="acl",
      <https://example.com/container?acl>; rel="http://www.trellisldp.org/ns/trellis#effectiveAcl"
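A client could read both relations from such a response with a small Link header parser. The sketch below is a simplified parser written for this example (it does not cover every corner of the Link header grammar, such as escaped quotes), and `parse_link_header` is a hypothetical helper name.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]*))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')


def parse_link_header(value: str) -> dict[str, list[str]]:
    """Map each relation type to the list of targets advertised for it."""
    rels: dict[str, list[str]] = {}
    for match in _LINK.finditer(value):
        target, params = match.group(1), match.group(2)
        for name, quoted, bare in _PARAM.findall(params):
            if name.lower() == "rel":
                # A rel parameter may carry several space-separated relation types.
                for rel in (quoted or bare).split():
                    rels.setdefault(rel, []).append(target)
    return rels


header = (
    '<https://example.com/container/resource?acl>; rel="acl",\n'
    '      <https://example.com/container?acl>; '
    'rel="http://www.trellisldp.org/ns/trellis#effectiveAcl"'
)
links = parse_link_header(header)
specific = links["acl"][0]
effective = links["http://www.trellisldp.org/ns/trellis#effectiveAcl"][0]
```

Here `specific` is where a client would PUT a resource-specific ACL, and `effective` is the ACL that currently controls access.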

justinwb commented 4 years ago

Resource at /foo/bar/resource.

I assume that /foo/bar/resource doesn't have a Link header with rel=acl .

Why wouldn't /foo/bar/resource have a Link header with rel=acl? This is the part that's confusing for me.

So as it stands, I would still support rel=acl always linking the most specific ACL, 404 or not.

👍

csarven commented 4 years ago

Why wouldn't /foo/bar/resource have a Link header with rel=acl? This is the part that's confusing for me.

It is important to test the assumptions.

If we can expect rel=acl to always be present even while the target may not actually have a representation, I think it is equally important to see what's possible with minimal information. I've demonstrated that it is in fact possible (re Link rev) but it presses on other premises (ie. only server making the association). I've eliminated the LINK method approach along the way. It also shows that the rel=acl being present is not the key bit of information. It actually came down to distinguishing between specific ACL and the inherited ACL in conjunction with only the server making the association.

I didn't want to take anything for granted because it is very easy to overlook stuff. FWIW, this thread at least shows what's explored and documented. It tried to justify the original proposal! There is useful information here besides the rough consensus IMHO.

Vinnl commented 4 years ago

Repeat 1 until an ACL resource is found, or we are at the root (which MUST have one).

@csarven told me that a Resource does not advertise an ACL in the Link header if the user does not have Control access, so from the point of view of a client, that's not a MUST, I suppose?

(He also asked me to bring up such questions here.)

csarven commented 4 years ago

a Resource does not advertise an ACL in the Link header if the user does not have Control access

Yes, most current rough consensus.

so from the point of view of a client, that's not a MUST, I suppose?

You're right that a client may not observe a link relation for the ACL on the root container.

I think when this issue got ported, it carried the MUST. The context of the original algorithm is that the server certainly has an ACL for the root container and so it'll check that as the last option. Note also that the algorithm is framed in a way that a user account that is part of the same service has Control access on the root container. Given that a WebID (and associated user account) can be on any system, I think the server inheritance algorithm should be revised and omit "user account" in step 5:

The root container MUST have an ACL resource specified.

It still holds that the link relation for the ACL is only advertised if client has Control access of a resource (including root container).

Steps 3, 4 of ACL inheritance algorithm for clients should be along these lines:

3. If 404, [TBD (see below): if root Container, stop. Otherwise,] construct the URL of the parent container by stripping it to the closest slash.
4. Repeat 1.

TBD: Check if root Container - https://github.com/solid/specification/issues/153#issuecomment-624630022 includes proposal to determine Storage/Workspace root eg. URL has Link rel=type Storage.
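The "strip to the closest slash" step can be sketched on full URLs as below. One loud assumption: this version stops at the root of the URL path, which is only a stand-in for the still-TBD check for the root Container/Storage mentioned above. `parent_container_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def parent_container_url(url: str) -> Optional[str]:
    """URL of the parent container, or None once the URL path root is reached."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path == "/":
        return None  # assumption: URL path root stands in for the storage root
    trimmed = path.rstrip("/")
    parent_path = trimmed[: trimmed.rfind("/") + 1]
    # Query and fragment are dropped: containers are identified by path.
    return urlunsplit((parts.scheme, parts.netloc, parent_path, "", ""))


def container_chain(url: str) -> list[str]:
    """All containers a client would visit above `url`, nearest first."""
    chain = []
    current = parent_container_url(url)
    while current is not None:
        chain.append(current)
        current = parent_container_url(current)
    return chain


chain = container_chain("https://pod.example/foo/bar/resource")
```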

bblfish commented 3 years ago

a Resource does not advertise an ACL in the Link header if the user does not have Control access

Yes, most current rough consensus.

I vehemently disagree on that. Control access should only limit who can write to the ACL. Not who can read it.

csarven commented 3 years ago

https://github.com/solid/specification/issues/257#issuecomment-830053999

Before diving deep into that, I suggest checking with the current state of the Protocol and .. upcoming WAC. Or just hang on for a bit.. :)

See also: https://github.com/solid/specification/pull/252 which corrects the above / what was mistakenly introduced into the spec.

bblfish commented 3 years ago

Ok, I'll wait. I am implementing this now in Reactive Solid, so that is why I am looking here for guidance.

csarven commented 3 years ago

Thanks for this issue and discussion. Closing this issue as consensus is deemed to be captured in WAC Editor's Draft: https://solid.github.io/web-access-control-spec/ . See #effective-acl-resource . Please use https://github.com/solid/web-access-control-spec for future discussion.

solid / specification

Explain the ACL inheritance algorithm for clients #106