Implicit assumptions about status codes

joachimvh commented 2 years ago

I recently spent some time making the Community Solid Server conform to the status codes requirements in https://github.com/solid/specification/issues/14#issuecomment-683480525. Below are some comments mostly meant as feedback on how these are described in the spec.

One first takeaway is that GET/DELETE requests that target a resource C/R that does not exist need to return a 404 if the agent has Read permissions on either C/R or on C/. I get the idea, in the first case you could first GET C/R to see that it does not exist, and in the second case you could GET C/ and look at the containment triples to get this information. But I couldn't find anything in the spec saying this explicitly, and I do think it would have to have that instead of these implicit assumptions. It also makes me wonder about the case of doing a POST on C/ if C/ does not exist. Is that a 404 if you do not have Append permisisons on C/ but you do have Read permissions on its parent container?

Another case that caused some issues is a DELETE targeting C/R. While the operation would be valid and allowed if the agent has Write permissions on both C/ and C/R, it would be forbidden if C/R does not exist instead of returning a 404. From how I understand it, the reasoning is that this would leak information about the containment triples of C/. The only thing I could find in the spec about this is the non-normative Security Considerations section. Specifically in this case I also wonder why this specific exception is made. In case C/R does exist the request would be allowed, but would this also not leak information about the containment triples of C/? Since if this request succeeds we know that resource existed. To exaggerate, why do I not need Read permissions on C/ when trying to read C/R if it does exist, since that also provides us with information about C/. Unless of course I misinterpreted the original reason the DELETE request was forbidden.

Besides those specifics, a general problem that I had is that I wasn't sure if the linked table talked specifically about WAC, or Solid in general (or a mix between both). E.g., the fact that a PUT targeting C/R needs Append permissions on C/ if C/R does not exist is specified in the WAC spec. But I guess that any other authorization scheme that is put on top of a Solid server should follow similar behaviour since such an action will always modify the containment triples of C/. And vice versa, the 404 response in the first example above, is this a WAC specific thing as well or does this follow from implicit assumptions in the Solid core spec, making it authorization scheme independent?

csarven commented 2 years ago

I gave a bit more detail in https://github.com/solid/specification/issues/379#issuecomment-1078876775 that covers some of your questions/feedback. I'll follow up with a PR clarifying further. Thanks for the feedback!

One first takeaway is that GET/DELETE requests that target a resource C/R that does not exist need to return a 404 if the agent has Read permissions on either C/R or on C/.

No. Server doesn't need to but it might want to in those specific cases.

I get the idea, in the first case you could first GET C/R to see that it does not exist, and in the second case you could GET C/ and look at the containment triples to get this information. But I couldn't find anything in the spec saying this explicitly, and I do think it would have to have that instead of these implicit assumptions.

It is intended to be a simple interpretation and application of the relevant RFCs in context of the Solid Protocol. The point of issue 14 comment/tables was to be clear about how the RFCs, Solid Protocol, and an authorization system (guided by WAC) fit together e.g., there is a specific understanding of hierarchical containment in Solid Protocol and that gives us good information to work with / make sense of the RFC. The tables are also intended to get actual feedback from (WIP) implementations and factor all that in - throw rocks at it! :)

It also makes me wonder about the case of doing a POST on C/ if C/ does not exist.

That'd be #server-post-target-not-found in the spec: 404.

Is that a 404 if you do not have Append permisisons on C/ but you do have Read permissions on its parent container?

Correct. That'll be this row in the issue 14 comment for POST (as of 2022-03-25):

Read - 403 404

Another case that caused some issues is a DELETE targeting C/R. While the operation would be valid and allowed if the agent has Write permissions on both C/ and C/R, it would be forbidden if C/R does not exist instead of returning a 404.

Forbidden as in 403, yes.

From how I understand it, the reasoning is that this would leak information about the containment triples of C/.

To be clear, it is to not disclose the existence of C/R. The general "rules" with Read on C/ or on C/R would apply, i.e., Read grants rights to know the existence of C/R, so if C/R doesn't exist, 404 is acceptable. 403 would be the safest/best default when unauthorized.

The only thing I could find in the spec about this is the non-normative Security Considerations section.

Can you clarify the statement from the spec?

Specifically in this case I also wonder why this specific exception is made. In case C/R does exist the request would be allowed, but would this also not leak information about the containment triples of C/? Since if this request succeeds we know that resource existed.

The existence of C/R is only known - given 204 response - when one has the required permissions (Write) and C/R exists. It is isolated (as I see it).

To exaggerate, why do I not need Read permissions on C/ when trying to read C/R if it does exist, since that also provides us with information about C/.

This may be conflating things. No, reading C/R is not the same as knowing the existence of C/R. You can find out about the existence of C/R by seeing it listed in the containment of C/ - the identifier is there! - but that doesn't mean when you GET/HEAD C/R, you get 2xx. Actual reading of C/R - getting the current representation of C/R depends on Read on C/R.

Unless of course I misinterpreted the original reason the DELETE request was forbidden.

I don't understand this.

Besides those specifics, a general problem that I had is that I wasn't sure if the linked table talked specifically about WAC, or Solid in general (or a mix between both).

It is Solid Protocol with some key requirements/concepts alongside an authorization mechanism. The tables are not directly based off WAC - WAC doesn't even require specific response codes! It is perhaps inspired/influenced from WAC or notions from ACL-based systems if you will but I wouldn't go too far beyond that. The table are using common access control modes.

E.g., the fact that a PUT targeting C/R needs Append permissions on C/ if C/R does not exist is specified in the WAC spec.

It seems that way but note the request semantics of say PUT and POST as far as how resources are named as per #resource-type-heuristics . Like POST, we need at minimum add/append level permission on the container. In addition, we need Write to create/replace C/R because that's a specific identifier that an agent wants to allocate. I believe much of this is derived from RFCs.

But I guess that any other authorization scheme that is put on top of a Solid server should follow similar behaviour since such an action will always modify the containment triples of C/. And vice versa, the 404 response in the first example above, is this a WAC specific thing as well or does this follow from implicit assumptions in the Solid core spec, making it authorization scheme independent?

The latter. IMO.

joachimvh commented 2 years ago

The tables are also intended to get actual feedback from (WIP) implementations and factor all that in - throw rocks at it! :)

The main takeaway is that it would be nice to have such a table in a more official document than a GitHub comment, probably more permissive than the one there with a range of acceptable status codes. That way it's more clear what is a MUST/SHOULD/MAY (in the case of these 404s for example).

One thing I noticed and forgot to mention is that the PATCH entry with Read/Write permissions says to return a 404 if the target does not exist. This is not in line with the latest implementation of N3 Patch in the spec where step 1 of the algorithm states to start from an empty dataset if the target does not exist, and step 5 states that a 409 MUST be thrown if there is no match for the delete.

That'd be #server-post-target-not-found in the spec: 404.

The thing is, both here and in other cases, there are 2 Solid spec rules that apply here:

Return 404 if the target does not exist.
Return 401 when doing a POST with no Append permissions. And it is not clearly stated anywhere (that I could find) which rule takes priority. The table and your comments imply the first, but it would be nice to have a reference for that.

Can you clarify the statement from the spec?

The one line that could apply here is "Servers are strongly discouraged from exposing information beyond the minimum amount necessary to enable a feature.".

Unless of course I misinterpreted the original reason the DELETE request was forbidden.

I don't understand this.

DELETE targeting C/R returns 401 if C/R does not exist and you only have Write permissions on C/ and C/R. It returns 404 if you also have Read permissions on C/. My interpretation is that the reason for this is that otherwise we could leak information, but this is a guess since I could not find an explicit reason.

Like POST, we need at minimum add/append level permission on the container. In addition, we need Write to create/replace C/R because that's a specific identifier that an agent wants to allocate. I believe much of this is derived from RFCs.

The thing is that concepts such as Append/Write/Create/etc. are not really defined in the Solid spec. Unless of course this can all be found in the RFCs indeed and I should have a closer look there.

solid / specification

Implicit assumptions about status codes #384