solid / specification

Solid Technical Reports
https://solidproject.org/TR/
MIT License

Specify resource naming constraints unambiguously #368

Open damooo opened 2 years ago

damooo commented 2 years ago

Currently, there is no section that unambiguously defines naming constraints for Solid resources. A few can be inferred, and others are ambiguous. It would be great if that remaining ambiguity were resolved.

  1. Solid resources are HTTP resources and dereferenceable information resources. Hence their names must be HTTP URIs. (Thus no IRIs are allowed, as HTTP allows only URIs for resource names at the protocol level; see also this issue.) And #fragments are not allowed, because the names are request targets. This much seems inferable. Are these inferences correct?
  2. Are HTTP URIs in origin-form allowed as resource names, or only those in absolute-form? The origin-form of a request target allows query params. Is that barred for resource names? They would cause issues with slash semantics, as RFC 3986 allows / in query params, while sharing the same absolute-form. They thus also complicate relative URLs. As of now, ESS uses URIs in origin-form for its ACP auxiliary resource names, e.g. https://pod.inrupt.com/damodara/?ext=acr.
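
The inferences in point 1 can be sketched as a small check. This is only an illustration, assuming Python's standard urllib.parse; the function name check_resource_name and the pod.example URLs are hypothetical, and the query check reflects the open question in point 2, not a settled rule.

```python
from urllib.parse import urlsplit

def check_resource_name(name: str) -> list:
    """Return a list of problems; an empty list means the name passes these checks."""
    problems = []
    parts = urlsplit(name)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc:
        problems.append("no authority (host) component")
    if not name.isascii():
        # Non-ASCII characters make this an IRI rather than a URI.
        problems.append("contains non-ASCII characters (an IRI, not a URI)")
    if "#" in name:
        # Fragments are never sent as part of a request target.
        problems.append("contains a fragment")
    if parts.query:
        # Whether this should be a problem is exactly what this issue asks.
        problems.append("contains a query (open question in this issue)")
    return problems
```

For example, check_resource_name("https://pod.example/a/b/") returns an empty list, while the ESS-style name https://pod.inrupt.com/damodara/?ext=acr is flagged only for its query.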

damooo commented 2 years ago

The spec mentions that slash semantics applies only to the / character in the URL path. Thus a / in the query params of a URI in origin-form may not cause the issue described in point 2. But it causes further complications, as follows.

If origin-form is allowed for primary resource URIs, then both http://example.org/a/b/?r=def and http://example.org/a/b/?r=pqr are valid identifiers for two distinct primary resources, as they are literally different. Worse, both can be distinct containers, as the URL path has the trailing slash required by the spec. But clearly their paths are the same. It also complicates the URIs of their children, etc. Thus origin-form with query params should be disallowed, at least for primary resources. There may or may not be issues with auxiliary resources.
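
To make the collision concrete, here is a minimal sketch using Python's standard urllib.parse on the two example URIs above. The variable names and the relative reference "c" are illustrative assumptions; the point is only that the two identifiers differ while their paths, and the URI a relative child reference resolves to, do not.

```python
from urllib.parse import urlsplit, urljoin

a = urlsplit("http://example.org/a/b/?r=def")
b = urlsplit("http://example.org/a/b/?r=pqr")

# The identifiers are literally different ...
distinct_identifiers = a.geturl() != b.geturl()
# ... but the path component is identical,
same_path = a.path == b.path
# and both paths end in a slash, so both look like containers.
both_look_like_containers = a.path.endswith("/") and b.path.endswith("/")

# A slash inside the query is not part of the path.
slash_in_query_path = urlsplit("http://example.org/a/b/?r=x/y").path

# RFC 3986 reference resolution drops the base query, so a relative
# child reference resolves to the same URI from either "container".
child_of_a = urljoin("http://example.org/a/b/?r=def", "c")
child_of_b = urljoin("http://example.org/a/b/?r=pqr", "c")
```

Here child_of_a and child_of_b are both http://example.org/a/b/c, which is one way the children's URIs become ambiguous.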

kjetilk commented 2 years ago

Thanks a lot for bringing this up, and excuse the silence. I do believe it is important.

I think there is a point of intersection with the auxiliary resources discussion. Indeed, the observation that ESS implements auxiliary resources using origin-form (as opposed to the .foo convention of NSS) is a good one; I believe this is an interesting pattern.

Just to throw a thought forward: Perhaps we should constrain Solid so that normal resources must be of absolute-form, whereas auxiliary resources may be of origin-form?

Other interfaces, e.g. query interfaces, would also typically be allowed origin-form, but since they tend to get a subset of the representation of a resource, it makes sense that they have their own URI.

This would resolve the issue that you point to, @damooo. It could make it possible for servers to implement security measures, since the absolute-form is more constrained, and it would make it clear that if the request is of origin-form, it is an auxiliary resource.

OTOH, it would also be constraining quite a lot, so I don't know if there are use cases that would use origin-form for resources. If so, we could relax the requirement later, but it would be good for this to be reviewed through that lens.

damooo commented 2 years ago

Allowing origin-form for normal resources gives rise to a lot of unspecified behaviour, as described in the previous comment. Even setting aside the security perspective, it leaves many ambiguities and poorly understood behaviours, as Solid has attached semantics to URI syntax. One cannot know whether http://example.org/a/b/?r=def and http://example.org/a/b/?r=pqr are distinct; if so, whether they are containers (trailing slash in path); and if so, what their children's URIs should be.

As you stated, @kjetilk, it may be better if:

  1. Normal primary resource URIs MUST be in absolute-form.
  2. Other auxiliary resources and endpoints MUST be in origin-form.

Note that absolute-form and origin-form are not mutually exclusive; absolute-form is a specialized case of origin-form.
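
A rough sketch of how a server might apply the two rules above, assuming (as this comment does) that "origin-form" means a URI carrying a query and "absolute-form" means one without. The function name kind_under_proposed_rule and the returned labels are hypothetical, and an empty query is treated as absent here, which is itself an assumption.

```python
from urllib.parse import urlsplit

def kind_under_proposed_rule(uri: str) -> str:
    """Classify a URI by whether it carries a non-empty query."""
    parts = urlsplit(uri)
    if parts.query:
        # Rule 2: a query marks an auxiliary resource or endpoint.
        return "auxiliary-or-endpoint"
    # Rule 1: no query means a normal primary resource.
    return "primary"
```

Under this sketch, http://example.org/a/b/ is classified as primary and https://pod.inrupt.com/damodara/?ext=acr as auxiliary-or-endpoint.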

damooo commented 2 years ago

I am sorry if I am raising unnecessary corner cases, but they come up when modelling the identifier space comprehensively.

Must auxiliary resource identifiers have the same origin as their subject resource? If a different origin is allowed, what measures must be taken?

acoburn commented 2 years ago

Must auxiliary resource identifiers have the same origin as their subject resource?

There is no such restriction.

kjetilk commented 2 years ago

I am sorry if I am raising unnecessary corner cases, but they come up when modelling the identifier space comprehensively.

Not at all! Your attention to detail is very welcome and necessary!

We need a bit of coordination about what editors should address when; my feeling is that we should address this very soon.

damooo commented 2 years ago

I understand my terminology was muddled, as I got it from Stack Overflow. After reading the HTTP spec carefully, here is an ontology of identifiers and request targets:

  1. Absolute path: path: the path component of an absolute URI.
  2. Absolute URI: identifier: an HTTP URI with origin and path, with an optional query, but no fragment.
  3. Absolute-uri-with-query: identifier: sub-case of Absolute URI, with a query.
  4. Absolute-uri-with-out-query: identifier: sub-case of Absolute URI, without a query.
  5. absolute-form: request-target: same syntax as Absolute URI.
  6. origin-form: request-target: Absolute path + [? query], without origin. The origin is resolved by the server.

Thus I confused the Absolute-uri-with-query/Absolute-uri-with-out-query distinction for identifiers with the origin-form/absolute-form distinction for request targets.
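
The two distinctions can be kept apart in code. The following is a minimal sketch, assuming Python's standard urllib.parse and a simplified reading of the HTTP request-target grammar; the function names request_target_form and identifier_kind are hypothetical, and authority-form (used with CONNECT) is lumped under "other".

```python
from urllib.parse import urlsplit

def request_target_form(target: str) -> str:
    """Classify the request-target as it appears on the HTTP request line."""
    if target == "*":
        return "asterisk-form"
    if target.startswith("/"):
        # Absolute path plus optional query; the origin comes from elsewhere.
        return "origin-form"
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Same syntax as an absolute URI, with or without a query.
        return "absolute-form"
    return "other"

def identifier_kind(uri: str) -> str:
    """Classify an identifier by whether it carries a non-empty query."""
    if urlsplit(uri).query:
        return "absolute-uri-with-query"
    return "absolute-uri-without-query"
```

Note that /a/b/?r=def is origin-form and http://example.org/a/b/?r=def is absolute-form, yet both carry a query: the form of the request target says nothing about whether the identifier has one.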

With the terminology corrected, the question is whether to allow Absolute-uri-with-query as an identifier for Solid resources.