Closed hylkevds closed 3 years ago
@hylkevds Take a look at RFC 6570. URI Templates are widely used and are the path syntax we use in the spec. If your identifier is not legal under RFC 6570, then we cannot use it. If your identifier is legal but looks like a hack, then we probably shouldn't use it. Finally, keep in mind that the URL will be stored in a buffer. Buffers are not of infinite length. If our URLs grow too long we will start having problems with truncation and buffer overflow. Problems easily avoided.
RFC 6570 will happily let you expand templates into URLs that are ambiguous, or will be rejected by many web servers. So pointing to RFC 6570 is not enough to ensure valid URLs. Limits on what "we" consider valid identifiers should be made explicit.
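To make that concrete, here is a minimal sketch of two RFC 6570 expansion operators applied to the same id. The helper names (`expand_simple`, `expand_reserved`) and the template strings are assumptions for illustration, not part of any OGC spec or library, and the reserved expansion is simplified: it does not pass existing percent-encoded triplets through unchanged, as the RFC requires.

```python
from urllib.parse import quote

# RFC 3986 character classes. quote() already treats letters, digits
# and "-._~" as safe; they are listed here only for readability.
UNRESERVED_EXTRA = "-._~"
RESERVED = ":/?#[]@!$&'()*+,;="


def expand_simple(value: str) -> str:
    """Approximates {var}: everything except unreserved is percent-encoded."""
    return quote(value, safe=UNRESERVED_EXTRA)


def expand_reserved(value: str) -> str:
    """Approximates {+var}: reserved characters such as "/" pass through."""
    return quote(value, safe=UNRESERVED_EXTRA + RESERVED)


collection_id = "transport/roads"  # hypothetical id containing a slash

simple_url = "/collections/{collectionId}/items".replace(
    "{collectionId}", expand_simple(collection_id)
)
reserved_url = "/collections/{+collectionId}/items".replace(
    "{+collectionId}", expand_reserved(collection_id)
)

print(simple_url)    # /collections/transport%2Froads/items
print(reserved_url)  # /collections/transport/roads/items
```

Both results are legal expansions under the RFC, but only the first keeps the id inside a single path segment.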
What was the resolution on this? It is marked as OBE, but I can't find anything documented here.
2020-08-24 SWG Telecon.
The issue was discussed previously in the OGC-NA.
@ghobona To capture the OGC-NA decision and close this issue.
OGC-NA addressed the issue in https://github.com/opengeospatial/NamingAuthority/issues/55#issuecomment-643399279
Any restrictions on identifiers or paths to resources should be designed and applied by the individual OGC API standards.
@ghobona Well, then it seems that closing this is not correct, because the OGC-NA didn't decide anything on this issue and it needs to be solved for Commons Part 2 (Collections). I guess the answer is that "/" should not be used, but we have existing dataset IDs with slashes. How do we expose them via OGC APIs? I don't think changing IDs is an option.
@m-mohr I see what you mean. Although the issue is out of scope for the OGC-NA, a technical solution still has not been identified.
I agree, the issue should remain open.
Can someone clarify what the open issue is? Slashes can be used; they just have to be encoded as %2F. That is, if one really wants to use slashes in an id too, and not just in the name/title.
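A minimal sketch of that encoding in Python, assuming a hypothetical collection id "day/night". Note that `urllib.parse.quote` treats "/" as safe by default, so `safe=""` has to be passed explicitly when encoding a single path segment.

```python
from urllib.parse import quote

collection_id = "day/night"  # hypothetical id containing a slash

# quote() leaves "/" untouched unless told otherwise.
default_quoted = quote(collection_id)

# With an empty safe set the slash is escaped, so the id stays
# within one path segment.
segment_quoted = quote(collection_id, safe="")

print(default_quoted)  # day/night
print(segment_quoted)  # day%2Fnight
print(f"/collections/{segment_quoted}/items")
```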
@m-mohr I think the approach suggested by @cportele in https://github.com/opengeospatial/oapi_common/issues/34#issuecomment-679969763 solves the issue. Could you please confirm?
@ghobona Indeed, I thought somehow using %2F would be discouraged in the HTTP standard like it is with dots for example, but it's not. On the other hand, it seems there's no mention about ID encoding in the standard, which I guess would clarify this. Another issue might be that OpenAPI doesn't allow slashes in path parameters (see https://github.com/OAI/OpenAPI-Specification/issues/892 ) and I'm not exactly sure it's solved by percent encoding. So basically that's what is confusing me and thus I asked for clarification.
Having a "/" in an identifier that can end up in a path parameter seems a terrible idea
BUT
Amazingly it seems that %2F works in browsers.
For example:
http://www.creaf.cat/research goes to a web page of my institution.
http://www.creaf.cat/research/mediterranean-basin goes to a web page of my institution.
http://www.creaf.cat%2Fresearch goes to Google as a search term in Chrome, Edge and Firefox.
http://www.creaf.cat/research%2Fmediterranean-basin returns a 404.
So it seems that @cportele is right and the escape works in practice and it is not interpreted as a /.
In my particular test CGI application in IIS, what happens starts to worry me a bit: http://joanma.uab.es/cgi-bin/cgi_temp.cgi?kk%2Fkk
As you can see, the application receives the query part as its first argument in decoded form ("kk/kk"), while the environment variable keeps it undecoded: QUERY_STRING=kk%2Fkk
Then we have another question: what does a slash in an id mean? Imagine a collection called "day/night" and a collection called "transport/roads". In the first case it is just a character in an id. In the second it looks like a hierarchical collection name. Do we have to escape the first but not the second?
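The two forms can be reproduced outside IIS. This is only a sketch of the raw versus decoded distinction using Python's standard library, not a description of what IIS does internally.

```python
from urllib.parse import urlsplit, unquote

url = "http://joanma.uab.es/cgi-bin/cgi_temp.cgi?kk%2Fkk"

# urlsplit() does not decode anything: this is the form a server
# would typically expose as QUERY_STRING.
raw_query = urlsplit(url).query

# Percent-decoding is a separate step; this is the form the
# application saw as its first argument.
decoded_query = unquote(raw_query)

print(raw_query)      # kk%2Fkk
print(decoded_query)  # kk/kk
```

Whichever layer performs the decoding determines whether downstream code can still tell an encoded slash from a literal one.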
@m-mohr Strictly, ID or URI encoding in general does not have to be discussed in the standard as the rules from the normatively referenced RFCs apply, but an informative mention might help (which is why we added a sentence in Features).
The OAS issue that you cite is about something else (supporting path parameters that are not just a single path segment). Nothing in OAS prohibits the use of slashes in parameter values AFAIK.
2020-12-02 OGC-NA meeting today did not reach agreement on this issue being out of scope of the OGC-NA. As a result, the next step will be to establish the rules for slashes and other characters in names/ids.
@ghobona Not sure I follow. Did not reach agreement? Or decided it was out-of-scope?
Who owns the next step? Is it on OGC-NA or on the SWG?
The proposal was to rule that the issue is out of scope. We could not reach agreement on the proposal. Therefore, the next step is for the OGC-NA to propose the rules for slashes and other characters in names/ids. So the OGC-NA owns the next step.
It seems to me that since IDs cannot contain slashes, URL-encoded slashes as part of the path component would be very confusing (path separators vs. path character inside an ID) and a recommendation should discourage them.
I would be very interested in clarifying the rules around the use of ":", and exactly where/when it needs to be URL-encoded, as it is what we currently proposed for hierarchical collections (https://github.com/opengeospatial/ogcapi-common/issues/11#issuecomment-677947387).
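For reference, RFC 3986 allows ":" unencoded inside a path segment (it is part of the pchar production); the exception is the first segment of a relative reference, where it could be mistaken for a scheme delimiter. A minimal sketch, assuming a hypothetical id "transport:roads", showing that both spellings decode to the same id:

```python
from urllib.parse import quote, unquote

# Hypothetical hierarchical collection id using ":" as the separator.
collection_id = "transport:roads"

literal = quote(collection_id, safe=":")  # keep ":" as-is
escaped = quote(collection_id, safe="")   # escape ":" as well

print(literal)  # transport:roads
print(escaped)  # transport%3Aroads

# Both forms percent-decode back to the original id.
print(unquote(literal) == unquote(escaped) == collection_id)
```

Whether a given server or client library treats the two spellings as the same resource is a separate question that this sketch does not answer.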
January 25 SWG: This is a useful discussion but for the most part is not normative. Move this discussion to the Users Guide and reference that section from the Standard. Change label to guide after link is added to common. Add this note to the link: "Note: the id can be anything but the resulting URI must conform to the RFC requirements for encoding."
Added note to section 6.2 "OGC Web API standards may include a community-defined identifier as part of a URI (ex. image id or feature id). Definition of the format of those identifiers is out of scope for these standards. Implementors should take care that these identifiers are properly encoded (see RFC 3986) in the URIs for all hosted resources."
The link to the Users Guide was already in place.
Feb 1 - closed - NOTUC
Ontology people like to use URLs as identifiers, but I can't find anything specifying how to encode identifiers in URLs, or what the allowed characters are. I'm guessing the
GET /collections/{collectionId}/items/{featureId}
pattern is going to be confused if I make a collection named collections/collections/items/items and an item named items/items. This would make:
GET /collections/collections/collections/items/items/items/items/items
This is a silly and extreme example, but it illustrates the problem well.
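The confusion can be sketched with a toy router. The regular expression and the helpers `build_path` and `parse_path` below are assumptions for illustration only; real servers route differently, and some decode %2F before routing, which brings the ambiguity back.

```python
import re
from urllib.parse import quote, unquote

# Toy route for /collections/{collectionId}/items/{featureId},
# where each parameter must be exactly one path segment.
ROUTE = re.compile(
    r"^/collections/(?P<collectionId>[^/]+)/items/(?P<featureId>[^/]+)$"
)


def build_path(collection_id: str, feature_id: str, encode: bool) -> str:
    """Build the request path, optionally percent-encoding each id."""
    def enc(value: str) -> str:
        return quote(value, safe="") if encode else value

    return f"/collections/{enc(collection_id)}/items/{enc(feature_id)}"


def parse_path(path: str):
    """Return (collectionId, featureId), or None if the route does not match."""
    match = ROUTE.match(path)
    if match is None:
        return None
    # Decode only after the path has been split into segments.
    return unquote(match["collectionId"]), unquote(match["featureId"])


collection_id = "collections/collections/items/items"
feature_id = "items/items"

raw_path = build_path(collection_id, feature_id, encode=False)
encoded_path = build_path(collection_id, feature_id, encode=True)

print(raw_path)                  # the ambiguous path from the example above
print(parse_path(raw_path))      # None: the toy router cannot recover the ids
print(encoded_path)
print(parse_path(encoded_path))  # both ids recovered intact
```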