core-wg / href

Revisiting the just-slash and no-slash in URIs with authority #70

Closed chrysn closed 12 months ago

chrysn commented 1 year ago

I'm in the process of aligning aiocoap with the CRI draft; essentially, I'd like to avoid any URI string processing unless the user explicitly passes something in the form of a URI. (For context: inside a server, resource paths are stored as lists, so the handler for /time/current/ is already placed under the key ["time", "current", ""].)

The good thing is that for this purpose I can ignore all the advanced CRI features as I can't convert them anyway.

The bad thing is that path conversion is only trivial for most cases. I've assembled a table of a few relevant and illustrative cases, only looking at the list of Uri-Path and path items:

URI | CoAP | CRI
coap://x | [] | []
coap://x/ | [] | [""]
coap://x/a | ["a"] | ["a"]
coap://x/a/ | ["a", ""] | ["a", ""]
coap://x//a | ["", "a"] | ["", "a"]
coap://x// | [""] | ["", ""]
coap://x//a/ | ["", "a", ""] | ["", "a", ""]

(No query or fragment parts are shown because they're fully orthogonal; same for relative references.)

If this table is correct, the CoAP option list and the CRI path differ exactly when the CRI list consists exclusively of empty strings. Interoperability has already been hurt often enough by people ignoring empty path segments, so I'd like to avoid giving this as general implementor advice, and the precise rule is tricky ("CRI has one more empty path segment if all the path segments are empty"). This is made worse by the normalized form of the pathless URI (according to RFC 7252 Section 6.3, also RFC 7230 Section 2.7.3) being the one with the slash.

Proposal

It's not fully thought through, but here's an idea:

Let's alter the rules of path assembly to be exactly those of CoAP.

The reason we didn't do this is that if we want to express every URI, we'll have to distinguish empty-path from just-a-slash (for schemes that don't have the normalization that CoAP and HTTP have). So we'll still have to provide that -- but it can be a feature, e.g.:

AUTH-NOSLASH = false .feature "empty-path"
path /= AUTH-NOSLASH

(I didn't check whether the false value would produce any ambiguity, but I'm confident there is a value that does not).

Upsides

Downsides

chrysn commented 1 year ago

Hm, I found why I did like that approach: Dealing with sub-sites is easier that way. If there is always an empty component on a trailing slash, you can have the same processing on all requests (serving from the root) or all requests that start with ["path","to", "resource"] after those are stripped -- and in both cases the handler needs to be prepared for either an empty component (to serve the "index") or a series of path components.
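To make the sub-site argument concrete, here is a minimal Python sketch of prefix stripping on path lists. The names `strip_prefix` and `handle` are hypothetical (they are not aiocoap API), and the sketch assumes paths arrive in the style where a trailing slash always shows up as a final empty component:

```python
from typing import Optional


def strip_prefix(path: list[str], prefix: list[str]) -> Optional[list[str]]:
    """Return the components of path after prefix, or None if path is not under it."""
    if path[:len(prefix)] != prefix:
        return None
    return path[len(prefix):]


def handle(remaining: list[str]) -> str:
    """Hypothetical sub-site handler that only ever sees the stripped path."""
    if remaining == []:
        return "prefix without trailing slash"
    if remaining == [""]:
        # A lone empty component is the trailing slash: serve the "index".
        return "index"
    return "resource " + "/".join(remaining)


# Served from the root (empty prefix) and from under a prefix, the handler
# receives the same remaining path for the "index" case.
print(handle(strip_prefix([""], [])))
print(handle(strip_prefix(["path", "to", "resource", ""], ["path", "to", "resource"])))
print(handle(strip_prefix(["path", "to", "resource", "1"], ["path", "to", "resource"])))
```

In this sketch the handler does not need to know whether it is mounted at the root or below `["path", "to", "resource"]`; that only holds as long as the root's trailing slash is represented as `[""]` rather than as an empty list.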

I'll go into a session of banging my head on the desk to find out what that means for multiple empty components, and whether that allows us to give out more practical advice to implementers.

chrysn commented 1 year ago

One more thought in defense of the current scheme: This might encourage application authors who use CRIs from the start to not have /sensors/ and /sensors/1 (as one would with URIs to make relative references neat) but just /sensors and /sensors/1, because CRI references can discard 0. That works out in that mindset, but once media types using URI refs are used, baeargh.

Question is: do we want to encourage it?

chrysn commented 1 year ago

Just noticed my last post is quite a non-sequitur without emphasizing what that means for implementations of the style I think practical (i.e., applications get all requests that contain some path prefix, and then process the remaining path and other options):

Switching styles to not have a trailing slash makes them work easily no matter whether they're on the root or under some non-empty prefix.

At least in terms of CoAP options.

In URI options, it's weird again because 7252 chose the trailing slash to be the normalized form for empty paths. That also translates to weirdness in CRI, though. *cries in rage at the topic's elusiveness*

cabo commented 1 year ago

> coap://x// | [""] | ["", ""]

Why? (See RFC 7252, 6.4, step 8.)

I thought that coap://x/ is the only case where CRI keeps the empty segment while CoAP discards it.

chrysn commented 1 year ago

If it were, how would coap://x// be represented? AFAIR normalization doesn't remove that.

cabo commented 1 year ago

> If it were, how would coap://x// be represented? AFAIR normalization doesn't remove that.

Two empty Uri-Path options in CoAP, two empty Path segments in CRI.
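A minimal Python sketch of that reading, assuming step 8 of RFC 7252 Section 6.4 (an empty path or a lone "/" yields no Uri-Path options, otherwise one option per segment) and assuming the CRI path simply keeps every segment after the leading slash. The function names are hypothetical, and percent-decoding is left out:

```python
def _segments(path: str) -> list[str]:
    """Split an absolute URI path ("" or starting with "/") into its segments."""
    if path == "":
        return []
    if not path.startswith("/"):
        raise ValueError("expected an empty or absolute path")
    return path[1:].split("/")


def uri_path_to_coap(path: str) -> list[str]:
    """Uri-Path option values: none for "" and "/", else one per segment."""
    if path in ("", "/"):
        return []
    return _segments(path)


def uri_path_to_cri(path: str) -> list[str]:
    """CRI path items: every segment is kept, including the one after a lone "/"."""
    return _segments(path)


# Walk through the path parts of the URIs from the table above.
for p in ["", "/", "/a", "/a/", "//a", "//", "//a/"]:
    print(repr(p), uri_path_to_coap(p), uri_path_to_cri(p))
```

Under these assumptions, "/" is the only input where the two lists differ (`[]` versus `[""]`), and "//" comes out as `["", ""]` on both sides.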

cabo commented 12 months ago

➔ So we'll mention the anomaly in table row 2, but make no other change.