indieweb / indieauth

IndieAuth.net website code and IndieAuth Specification
52 stars 7 forks source link

Ambiguity in handling redirections #36

Closed fluffy-critter closed 4 years ago

fluffy-critter commented 4 years ago

According to https://indieauth.spec.indieweb.org/#discovery-by-clients:

If an HTTP permament redirect (HTTP 301 or 308) is encountered, the client MUST use the resulting URL as the canonical profile URL. If an HTTP temporary redirect (HTTP 302 or 307) is encountered, the client MUST use the previous URL as the profile URL, but use the redirected-to page for discovery

This does not answer the question of what happens if a profile URL has a temporary redirect to another URL, which then has a permanent redirect to yet another URL.

For example, if the user profile URL is at https://username.example and there is a temporary redirect to https://example.com, the second MUST indicates that the profile URL should be unchanged. However, if https://example.com then has a permanent redirect to https://example.com/username, the first MUST implies that the profile URL should be updated to https://example.com/username even though the first redirection was temporary.

Perhaps rephrasing the paragraph like this would help:

Clients MUST start by making a GET or HEAD request to [Fetch] the user's profile URL to discover the necessary values. Clients MUST follow HTTP redirects (up to a self-imposed limit). If an HTTP temporary redirect (HTTP 302 or 307) is encountered, the client MUST use the previous URL as the profile URL, but use the redirected-to page for discovery. If an HTTP permament redirect (HTTP 301 or 308) is encountered, the client MUST use the resulting URL as the canonical profile URL if there had not previously been a temporary redirect.

dmitshur commented 4 years ago

I also found it hard to be confident how to interpret the current wording in a situation involving both permanent and temporary redirects.

My best interpretation has been to understand that if there is a sequence of redirects and they are all temporary, then the original user profile URL is considered canonical. If there is one or more permanent redirect involved, then the URL of the last permanent redirect is considered canonical.

I wrote an example based on that interpretation, which looks like this:

suppose user entered url: https://a.example

HTTP GET https://a.example
 ⤷ 301 to https://b.example
 ⤷ 301 to https://c.example
  ⤷ 302 to https://d.example
  ⤷ 302 to https://e.example
  ⤷ 302 to https://f.example
   ⤷ 301 to https://g.example
   ⤷ 301 to https://h.example
   ⤷ 301 to https://i.example
    ⤷ 302 to https://j.example
    ⤷ 302 to https://k.example
    ⤷ 302 to https://l.example
     ⤷ 200 OK, Content-Type: text/html; charset=utf-8

end result canonical profile URL: https://i.example
look for authorization endpoint at: https://l.example

And I've reached out to @aaronpk for his thoughts on that example.

I've also noticed that the 303 See Other status code is not mentioned in the spec. I currently understand it should be treated as a temporary redirect, but it could be made more clear.

fluffy-critter commented 4 years ago

Hmm, would a 303 ever come up naturally? I've never seen that in the wild as far as I know, and per the linked Wikipedia page the intent is for returning a resource redirect after a POST, which should never happen for a profile lookup.

That said, from the phrasing, 303 does seem like it should be handled the same way as a 302.

Zegnat commented 4 years ago

From my understanding (and recollections of previous discussions) the result canonical profile URL in @dmitshur’s example is https://c.example. That is the final URL settled after a continues sequence of permanent redirects starting at the input.

Permanent redirects are in place to update the canonical URL. If a publisher decides to use a temporary redirect, it is because this may rapidly change in the future and should not be seen as the canonical location for the resource. (Or in this case, the canonical URL for their identity.)

If the temporary place uses permanent redirects again, that is just because the temporary resource is moving around. Not because the original publisher has decided to make any of that their new canonical location.

In short: as soon as the first non-permanent redirect is encountered, it is that redirect’s origin URL that is the canonical URL.

dmitshur commented 4 years ago

Thanks for providing that information @Zegnat. That rationale makes sense, and I will update my current understanding (and my WIP implementation) to match what you've described.

Edit: The updated implementation and test case can be seen here.

fluffy-critter commented 4 years ago

Any more thoughts on reasonable verbiage for this spec change? I'm still struggling to come up with unambiguous, concise RFC-ese.

Zegnat commented 4 years ago

I think this is a tough one. Especially because the expected behaviour is so ingraned to me already that I have a hard time interpreting the original text as anything other than working. But maybe something along the lines of the following?

Clients MUST start by making a GET or HEAD request to [Fetch] the user’s profile URL to discover the necessary values. Clients MUST follow HTTP redirects (up to a self-imposed limit). If one or more successive HTTP permanent redirects (HTTP 301 or 308) are encountered starting with the very first request, the client MUST use the final redirected-to URL as the user’s canonical profile URL. Temporary redirects (HTTP 302 or 307) break this chain. Temporary redirects must still be resolved for discovery but do not update the user’s canonical profile URL.

I think for clarity we may also want to add another example. But I am not sure how to create an example without getting very verbose with a tree like the one @dmitshur showed above…

dmitshur commented 4 years ago

@Zegnat I like that wording and I think it covers my test case quite clearly.

My main suggestion would be to also consider either adding 303 status code to the list of temporary redirects, or avoiding enumerating an incomplete list.

fluffy-critter commented 4 years ago

303 should never come up in a profile-discovery context.

aaronpk commented 4 years ago

Thanks for the proposed text! That has been merged.

fluffy-critter commented 4 years ago

Yay! I am now technically a W3C specification contributor! Time to update my resume. ;)