indieweb / indieauth

IndieAuth.net website code and IndieAuth Specification
52 stars 7 forks source link

Remove requirement for same domain #56

Closed Zegnat closed 3 years ago

Zegnat commented 3 years ago

AFter the merge of #53, it looks like we could potentially get rid of the same domain requirement.

One use-case this could open up for:

  1. I go to indieweb.org and I login with the domain zegnat.net.
  2. zegnat.net links to the same authorization endpoint as vanderven.se/martijn does.
  3. Even though zegnat.net was the canonical profile URL discovered by the client (indieweb.org), the authorization endpoint can answer with a me value of https://vanderven.se/martijn/.
  4. When the forging check from #53 is done, it proofs the AS is trusted by vanderven.se/martijn and the login can continue with this me value.

This would allow people to login using short domains, even if those short domains host different content and are not a redirect.

This PR is both to identify where the spec needs changing, but also as a ticket to discuss possible other security issues. So please leave pros and cons!

dmitshur commented 3 years ago

This is a question to understand the suggested change better. Are all the existing Redirect Examples still applicable, relevant and useful after this change? Is it worth adding another entry to demonstrate what this change enables?

Edit: I think this paragraph highlights the difference compared to what’s demonstrated in existing examples:

This would allow people to login using short domains, even if those short domains host different content and are not a redirect.

Zegnat commented 3 years ago

Are all the existing Redirect Examples still applicable, relevant and useful after this change?

The idea presented here is basically an iteration on what is enabled by the check from #53. As such I had not given the Discovery by Clients step much thought.

One of the planned todos that came out of the IndieAuth Pop-up session was to clarify that the extra discovery steps from #53 only need to happen when the returned me value does not match the already discovered:

TODO: Note that the extra verification applies only if the original and current profile URL do not match EXACTLY.

One could also argue that if we do not add that todo, and always ask clients to rediscover the endpoint on the given me, there is no longer any reason for clients to discover a canonical profile URL at all. As they will always be allowed to use whatever the AS is returning as me value. That would basically get rid of the redirect logic in its entirety.

So I guess that begs the question: what does the client need the canonical profile URL for?

Edit: I think this paragraph highlights the difference compared to what’s demonstrated in existing examples:

This would allow people to login using short domains, even if those short domains host different content and are not a redirect.

It does. But it has nothing to do with the redirect logic per se. What this change does is allow for the AS to return any me value regardless of what URL the user entered into the client, and still allow the client to accept that me value because it validates trust rather than hardcoding that trust is on hostname/domain basis.

I am not sure if it is worth writing up examples of this in the spec itself. Nor am I sure in what section of the spec that example would go.

aaronpk commented 3 years ago

I'm generally in support of this, but it feels like we should discuss this a bit more before merging. I like that it removes the need to specify a same-domain match, which would otherwise require URL parsing, which has its own issues. This can be implemented with simple string matching so seems safer.

what does the client need the canonical profile URL for?

Clients will need the canonical URL when they want to have a stable identifier for the user, for the case where the client may be storing data about the user such as the list of wiki edits a user has made, preferences within an app, etc.

I do think we should add examples of this to the redirect examples section. We should also add a counter example showing when a client should reject a case.

Zegnat commented 3 years ago

I'm generally in support of this, but it feels like we should discuss this a bit more before merging.

Agreed! The PR exists mostly as the conversation starter :+1:

what does the client need the canonical profile URL for?

Clients will need the canonical URL when they want to have a stable identifier for the user, for the case where the client may be storing data about the user such as the list of wiki edits a user has made, preferences within an app, etc.

Surely it should be using the me value returned at the end by the authorization endpoint for this?

As the canonical URL could also be a multi-user domain (mastodon.example) where the authorization endpoint will return the user specific identity after login (mastodon.example/@example). And with the new verification steps from #53 this user specific identity can be verified without having to compare it with some automatically decided canonical URL from discovery.

Zegnat commented 3 years ago

Pushed a rebase on current main branch after the big https://github.com/indieweb/indieauth/commit/613364c29e4b559cd12d7f0ffc1455d716a66edf update.

nekr0z commented 3 years ago

Just for the info: OpenID explicitly defines a way to specify a URL to be authenticating as in the form of <link rel="openid.delegate" href="https://vanderven.se/martijn/">. The reason they included it is somewhat reverse to what @aaronpk mentioned, but in actual reality I've seen providers using this backwards to do exactly what is discussed here (i.e. any website that is used to authenticate using the same openid.delegate is considered the same user account).

Zegnat commented 3 years ago

So if I am reading that OpenID link correctly, that seems to work like this:

  1. On https://zegnat.net/ I have a document that says that https://vanderven.se/martijn/ may identify as me.
  2. When a client is pointed at zegnat.net it will actually go to vanderven.se/martijn/ and do the entire IndieAuth flow.
  3. When that IndieAuth flow goes through and the authorization (and/or token) endpoint acknowledges the login with me=https://vanderven.se/martijn/ the clients treats this as if my identity is https://zegnat.net/.

This seems slightly different from what I am proposing here. What I am proposing is more like:

  1. When a client is pointer at vanderven.se/martijn/ it does the entire IndieAuth flow.
  2. When that IndieAuth flow goes through and the authorization (and/or token) endpoint acknowledges the login it can return a me=https://zegnat.net/ because the user wanted to use this as identity.
  3. When the client gets this completely new me it verifies that https://zegnat.net/ acknowledges the authorization endpoint that was used.

So step one becomes step 3 and the authorization endpoint is the one deciding what domain the user wants to be using.

The only real difference here seems to be that the authorization endpoint always needs to know about the URLs the user wishes to use. While in the OpenID case the authorization endpoint does not need to know this and the user can basically setup “aliases”.

I wonder if there is any security implication with this aliasing that OpenID allows there.

Zegnat commented 3 years ago

After more conversations in the last couple of days, I feel even more that this is the logical way forward. I revisited this PR to try and be even more clear. Changes broken out in 3 separate commits for (hopefully) easier reviewing.

  1. If we do not do any same-domain checks, we do not need to discover anything to check against at the start. This means we should be able to rid of all the weird rules around redirect following. This has always been somewhat confusing, and turned out to be a huge chore. Even described as harder than adding the much anticipated security feature PKCE.
  2. Because discovery does not give us a confirmed canonical profile URL, we should not describe it as such. Lets refer to it as the user provided URL instead. Also make sure the authorization endpoint and token endpoint flows use the exact same wording to refer to this.
  3. Be more specific about the way we verify the URL: the client should rerun discovery. Add the optional step where the client gets to skip doing this if it already knows what it will result in, this happens if the canonical profile URL returned by the AS matches a URL it visited during discovery.

I am not sure I am super happy about the wording of point 3. Open to tweaking there. Also wondering if we need to do a call-out higher up in the document about what differing means, I guess we mean a literal 1-to-1 string match failing, but not sure that is clear to a first time reader.

aaronpk commented 3 years ago

I think this makes a lot of sense. I'm liking how it removes a lot of text from the spec that is pretty tricky to understand.

The only suggestion I'd have for this PR is to add back the examples. While they aren't necessarily as important as they were previously, I do think it's a good idea to include them so that people understand that they do have the option of entering a different URL to start with compared to the final profile URL that's returned, and all the other various combinations. I think the examples that are important to list with the new language will likely be very different from the ones that were there before, but I'd hate to see the examples section go away entirely.

Off the top of my head, the examples I'd expect to see are:

Are there any other likely common scenarios?

Zegnat commented 3 years ago

The reason I removed the redirect examples is because they have no bearing on discovery of endpoints (clients always resolve all redirects before looking for them) nor on the profile URL (clients always take the me value, and verify it). Whether the user provided URL redirects from http to https means nothing to the IndieAuth flow.

Are there any other likely common scenarios?

Not common, but maybe worth it to explicitly mention:

I can type in “aaron@parecki.com” in an indieauth prompt and it works. because that is a URL.

I am just wondering where this would go. If we can solve that well, I think it would close #65.

E.g. this would be completely valid:

  1. I enter martijn@zegnat.net in the client.
  2. The client applies the steps from URL Canonicalization to URLify it: http://martijn@zegnat.net/.
  3. The client does a GET request and encounters the following:
    1. http://martijn@zegnat.net/ 301 redirects to https://martijn@zegnat.net/ because the server wants to enforce HTTPS.
    2. https://martijn@zegnat.net/ 301 redirects to https://martijn.zegnat.net/ because the URL usernames is "a bit of a 'hack'".
    3. https://martijn,zegnat.net/ 302 redirects to https://zegnat.net/ because currently there is no such user-specific page.
  4. The client discovers an authorization endpoint on https://zegnat.net/: https://sso.zegnat.net/.
  5. The client does the OAuth code exchange with https://sso.zegnat.net/ all the way through. This results in a final return JSON that includes a me value of https://vanderven.se/martijn/ as the canonical Profile URL.
  6. The client has not seen https://vanderven.se/martijn/ before, it redoes discovery on that URL and find the same authorization endpoint: https://sso.zegnat.net/.
  7. The client accepts https://vanderven.se/martijn/ as the identity for me.

Nowhere in any of the steps past 3 (discovery) did the redirect path matter in the slightest. Except for in step 6 where the client could optionally skip redoing discovery if it has already seen the Profile URL during its previous discovery.

I think there is a bigger value in making it clear that the user-entered URL at the start is not expected to be the canonical Profile URL, than in giving a bunch of examples of how redirects are allowed to function during discovery. Unless there is some specific redirect we do not want to allow at all during discovery.

aaronpk commented 3 years ago

Okay I see what you mean about those redirect examples not being necessary, but in the current spec that section is the only place that really shows how someone can use two different domains. We could move the examples into the security considerations section "Differing User Profile URLs" since that's where the text that enables this lives.