w3c-fedid / FedCM

A privacy preserving identity exchange Web API
https://w3c-fedid.github.io/FedCM/

Allow IDPs to use multiple config files within an eTLD+1 #552

Open cbiesinger opened 7 months ago

cbiesinger commented 7 months ago

This is a proposal to solve some of the use cases from issue #333

The Problem

FedCM makes a credentialed fetch to the identity provider (IdP) to gather the user’s accounts before prompting the user to select one of those accounts and to give explicit permission to share the selected account’s details with the relying party (RP). Because of that credentialed request, FedCM has to make sure that the RP cannot collude with the IdP in a way that would allow the user to be tracked before the user grants permission.

Early on, an attack was identified, where the RP could insert its location in the credentialed fetch URL using the JS API:

const cred = await navigator.credentials.get({
  identity: {
    providers: [{
      url: `https://idp.example/${window.location.href}`
    }]
  }
});

So, we introduced a .well-known file that forces the IdP to enumerate the valid configURLs that it supports (as a list of “provider_urls” in the .well-known file), so that the RP cannot use the URL to collude with the IdP:

{
  // This array size is kept small (currently at 1), so that there is not enough
  // entropy to distinguish relying parties.
  "provider_urls": ["https://idp.example/config.json"]
}

The JS API still requires the RP to specify the desired configURL to use, and FedCM ensures that the configURL specified in the JS API call matches one of the configURLs listed in the .well-known file:

const cred = await navigator.credentials.get({
  identity: {
    providers: [{
      configURL: `https://idp.example/config.json`,
      ...
    }]
  }
});

That solution generally worked well until IdPs needed more than one configURL (e.g., to support multiple test and production configurations): every extra URL we allow in the provider_urls array adds entropy that IdPs and RPs could use to collude with each other.

cbiesinger commented 7 months ago

Proposal

The config files that the configURLs point to specify a number of IdP endpoints. However, the core of this proposal is the fact that the only endpoints that we need to protect against collusion are (a) the “accounts_endpoint”, because it is the only credentialed endpoint that is used before the user gives any explicit permission, and (b) the “login_url”, because it is loaded in a pop-up window where the user can enter their username/password, which joins the RP and IdP identities before the user has explicitly given a browser-mediated permission.

With that in mind, the proposal is to take advantage of that fact, and (a) introduce an “accounts_endpoint” and a “login_url” parameter to the “well-known” file and (b) skip the check when the IdP and the RP are in the same eTLD+1:

{
  "provider_urls": ["https://idp.example/fedcm.json"],
  "accounts_endpoint": "https://idp.example/accounts",
  "login_url": "https://idp.example/login"
}

And change the algorithm that checks for the well-known file to be:

  1. When “accounts_endpoint” is not provided, continue as usual (i.e., check that the configURL is listed in provider_urls).
  2. When “accounts_endpoint” / “login_url” is available, ignore the “provider_urls” property and check that the “accounts_endpoint” / “login_url” in the well-known file matches the value in the config file that the passed configURL points to. (open question: should we make accounts_endpoint optional in the config file if it is provided in the well-known file?)
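
The two steps above can be sketched roughly as follows. This is a minimal illustration, not browser or spec code: the function name `passesWellKnownCheck` and its parameters are hypothetical, it assumes the new path applies only when both “accounts_endpoint” and “login_url” are present in the well-known file, it assumes endpoint values in the config file may be relative to the configURL, and it leaves out the same-eTLD+1 exemption.

```javascript
// Hypothetical helper, for illustration only.
//   wellKnown: parsed contents of the IdP's well-known file
//   configURL: the configURL the RP passed to navigator.credentials.get()
//   config:    parsed contents of the config file fetched from configURL
function passesWellKnownCheck(wellKnown, configURL, config) {
  const hasNewFields =
    typeof wellKnown.accounts_endpoint === "string" &&
    typeof wellKnown.login_url === "string";

  if (!hasNewFields) {
    // Step 1: legacy behavior, the configURL must be listed in provider_urls.
    const urls = Array.isArray(wellKnown.provider_urls)
      ? wellKnown.provider_urls
      : [];
    return urls.includes(configURL);
  }

  // Step 2: ignore provider_urls and compare the two protected endpoints.
  // Assumption: values are resolved against the configURL, so relative and
  // absolute spellings of the same endpoint compare equal.
  const resolve = (value) => new URL(value, configURL).href;
  return (
    resolve(config.accounts_endpoint) === resolve(wellKnown.accounts_endpoint) &&
    resolve(config.login_url) === resolve(wellKnown.login_url)
  );
}
```

With this shape, any number of config files (say, a staging one and a production one) would pass as long as they resolve to the same two endpoints, while a config file that appends RP-specific data to the accounts endpoint would be rejected.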

That allows us to:

  1. Maintain backwards and forwards compatibility with existing “well-known” files and older browser versions that are already deployed in the wild,
  2. Have arbitrarily many “configURLs”, as long as they all point to the same accounts_endpoint and login_url, and
  3. Leave no opportunity for entropy to be added to the credentialed fetch request made to the accounts_endpoint, since it has to be specified at the “well-known” level.

The shortcoming of this proposal is that it doesn’t allow an IdP to have multiple different “accounts_endpoint” values, but (a) this is no stricter than the current behavior and (b) we haven’t run into a real-life case where that’s a problem. The intuition is that this is a solid step forward that can take us a bit further until we run into another deployment scaling challenge.

End State

Once we introduce the check for “accounts_endpoint”, we could plausibly:

  1. deprecate the “provider_urls” parameter in the well-known file and
  2. deprecate the “accounts_endpoint” and the “login_url” in the configURL.

The well-known file would then just look like this:

{
  "accounts_endpoint": "https://idp.example/accounts",
  "login_url": "https://idp.example/login"
}

An open question is whether we should change FedCM to require this new syntax and force IDPs to provide the two new fields, or allow IDPs to choose either option. IDPs that do not need the flexibility of multiple configURLs may prefer keeping the list of endpoints in a single file and therefore stick with the old syntax.

Alternatives Considered

Allow multiple URLs in the provider_urls field in the well-known file

Because the configURL is selected by the RP (or the IDP’s SDK), this provides additional bits of entropy – each configURL can specify a different accounts endpoint, to which the browser would then make a credentialed request. In addition, it is not clear what the right number of URLs to allow is. There is no principled argument why 2, 3 or any other number is the correct value here.
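
To put a rough number on the entropy concern: if the RP can pick freely among n permitted configURLs, a single call can signal at most log2(n) bits to the IdP. The sketch below only does that arithmetic; the function name `maxLeakedBits` is made up for illustration, and the bound assumes one choice per call and ignores any other side channel.

```javascript
// Illustrative only: upper bound on the number of bits an RP could signal per
// call by choosing one of `allowedConfigUrlCount` permitted configURLs.
function maxLeakedBits(allowedConfigUrlCount) {
  if (!Number.isInteger(allowedConfigUrlCount) || allowedConfigUrlCount < 1) {
    throw new RangeError("allowedConfigUrlCount must be a positive integer");
  }
  return Math.log2(allowedConfigUrlCount);
}

// Print the bound for a few candidate list sizes.
for (const n of [1, 2, 3, 8]) {
  console.log(`${n} configURL(s): up to ${maxLeakedBits(n).toFixed(2)} bits per call`);
}
```

A list of size 1 is the only size that yields zero bits, which is consistent with the point that no larger number has a principled justification.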

Fetch the .well-known file from the Login Status API site rather than the eTLD+1

For the specific use case of staging environments, we could instead do the following:

  1. Only allow calling the login status API from one subdomain per eTLD+1
  2. Only allow FedCM calls to the subdomain that was used by the login status API. We would still need to fetch a well-known file from that host (so that you couldn’t use entropy in the URL).

This would allow staging environments to assume that the user is only logged in to the staging instance.

However, it would not help IDPs who need different configurations in production.

Caching the accounts during the Login Status API

If the account list were provided by the login status API ahead of time, we wouldn’t need to make a credentialed request at all and could allow an unlimited number of config URLs. The navigator.setLoggedIn call could take an array of account objects that specify the ID, name, email, etc.

But this would add a lot of complexity around expiration times, how to keep the name/picture fresh, etc.

wseltzer commented 1 month ago

Discussed at TPAC 2024 https://github.com/fedidcg/meetings/blob/main/2024/2024-09-24-TPAC-notes.md#multiple-config-urls

Proposal advanced to Stage 2