Allow selecting an account without email addresses

samuelgoto commented 2 months ago

@solatsuta I'm reposting this here in this repo from what I learned in this so that it doesn't fall through the cracks

There are few related discussions here (https://github.com/privacysandbox/privacy-sandbox-dev-support/issues/379#issue-2362175507, w3c-fedid/FedCM#317 , w3c-fedid/FedCM#227, w3c-fedid/FedCM#607 and w3c-fedid/continuation#4), but I'm going try to untangle and isolate something that I think is self contained (and will resolve parts of the issues listed above).

There is one problem across these many issues that, currently, email is a required field in the IdentityProviderAccount dictionary used in the accounts_endpoint.

email is used in a couple of places (which is why we've been conflating problems) but one of them is that it is used in the select account algorithm as a disambiguation parameter, for example: selecting one of two accounts under the same name but different email work and personal addresses.

We have heard repeatedly (example) that many IdPs don't necessarily have email addresses for their users, so they had to work around the fact that they are required, e.g. by passing an empty string or another form of username.

How should an IdP expose accounts in the accounts endpoint that don't have email addresses?

samuelgoto commented 2 months ago

The proposal that occurred to @solatsuta @cbiesinger and I in this thread was to:

1) Make email optional in the IdentityProviderAccount dictionary. 2) Introduce username to the IdentitityProviderAccount dictionary, as an optional string. We could also introduce phone_number if IdPs used that too. 3) Check if one or the other is available when fetch the accounts 4) Use one or the other in select an account

Here is an example of a valid accounts endpoint response:

{
 "accounts": [{
   "id": "1234",
   "name": "John Doe",
   // For example, if an IdP has id + email as uniq, has to match emails
   "email": "john_doe@idp.example",        
   // For example, if an IdP has id + phone as uniq, has to be numeric
   "phone_number": "(630) 023-3243",   
   // For example, if an IdP has id + name as uniq, can't contain spaces
   "username": "john_doe_123456789",  
   // For example, if an IdP has id + nickname as uniq, can't contain spaces
   "username": "Jack",                        
   "picture": "https://idp.example/profile/123",
   "approved_clients": ["123", "456", "789"],
   "login_hints": ["john_doe"],
   "domain_hints": ["idp.example"],
  }]
}

One open design question is whether to accept an alternative general purpose arbitrary string (e.g. subtitle) or a type-specific string (e.g. email, phone_number or username) for account selection. The former is more extensible but my intuition is that it could lead to more abuse (e.g. subtitle could be set to "Click here to get 15% off") than a string that we can enforce a certain format (e.g. regexes for valid strings), hence why starting with purpose-specific fields and then generalizing if we run into a wall.

This proposal is independent of, but complementary to, w3c-fedid/continuation#4, which controls not select an account but rather request permission to sign-up.

samuelgoto commented 1 month ago

Just to report back to this thread: we discussed this in the last CG call and got no objections to this. @hlflanagan made a suggestion to make the alternative a "general purpose" display name, rather than something as specific as "phone number" or "username", which seemed reasonable in the CG call. more details in the notes.

The next steps are to gather some implementation experience here and send a spec PR if that goes well.

bc-pi commented 1 month ago

... suggestion to make the alternative a "general purpose" display name, rather than something as specific ...

what about "display name"?

samuelgoto commented 1 month ago

what about "display name"?

Yep, I like displayName.

aaronpk commented 1 month ago

I like "display name" because it makes it clear it's not meant to be a globally unique identifier.

TallTed commented 1 month ago

Somewhere, it will need to be made plain where this "display name" is used, and for what — so that the user doesn't set two "display names" to the same value, which makes it impossible to select "the right one", i.e., "the one associated with the right IdP".

I will often choose my display name to be TallTed — but if I set that on GitHub, and GitLab, and ~~Twitter~~ X, and they each just display TallTed, then I might choose any of them from a selector, when I really wanted to select GitHub in 90% of cases, GitLab in 9%, and X in 1%.

aaronpk commented 1 month ago

I don't know if it's spelled out in the spec, but at least the current implementation also displays the URL of the website the account is from, so using the same display name in two different IDPs isn't a problem to disambiguate.

samuelgoto commented 1 month ago

but at least the current implementation also displays the URL of the website the account is from

We display the URL of the website the account is from AND the IdP logo overlaid over the profile picture.

See example below:

npm1 commented 1 month ago

I personally prefer the original proposal (in this issue) to display_name unless there are reasons we think that is insufficient.

aaronpk commented 1 month ago

my concern with username (assuming that's what you're referring to) is that "username" comes with a lot of baggage that isn't actually relevant to how it's used here.

npm1 commented 1 month ago

Hmm yea. What is the semantics of the string you are thinking about?

aaronpk commented 1 month ago

If I remember correctly, this string is only used to show to the user in the UI, and it's a separate id that is sent to the IdP to show which account was selected, correct?

npm1 commented 1 month ago

This is partially correct. Currently, the field is email and it is used both to let the user know which account they are selecting as well as in the disclosure text. I guess the question is whether there is a reason to show a display string which cannot possibly be part of the disclosure text for some reason?

samuelgoto commented 1 month ago

I guess the question is whether there is a reason to show a display string which cannot possibly be part of the disclosure text for some reason?

I believe so: @solatsuta runs an IdP that I believe should be able to disambiguate accounts by a "gamer tag" (so, would want to use a displayName) but disallow it to be disclosed (e.g. avoid it being requested in fields).

npm1 commented 1 month ago

I mean that seems to fit the username field and they can presumably already reject requests which have fields that are not meant to be shared. So I don't think that is a convincing example.

samuelgoto commented 1 month ago

I think the suggestion from @hlflanagan in the CG call and from @bc-pi and @aaronpk in this thread is to support displayName as an alternative to username, because it is more general purpose.

@hlflanagan brought up a use case in libraries (do i get this right, @hlflanagan ?) where library card numbers are human readable but aren't necessarily guaranteed to be globally unique usernames (e.g. they can be globally unique within a library district but not in the system).

Same goes for "gamer tags": they don't necessarily need to be globally unique, but they are still useful for account disambiguation.

npm1 commented 1 month ago

Using a 'display name' that could be repeated in multiple accounts of the same IDP would be a downside, not something I would want to encourage. Because then some users would have no way to know which account they are selecting. This is my problem with this string: sure it does allow arbitrary use cases, but it is vague enough that it encourages developers to put arbitrary data there, perhaps data that would often not be enough to disambiguate accounts.

timcappalli commented 1 month ago

Just for reference of some existing art, WebAuthn credentials use the following structure:

"user": {
  "id": "Iur3CDuIU6INagDNqCSw5jbskmjNJXnmiQvBSDA5cpY",
  "name": "tim@cloudauth.dev",
  "displayName": "Tim Cappalli"
}

id is required and is the unique identifier for the account at the RP (if it were to be used in FedCM, it would be directed/pairwise) name is the primary account identifier displayed to the user displayName is an optional field that can be displayed by clients to in addition to name (currently only GPM on Android shows displayName)

Just a note, I do have this as an item for the larger discussion at TPAC: https://github.com/w3c/tpac2024-breakouts/issues/49

w3c-fedid / FedCM

Allow selecting an account without email addresses #639