w3c-fedid / FedCM

A privacy preserving identity exchange Web API
https://w3c-fedid.github.io/FedCM/

Directed identifiers problematic for efforts around portable identities #56

Open tplooker opened 3 years ago

tplooker commented 3 years ago

In today's federated identity landscape, end-user identities are mostly scoped or tied to the IDP that creates and manages them. This can be a problematic relationship because end users can become somewhat beholden to their IDP. Recently, there has been increased interest in how to establish identities that can outlive, or be ported between, their providers by leveraging cryptographically verifiable identifiers for end users (e.g., public keys, public key fingerprints, or resolvable identifiers like Decentralized Identifiers). Evidence of this can be found in SIOP V2, an early-stage draft on portable identities, and DID SIOP. Because of this, it becomes problematic for the browser to enforce directed identifiers, as they would remove the ability for the RP to validate proof of control over the subject identifier via a supplied digital signature.

samuelgoto commented 3 years ago

Yep, this is a great point. I'm not entirely convinced that these are entirely conflicting goals, but I agree that there is a tension.

One of the things that occurred to us was establishing a convention for how to create a directed identifier. For example, if these were:

SHA256(user@email, rp.com)

Then, independently of which IDP provides this, it would still lead to the exact same number, making it portable, but still directed.

(We ran into a wall on this scheme because user@email.com is easily enumerable, so this is easily brute-forceable.)
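
A minimal sketch of that convention and of why it is brute-forceable. The `email|rp` encoding, the function names, and the candidate list are assumptions made for illustration; the thread does not fix an exact input format.

```python
import hashlib
from typing import List, Optional


def directed_id(email: str, rp: str) -> str:
    """Hypothetical encoding of SHA256(user@email, rp.com): hash of 'email|rp'."""
    return hashlib.sha256(f"{email}|{rp}".encode("utf-8")).hexdigest()


# Portable: any IDP that knows the email derives the same value for a given RP.
id_from_idp_a = directed_id("user@email.com", "rp.com")
id_from_idp_b = directed_id("user@email.com", "rp.com")

# Directed: a different RP sees an unrelated value.
id_for_other_rp = directed_id("user@email.com", "other-rp.com")


def brute_force(target: str, rp: str, candidate_emails: List[str]) -> Optional[str]:
    """Recover the email behind a directed identifier by hashing guesses."""
    for email in candidate_emails:
        if directed_id(email, rp) == target:
            return email
    return None


# Assumed list of guessable addresses; real attackers would use far larger lists.
guesses = ["alice@email.com", "bob@email.com", "user@email.com"]
recovered = brute_force(id_from_idp_a, "rp.com", guesses)
```

Because the only input an attacker does not already know is the email address, anyone holding a list of candidate addresses can reverse the identifier by repeating the same hash.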

The other formulation that occurred to us was to have the browser / user agent do that joining (e.g. have IDPs give user agents a list of all of the [RP, directed identifier] pairs and have the user agent join that list from different IDPs to connect them).
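
A rough sketch of what that join could look like inside a user agent. The data shapes, the IDP and RP names, and the assumption that the user agent already knows both reports belong to the same user are all hypothetical; none of this is defined by the proposal.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical shape: each IDP reports the (RP, directed identifier) pairs
# it has issued for the current user.
Pairs = List[Tuple[str, str]]


def join_pairs(reports: Dict[str, Pairs]) -> Dict[str, Dict[str, str]]:
    """Group directed identifiers by RP: {rp: {idp: directed_identifier}}."""
    by_rp: Dict[str, Dict[str, str]] = defaultdict(dict)
    for idp, pairs in reports.items():
        for rp, identifier in pairs:
            by_rp[rp][idp] = identifier
    return dict(by_rp)


reports = {
    "idp-a.example": [("rp.com", "a-123"), ("shop.example", "a-456")],
    "idp-b.example": [("rp.com", "b-789")],
}
linked = join_pairs(reports)
```

Here `linked["rp.com"]` holds the identifiers both IDPs issued for the same RP, which is the link the user agent would need in order to treat them as one account.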

None of these are perfect solutions, but just wanted to be transparent about the lines of explorations.

WDYT?

tplooker commented 3 years ago

Yeap agree, in general what you could say is that if you offload the responsibility of validating the subject (End-User) identifier control to the browser, then the browser could derive a directed identifier for the RP that would remain consistent even if the subject (End-User) changed IDP. I do however think that this in practise could lead to a lot of complexity that the browser must shoulder to accomplish it.

csuwildcat commented 3 years ago

I'd like to note that a Decentralized Identifier is, at its core, just a URI scheme that returns a resolved JSON document with some form of verification material proving the ID's state is legitimate. You can model just about any type of identifier and verifiable material combo as a DID. For example, did:directidp:SHA256(user@email, rp.com) could be the directed IDP identifier URI formulation, and you could return your verification token/material in the DID Document that is 'resolved' by the browser/IDP intermediary.
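
A minimal sketch of that formulation. `did:directidp` is the illustrative method name from the comment above, not a registered DID method; the `email|rp` hash encoding and the `directedIdpProof` property are assumptions and are not part of DID Core.

```python
import hashlib
import json


def directed_did(email: str, rp: str) -> str:
    """Build a DID whose method-specific id is a hex SHA-256 digest (assumed encoding)."""
    digest = hashlib.sha256(f"{email}|{rp}".encode("utf-8")).hexdigest()
    return f"did:directidp:{digest}"


def did_document(did: str, verification_token: str) -> dict:
    """Return a bare-bones DID Document carrying the IDP's verification material."""
    return {
        "@context": "https://www.w3.org/ns/did/v1",
        "id": did,
        # Hypothetical extension property: where the IDP's token/material might live.
        "directedIdpProof": verification_token,
    }


did = directed_did("user@email.com", "rp.com")
doc = did_document(did, "example-token")
serialized = json.dumps(doc, indent=2)
```

The document is what a resolver (here, the browser or IDP intermediary) would hand back for the DID; how the token is produced and verified is left open.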

Now obviously this isn't a 'true' DID in terms of its degree of decentralization, but there are other DID Methods, like did:web, that are also quite centralized. We, as a web community and group of powerful companies, need to be careful that things like did:web and 'directed IDP identifiers' are not used for anything critical in the lives of individuals, as a matter of human rights. To illustrate this point, imagine a 'directed IDP identifier' was used as the ID anchor for a person's bank account, food stamps, driver's license, etc., then the person posts something on social media that the IDP/intermediaries (who actually own the centralized ID) dislike, so the IDP/intermediary decides to erase/block everything to do with that person, including their connection to their 'directed IDP identifier'. BOOM, you've just been digitally depersoned, which isn't far-fetched given it has happened to millions of people over the last few years. I may not agree politically with those who were erased, but our duty and charge as ethical and responsible technologists is to be aware that things like did:web and 'directed IDP identifiers' have the potential to cause an inadvertent dystopia, and to take care to avoid it.

csuwildcat commented 3 years ago

In summary, I see nothing preventing you from modeling your 'directed IDP identifiers' as DID URIs with a DID Document that includes whatever proofing output you generated. If you create a wholly separate scheme and convention for this, we're just going to have to manage and integrate both separately, resulting in a lot more code for the UAs to write and manage, and potentially a bifurcation of the browser API surface. Having a unified scheme for IDs is the first step, and I am extremely reluctant to support integration in Edge of one-off identifier schemes when we have a unifying convention which is now entering CR and on track for final ratification within the year.

samuelgoto commented 3 years ago

I don't think anyone is rejecting any formatting suggestion/convention. I just don't think that's the hardest problem right now. I believe mathematics (the right crypto) and economics (the right set of incentives and feedback loops) is, so most of the conversation has been around that.



csuwildcat commented 3 years ago

I believe mathematics (the right crypto) and economics (the right set of incentives and feedback loops) is.

That's absolutely a more critical and difficult piece, but surely you're aware we (as Microsoft and a DID community) have been working for years to get that figured out, and now have solutions that provide the scale, features, and security required for mass deployment, right?

samuelgoto commented 3 years ago

I am genuinely not and would be happy to be informed. Is there something off the shelf that solves our problem to preserve federation with more control over tracking?

I'd be genuinely happy to hear about the math and economics, aside from (a) notation and formatting (i.e. explain it to me using a basic understanding of math but nothing more) and (b) solving a different problem other than the preservation of federation.



csuwildcat commented 3 years ago

@samuelgoto I think we need to see this as two separate concerns that both share a common root:

  1. Use a common URI scheme and resolution output, because creating a bunch of duplicative formats and models is the first obvious, low-hanging thing to avoid.
  2. With IDs/identity, make sure to separate the use of ID mechanisms that are not capable of safely being the foundation for true identity from those that are (while maintaining one common URI scheme and set of data models).
  3. Hopefully, if we can, enable a flow that allows IDPs, RPs, and assertions to be modeled as Verifiable Credentials that can prove the desired facts the involved parties can rely on.

I'd argue we can do 1 and 2 now, and that we have the primitives for 3, but need to get in a room and see if we can reorient these flows to utilize them, vs piling on models that simply are not a good foundation for real digital identity.