Global Discovery of Hyperties per IDP Identifier

pchainho commented 8 years ago

How can I use a User Identifier from an IDP (eg Google email) in order to discover associated Hyperty Instances?

sgoendoer commented 8 years ago

Hello Paulo,

this use case has not really been discussed so far.

Our perspective on user identifiers was so far that "IdP user identifiers" are used independently from the reTHINK architecture. E.g. "signing up with Facebook on reTHINK service X" (as "login credentials" so to speak), but not using this as a way to address/contact other users. From our perspective, a user would always use the GUID (or a known UserID / HypertyURL from a previous connection) to contact another user or device.

This has been described in Deliverable D4.1 (e.g. Section 4.1.6), in the Umbrella Scenario in Aveiro (DT_Methodology alignment0.2.pptx), the Registry discussion in Paris two weeks ago (WP4-registries.pdf) and also in the ICIN paper (Section V). Also, see [WP2 issue #71](https://github.com/reTHINK-project/architecture/issues/71).

Using an identifier issued by an IdP instead of the GUID would cause the following problem, or at least raise some questions. Let's assume a user Alice wants to use her Facebook IdP identity (let's call it "IdP-id") "alice@facebook.com" to sign up with several services in reTHINK. This Facebook identifier would then probably be "linked" to services in different CSP domains. How should this IdP-id be resolved? The only information "in" this IdP-id is the Facebook domain. Hence, would Facebook need to resolve the id? Or do we need another "domain-independent" registry for resolving IdP-ids? What would happen if an IdP-id exists, but no reTHINK service is "linked" to it? Show an error message? The following scenario is even more complicated: what if an IdP-id exists and some, but not all, of the reTHINK services/accounts "behind" it use this IdP-id? Let's assume Alice has two hyperties, "h1" (@csp1.org) and "h2" (@csp2.org). Only h1 uses the Facebook IdP-id, while h2 uses another IdP-id (e.g. alice@wonderland.com). Logically, only h1 should be "found" when searching for "alice@facebook.com". The question is: how?
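
To make the h1/h2 scenario concrete, here is a minimal sketch of the lookup that would be needed. It assumes a hypothetical registry that stores, for every hyperty, the IdP-id it was signed up with; the `Hyperty` class, `HYPERTIES` list and `find_by_idp_id` function are illustrative names only and do not exist in reTHINK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperty:
    url: str     # hypothetical hyperty address, as written in the example
    idp_id: str  # IdP-id that was used to sign up for this hyperty


# Alice's two hyperties from the scenario above (assumed data layout).
HYPERTIES = [
    Hyperty(url="h1@csp1.org", idp_id="alice@facebook.com"),
    Hyperty(url="h2@csp2.org", idp_id="alice@wonderland.com"),
]


def find_by_idp_id(hyperties, idp_id):
    """Return only the hyperties linked to the given IdP-id."""
    return [h.url for h in hyperties if h.idp_id == idp_id]


found = find_by_idp_id(HYPERTIES, "alice@facebook.com")
```

The filter itself is trivial. The open question is where such a cross-domain list with a per-hyperty IdP link would live, since neither the IdP nor a single CSP domain holds it.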

My proposal would be to NOT use an IdP-id to initiate a connection between reTHINK users/devices, but use the GUID or UserID instead at all times. That's what the whole registry infrastructure was designed for.

Anyhow: if you want to "resolve" an IdP-id, you (as the user) could publish this information (i.e. "IdP-id -> GUID") in the discovery service, I guess. Then, a user can discover your GUID and connect to you in the originally planned fashion.

jmcrom commented 8 years ago

First of all, what is the actual need? Except for a self-care need, making a list of all hyperties deployed on behalf of a single identity looks like a privacy issue to me...

Second, as Sebastian wrote, the basic idea of registry services is to be domain-independent (both from CSP and IDP domains). Discovery service should provide additional search features.

sgoendoer commented 8 years ago

The privacy issues Jean Michel mentioned are of course another important thing to take into account.

rjflp commented 8 years ago

I agree that the proper place to address this issue would be the Discovery service. Even though not much is known about it, the need @pchainho expresses seems compatible with what I believe will be the services offered by the Discovery Service.

So far, we have agreed that the role of the Global Registry is to provide a mapping between GUID -> UserIDs. These UserIDs map to the CSPs' Domain Registries. Can we derive the Domain Registry from the User Identifier for an IdP? I guess not as these are transversal to the CSPs!

If we ignore the privacy issues pointed out by @jmcrom, we could come up with a technical solution using the Global Registry. We could add another mapping, userID -> GUID, to the Global Registry. But then we have two issues:

1. The one described by @sgoendoer: you get all the user's hyperties, and not just the ones associated with that IdP.
2. Who controls the entries? What prevents two users from entering the same Id (e.g. alice@facebook.com) pointing to two different GUIDs (only one would be stored, but they could keep changing the value)? A possible solution would be to only accept the entry IdP->GUID if the entry GUID->IdP exists and the same cryptographic key is used to sign the put request. But nothing stops users from writing whatever they want in their GUID mappings!
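
The acceptance rule above, and the reason it is not sufficient, can be sketched as follows. This is a toy in-memory model, not the Global Registry API: the class and method names are hypothetical, and plain key equality stands in for verifying a cryptographic signature.

```python
class GlobalRegistrySketch:
    """Toy stand-in for the Global Registry (assumed structure)."""

    def __init__(self):
        self.guid_records = {}  # GUID -> (owner_key, set of identifiers)
        self.reverse = {}       # IdP-id -> GUID

    def put_guid_record(self, guid, owner_key, identifiers):
        # Self-signed record: the owner may list any identifiers they like.
        self.guid_records[guid] = (owner_key, set(identifiers))

    def put_reverse(self, idp_id, guid, signing_key):
        """Accept IdP-id -> GUID only if GUID -> IdP-id exists with the same key."""
        record = self.guid_records.get(guid)
        if record is None:
            return False
        owner_key, identifiers = record
        # Key equality models "signed with the same cryptographic key".
        if signing_key != owner_key or idp_id not in identifiers:
            return False
        self.reverse[idp_id] = guid  # last writer wins
        return True


registry = GlobalRegistrySketch()
registry.put_guid_record("guid-alice", "key-alice", ["alice@facebook.com"])
# Bob simply claims Alice's identifier in his own self-signed record.
registry.put_guid_record("guid-bob", "key-bob", ["alice@facebook.com"])

alice_ok = registry.put_reverse("alice@facebook.com", "guid-alice", "key-alice")
bob_ok = registry.put_reverse("alice@facebook.com", "guid-bob", "key-bob")
```

Both writes pass the check, and the entry for alice@facebook.com ends up pointing to Bob's GUID, which is the weakness noted in the last sentence of the second point.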

Overall, like @sgoendoer, I believe that the best solution for this problem is to use the Discovery Service. If this need is just to provide a stopgap solution while the Discovery Service is unavailable, perhaps we can come up with a better solution, or just wait.

rjflp commented 8 years ago

I was discussing this issue with @Ricardo-Chaves and @nuno-santos and they came up with another proposal for securing the introduction of the new mapping into the DHT.

The idea would still be to introduce a new mapping into the DHT: userID -> GUID. We would require the user to provide an assertion proving he owns the userID. The DHT nodes would validate this assertion with the IdP before committing the write. We still have the issue described by @sgoendoer: you get all the user's hyperties, and not just the ones associated with that IdP.

sgoendoer commented 8 years ago

Obviously, we could add a mapping of a (hashed) UserID to a GUID into the DHT. As this mapping rarely changes, this probably wouldn't put too much additional load on the service.

This would mean that we would have two different data objects in the global registry. The one we have right now (GUID -> UserID) and (possibly multiple versions of) the reverse one (UserID -> GUID).

Regarding the "proof of authenticity":

The dataset "GUID -> UserID" is self-signed, the lookup key ( == GUID) is bound to the key and hence cannot be changed. If we add another dataset "UserID -> GUID", we can obviously sign this new dataset with the same key. Anyhow, anyone could just create those datasets with a valid signature.

E.g.: Bob creates an (unauthorized) version for Alice: "AliceUserId -> BobGUID". As Alice's UserID is hashed and used as the lookup key in the DHT, a "valid" version created by Alice would be overwritten by Bob's faked version. From then on, Alice's UserID would point to Bob's GUID data record, and since his key is linked there, Bob would also be able to create a valid signature for the faked mapping.
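
A minimal sketch of this overwrite, under stated assumptions: the DHT is modelled as a plain dictionary, SHA-256 is assumed as the hash for the lookup key, and a key comparison stands in for signature verification. The names `lookup_key`, `put_mapping` and `resolve` are hypothetical.

```python
import hashlib

dht = {}  # lookup key -> GUID (toy stand-in for the DHT)

# Public key linked in each GUID record (assumed, simplified).
guid_keys = {"AliceGUID": "alice-public-key", "BobGUID": "bob-public-key"}


def lookup_key(user_id):
    """Hash the UserID to obtain the DHT lookup key (SHA-256 is an assumption)."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


def put_mapping(user_id, guid, signer_key):
    """Store UserID -> GUID if the signer's key matches the target GUID record."""
    if guid_keys.get(guid) != signer_key:
        return False
    dht[lookup_key(user_id)] = guid  # same lookup key, so older entries are replaced
    return True


def resolve(user_id):
    return dht.get(lookup_key(user_id))


put_mapping("alice@facebook.com", "AliceGUID", "alice-public-key")  # Alice's entry
put_mapping("alice@facebook.com", "BobGUID", "bob-public-key")      # Bob's fake
resolved = resolve("alice@facebook.com")
```

Bob's write is "validly signed" with respect to his own GUID record, so the only check available here accepts it and the lookup now returns his GUID.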

rjflp commented 8 years ago

@sgoendoer We want to make sure that no one is able to steal our "traffic". If we have a mapping UserID -> GUID but only verify the GUID (by checking that the same key is being used), then, as you describe, anyone with a GUID could steal someone else's "traffic" by creating a mapping SomeoneElse'sUserID -> MyGUID.

On the other hand, if we only verify that someone owns the UserID, one could forward searches for that UserID to someone else's GUID.

Perhaps the best solution would be to verify that:

1. The user owns the UserID (the IdP would validate an assertion)
2. The user owns the GUID (the write operation is signed with a key so that hash(public key) = GUID)
3. There already exists at least one mapping GUID -> UserId@SP
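
The three checks could be combined roughly as below. This is only a sketch: `WriteRequest`, `guid_from_key` and `may_commit` are hypothetical names, SHA-256 is assumed as the GUID hash, the IdP is represented by a caller-supplied callback, and verification of the signature on the write itself is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    user_id: str       # IdP-issued identifier to be mapped
    guid: str          # target GUID
    public_key: bytes  # key the write operation is signed with
    assertion: str     # IdP assertion proving ownership of user_id


def guid_from_key(public_key):
    """Assumption: the GUID is the hex SHA-256 digest of the public key."""
    return hashlib.sha256(public_key).hexdigest()


def may_commit(req, idp_validates, guid_to_user_ids):
    """Return True only if all three conditions hold."""
    owns_user_id = idp_validates(req.user_id, req.assertion)      # check 1
    owns_guid = guid_from_key(req.public_key) == req.guid         # check 2
    has_forward = len(guid_to_user_ids.get(req.guid, ())) > 0     # check 3
    return owns_user_id and owns_guid and has_forward


# Usage with a fake IdP callback and one existing GUID -> UserID mapping.
alice_key = b"alice-public-key"
alice_guid = guid_from_key(alice_key)
forward = {alice_guid: ["alice@csp1.org"]}


def fake_idp(user_id, assertion):
    return assertion == "valid-for-" + user_id


good = WriteRequest("alice@facebook.com", alice_guid, alice_key,
                    "valid-for-alice@facebook.com")
stolen = WriteRequest("alice@facebook.com", alice_guid, b"bob-public-key",
                      "valid-for-alice@facebook.com")
good_ok = may_commit(good, fake_idp, forward)
stolen_ok = may_commit(stolen, fake_idp, forward)
```

Note that check 1 depends on the IdP being able to validate the assertion at the moment the DHT node processes the write.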

rjflp commented 7 years ago

The solution I proposed in the previous post does not work as the IdP would only hold the assertion for a few minutes.

reTHINK-project / dev-registry-global

Global Discovery of Hyperties per IDP Identifier #11