dfinity / wg-identity-authentication

Repository of the Identity and Wallet Standards Working Group
https://wiki.internetcomputer.org/wiki/Identity_%26_Authentication
Apache License 2.0

ICRC-25: Restructuring of the `permission` request #65

Closed frederikrothenberger closed 8 months ago

frederikrothenberger commented 9 months ago

In the working group session on October 3 (recording), it was suggested by @plitzenberger to restructure the ICRC-25 permission message. The feedback was that all messages should only have a single functionality (e.g. similar to the way Ethereum JSON-RPC APIs are designed).

Currently, the permission request:

The alternative approach is:

This issue should be used to discuss how we want to proceed with the messages and which set of messages should end up in the ICRC-25 standard.

Thanks a lot @plitzenberger for the feedback!

CC: @dostro, @jsamol, @sea-snake

frederikrothenberger commented 9 months ago

IMHO we should split the messages.

But we should also keep the permission scopes and session semantics, because they provide a robust basis for secure interactions, especially the fact that all granted permissions can be revoked and that they expire automatically after some time. By introducing more messages, we also have the opportunity to have permissions / scopes on a per-message level: i.e. the scope foo grants the permission for the foo JSON-RPC call, which would be a nice and intuitive design.
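The per-message scope idea can be illustrated with a minimal sketch, assuming that a scope name is simply the JSON-RPC method name and that the signer tracks an expiry time per grant. `ScopeStore`, `Grant` and all method names here are hypothetical and are not defined by ICRC-25.

```typescript
// Hypothetical sketch: one scope per JSON-RPC method, with expiry and revocation.
// None of these names come from the ICRC-25 standard.

interface Grant {
  scope: string; // assumed to equal the JSON-RPC method name
  expiresAt: number; // Unix time in milliseconds
}

class ScopeStore {
  private grants = new Map<string, Grant>();

  // Grant a scope for ttlMs milliseconds, counted from `now`.
  grant(scope: string, ttlMs: number, now: number): void {
    this.grants.set(scope, { scope, expiresAt: now + ttlMs });
  }

  // Revoke a scope immediately; returns true if it was granted before.
  revoke(scope: string): boolean {
    return this.grants.delete(scope);
  }

  // A method call is allowed only if its scope is granted and has not expired.
  isAllowed(method: string, now: number): boolean {
    const grant = this.grants.get(method);
    if (grant === undefined) return false;
    if (now >= grant.expiresAt) {
      this.grants.delete(method); // drop expired grants lazily
      return false;
    }
    return true;
  }
}
```

Passing `now` explicitly keeps the sketch deterministic; a real signer would more likely read the clock itself.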

However, we need to keep the user interactions in mind. So far, for n canister_call messages, I was assuming n+1 user prompts:

If we split the first message into multiple messages, do we get more prompts (i.e. one per message)? Or do we only have a single user interaction on the permission request, where the user grants the permission and chooses which information to attach to the scopes?

Example:

For me, if we go with the "many small and specific messages"-design, then only approach 2 seems to scale nicely. Otherwise, you might imagine getting tons of user prompts for get_principals, get_icrc_1_subaccounts, get_icrc_7_subaccounts, ...

What is your take on that @plitzenberger?

jsamol commented 9 months ago

I gave some thought to the concerns that were brought up yesterday, and now to what you said @frederikrothenberger, and I'm still not convinced we should split it 🤔

First of all, the scopes and whether the relying party can act on them can be imposed only in the signer. This means that once the relying party, assuming it got the get_permission scope granted first, gets the user's identities, there's nothing that stops it from continuing to use them even after the user has revoked that permission. Therefore, I would say that there would be scopes that are not revocable and would have to remain granted throughout the whole session lifetime (I would even argue here that their lifetime exceeds the session's lifetime, as "static" data, such as identity details, once shared remains shared).

I'm not sure yet whether we should treat revocable and non-revocable scopes differently at the granting level or whether they should just be properly documented in the standard specs, but it appears to me that, permission-wise, there's no benefit in having such non-revocable scopes as separate messages, because they will always be allowed to execute.

Secondly, let's talk about the user experience 😄 As a user, I expect such a standard to allow me to seamlessly connect a dapp with my wallet in a way that almost feels as if they are the same application and the dapp is an extension of my wallet. This means that, at least in most cases, I want to be able to see my account details and perform actions tailored to them in the dapp. This further means that in these cases the relying party will always send 2 messages, first to open the session and get permission and then to get the identity details. Now, depending on the transport used, this can involve network traffic and result in a potentially inconsistent and confusing state where, for example, the signer has received the permission request, the user selected the identities and approved the request, this went through, but the second request didn't, and now the dapp says the user has connected but the selected identity details are missing. What I'm trying to say here is that opening a session, defining its contract (asking for and granting permission) and getting the user's identities seem intertwined and would therefore be better executed as a single atomic operation.

For me, the revocable scopes should remain separate messages, but non-revocable ones could imply, and this is something that would apply to the standard extensions, that the requested data must be returned in the permission (open session) response. This would correctly suggest that its lifetime and validity match the session's lifetime and validity, and give the relying party access to it the moment the user has agreed to share it.

frederikrothenberger commented 9 months ago

Thanks for the input @jsamol! 🙂

I think we have to untangle 3 things going on here:

  1. What is a scope and what does revocation mean?
  2. How do we want to structure the messages?
  3. How are multiple consecutive messages processed?

1: Scopes and Revocation

> First of all, the scopes and whether the relying party can act on them can be imposed only in the signer. This means that once the relying party, assuming it got the get_permission scope granted first, gets the user's identities, there's nothing that stops it from continuing to use them even after the user has revoked that permission.

I would really take care to only use scopes for permissions and not for information. So if we do make icrc_25_get_principals a scope, then the scope should have the following semantics:

Introducing non-revocable scopes seems like a bad idea to me and will break the expectations of developers.


As an alternative, we can simply transfer the information about principals once when initiating the connection. I think in that case we should make that information a separate concept that is not a scope.

2: Message Structure

Essentially, here the question is whether we want one icrc25_initiate_connection message that bundles:

Or whether we want to split this into multiple messages:

Note that the number of messages is independent of how they are processed. See next point.

3: Processing of Multiple Messages

> What I'm trying to say here is that opening a session, defining its contract (asking for and granting permission) and getting the user's identities seem intertwined and would therefore be better executed as a single atomic operation.

We could have this, even with multiple messages, by using JSON-RPC message batches. I.e. send all of these messages (icrc25_permission, icrc25_get_principals, icrc27_get_subaccounts) and receive back the responses in a single batch. This would be a single atomic operation, where the later responses could also be NOT_GRANTED errors if the user never granted the scope for e.g. icrc27_get_subaccounts.
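A minimal sketch of such a batch, assuming the signer processes the entries in order and that the permission message carries a `scopes` array. The parameter shape, the `approve` callback standing in for the user prompt, and the numeric code of `NOT_GRANTED` are illustrative assumptions; only the JSON-RPC 2.0 envelope and the `-32601` "Method not found" code come from the JSON-RPC specification.

```typescript
// Hypothetical sketch of a signer answering a JSON-RPC 2.0 batch in order.

interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcError {
  code: number;
  message: string;
}

type RpcResponse =
  | { jsonrpc: "2.0"; id: number; result: unknown }
  | { jsonrpc: "2.0"; id: number; error: RpcError };

// The numeric code is made up for this sketch.
const NOT_GRANTED: RpcError = { code: 3000, message: "NOT_GRANTED" };

function processBatch(
  batch: RpcRequest[],
  handlers: Record<string, () => unknown>,
  approve: (requested: string[]) => string[], // stands in for the single user prompt
): RpcResponse[] {
  const granted = new Set<string>();
  return batch.map((req): RpcResponse => {
    if (req.method === "icrc25_permission") {
      // Assumed parameter shape: { scopes: string[] }.
      const requested = (req.params as { scopes: string[] }).scopes;
      for (const scope of approve(requested)) granted.add(scope);
      return { jsonrpc: "2.0", id: req.id, result: { scopes: Array.from(granted) } };
    }
    const handler = handlers[req.method];
    if (handler === undefined) {
      return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
    }
    if (!granted.has(req.method)) {
      return { jsonrpc: "2.0", id: req.id, error: NOT_GRANTED };
    }
    return { jsonrpc: "2.0", id: req.id, result: handler() };
  });
}
```

In-order processing is a choice made for this sketch: JSON-RPC 2.0 itself allows a server to process a batch in any order, which is why the batch semantics would need to be spelled out in the standard.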

We need to be careful though, to specify clear semantics of batch processing. E.g.:


This way, at least to me, it seems that we can unify the desire for specific messages with the requirement of "few" interactions that make sense from a user experience perspective. What do you think, @jsamol?

jsamol commented 9 months ago

1: Scopes and Revocation

> when revoked, the relying party can no longer call the icrc_25_get_principals method. In particular, the relying party cannot obtain a fresher version of that information.

So far we haven't specified any mechanism to change the identities during the session lifetime. The assumption was that a session gets opened for a set of identities initially chosen by the user, and to change it, the user has to open a new session from the relying party. This means that at the moment the standard doesn't anticipate that a method such as icrc25_get_principals could return different values throughout a single session's lifetime.

With that in mind, I think it could give a false sense of security if the permission to call such a method were revocable, as the relying party would keep access to data that the user thinks is no longer accessible.

We could think of a way to let the signer change the identities while keeping the session alive, but, most probably, it would be a rabbit hole I don't think we'd like to go down. For example, if the user changes the identities, they should be made aware of the granted permission scopes and be allowed to revoke some, otherwise they could mistakenly expose their identities to scopes they normally wouldn't. In such a case I think closing a previous session and opening a new one is safer, cleaner and just easier for both the user and developer.

> As an alternative, we can simply transfer the information about principals once when initiating the connection. I think in that case we should make that information a separate concept that is not a scope.

Yes, I don't mind coming up with a new concept name here. Maybe we should drop the scope and have method and data instead? The method would be the old scope, while the data would be a place where the relying party defines what I called the non-revocable scopes, i.e. the information it should receive at the beginning of the session and will remain valid for its lifetime.
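The method / data split could look roughly like the following sketch. The field names `methods` and `data`, the key names and the helper `answerPermissionRequest` are purely hypothetical; the discussion does not fix any concrete message shape.

```typescript
// Hypothetical sketch: the relying party asks for callable methods (revocable)
// and for data that is handed over once at session start (not revocable).

interface PermissionRequest {
  methods: string[]; // what used to be called scopes
  data: string[]; // information to receive once, valid for the session lifetime
}

interface PermissionResponse {
  methods: string[]; // methods the user approved
  data: Record<string, unknown>; // values the user agreed to share
}

function answerPermissionRequest(
  request: PermissionRequest,
  approvedMethods: Set<string>,
  sharedData: Record<string, unknown>,
): PermissionResponse {
  const methods = request.methods.filter((m) => approvedMethods.has(m));
  const data: Record<string, unknown> = {};
  for (const key of request.data) {
    // Only return entries that were requested and that the user chose to share.
    if (Object.prototype.hasOwnProperty.call(sharedData, key)) {
      data[key] = sharedData[key];
    }
  }
  return { methods, data };
}
```

Revoking a method later would only affect `methods`; whatever was returned in `data` has already left the signer.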

3: Processing of Multiple Messages

Ah true, JSON-RPC batching sounds like a remedy to my network concerns 😄

2: Message Structure

To me the identity details simply belong with the permissions. In the end the idea is that the user selects the identities they want to share with the relying party and decides to what extent the relying party can use them further. Separating these two pieces of information feels, in my opinion, kind of artificial and redundant, almost as if we split the public key and the challenge 🤔 And by identity details I mean not only the principals as defined by ICRC-25 but also any extension that could be defined later. This means that I'd rather see extensions to the ICRC-25 permission request than have separate methods to get additional details.

I do see the structural benefit of having these messages separated, though. I'm just not a fan of the idea of having messages that return incomplete information and lack context.

jsamol commented 9 months ago

Well, maybe I could be convinced if we stopped identifying the permission request with opening the session and had the session open request defined in the standard as a batch of the permission request call, ICRC-25 get principals/public keys and optional extensions for identity (I'm not sure if that was the intention from the beginning). In the end we would get a complete response, just scattered between inner responses.

But again, it feels like forcefully splitting something that could have been the same message and exposing messages that don't necessarily make sense on their own (I'm looking at you, permission request; the icrcXX_get_XX methods for identity details could include scopes in their responses, which at least would make them complete without any external context).

sea-snake commented 9 months ago

Looking at the multiple RPC messages that would be sent:

icrc25_initiate_connection and icrc25_permission seem to overlap, and as far as I understand, the icrc25_initiate_connection message is an alternative approach to a batch of e.g. icrc25_permission and icrc25_get_principals.

I would lean towards using the RPC batch standard with multiple messages over having a message that does many things at once. I don't think we should consider the wallet connection in this spec as something stateful like a session that's started and closed; that's the concern of the transport layer, not of the messages sent within. Instead I would see the relying party and wallet as two actors that communicate with each other, where both keep a state indefinitely until one actor asks the other actor to dispose of this state. The transport layer is responsible for identifying the actors between both ends, e.g. session token, origin etc.

From that perspective I would say:

Overall I think behavior should probably be up to the wallet UX, e.g. whether to return a list of all subaccounts or let the user select the subaccounts.

To keep it possible to do all user interaction in a single batch, I would say that the spec does define the required permission scopes for the RPC methods. If a wallet decides not to require permission for e.g. getting identities, it can simply implement the permission as something that is automatically approved.

All methods can be repeated. Depending on the wallet state and UX, the wallet can decide whether user interaction is needed or whether e.g. the previous selection can be sent.

Whether the wallet connection was made just now or a week ago and all permissions have been given doesn't matter. The methods behave the same; the response depends on the wallet state and UX.

As for data returned in earlier RPC calls while permissions have since been revoked, this data has left the wallet and can no longer be revoked. But we should probably think about permission dependencies: you wouldn't want a relying party to make a canister call for an identity that has been revoked by the wallet, or for any identity if the whole identity scope has been revoked. So we should make sure to define errors for these scenarios.
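The dependency between the call permission and the identity permission can be sketched as a check the wallet runs before executing a canister call. The scope names, error names and the `WalletState` shape are assumptions made for illustration; the standard would have to define the actual errors.

```typescript
// Hypothetical sketch: reject a canister call when the call scope, the identity
// scope, or the specific identity is no longer granted.

interface WalletState {
  grantedScopes: Set<string>; // e.g. "canister_call", "identities" (made-up names)
  sharedPrincipals: Set<string>; // identities currently shared with the relying party
}

type CallCheck =
  | { ok: true }
  | { ok: false; error: "CALL_SCOPE_NOT_GRANTED" | "IDENTITY_SCOPE_NOT_GRANTED" | "IDENTITY_NOT_GRANTED" };

function checkCanisterCall(state: WalletState, sender: string): CallCheck {
  if (!state.grantedScopes.has("canister_call")) {
    return { ok: false, error: "CALL_SCOPE_NOT_GRANTED" };
  }
  // If the whole identity scope was revoked, no identity may be used for calls.
  if (!state.grantedScopes.has("identities")) {
    return { ok: false, error: "IDENTITY_SCOPE_NOT_GRANTED" };
  }
  // The specific identity may have been revoked on its own.
  if (!state.sharedPrincipals.has(sender)) {
    return { ok: false, error: "IDENTITY_NOT_GRANTED" };
  }
  return { ok: true };
}
```

Distinct error values let the relying party tell apart "ask for the scope again" from "this identity is no longer available".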

frederikrothenberger commented 9 months ago

Thanks a lot for the feedback!

Ok, it seems that we are homing in on a solution:

Proposed changes to ICRC-25:

Proposed changes to ICRC-27:

> Requires permission subaccounts scope (is subaccounts a sub scope of identity?)

I would really like to keep things simple for now and not have relations between scopes. I would leave it up to the signer how to handle revocation (i.e. the signer could bundle multiple scopes in the revocation UI / or simply just offer an "end session" button to the user).

I'll prepare a PR with the proposed changes to ICRC-25, so that we have a more concrete basis to continue the discussion.

frederikrothenberger commented 8 months ago

The changes have been merged in #79.