dfinity / wg-identity-authentication

Repository of the Identity and Wallet Standards Working Group

https://wiki.internetcomputer.org/wiki/Identity_%26_Authentication

Apache License 2.0


ICRC-25: Restructuring of the `permission` request #65

Closed frederikrothenberger closed 1 year ago

frederikrothenberger commented 1 year ago

In the working group session on October 3 (recording), it was suggested by @plitzenberger to restructure the ICRC-25 permission message. The feedback was that all messages should only have a single functionality (e.g. similar to the way Ethereum JSON-RPC APIs are designed).

Currently, the permission request:

asks for permissions
exchanges information about the identities managed by the signer
extensions (such as ICRC-27) would add additional information to the same message

The alternative approach is:

only handle permissions in the permission request
introduce a separate message to exchange information about the identities
extensions should not extend existing messages but introduce new messages.

This issue should be used to discuss how we want to proceed with the messages and which set of messages should end up in the ICRC-25 standard.

Thanks a lot @plitzenberger for the feedback!

CC: @dostro, @jsamol, @sea-snake

frederikrothenberger commented 1 year ago

IMHO we should split the messages.

But we should also keep the permission scopes and session semantics, because they provide a robust basis for secure interactions, especially the fact that all granted permissions can be revoked and that they expire automatically after some time. By introducing more messages, we also have the opportunity to have permissions / scopes on a per-message level: i.e. the scope foo grants the permission for the foo JSON-RPC call, which would be a nice and intuitive design.

However, we need to keep the user interactions in mind. So far, for n canister_call messages, I was assuming n+1 user prompts:

One to approve the connection and to select the identities (and potentially subaccounts)
One per canister call, to approve individually

If we split the first message into multiple messages, do we get more prompts (i.e. one per message)? Or do we only have a single user interaction on the permission request, where the user grants the permission and chooses which information to attach to the scopes?

Example:

Variant 1: The relying party asks for the get_principals scope. The user grants the permission on the initial connection screen. Later, when the relying party calls get_principals, the user gets prompted to select the identities they wish to share.
Variant 2: The relying party asks for the get_principals scope. The user immediately selects which identities they wish to attach to that scope for the duration of the session. Later, when the relying party calls get_principals, the signer returns the preselected identities without prompting the user.
Variant 3: Leave the choice between variant 1 and 2 open (i.e. not specified by ICRC-25) and make it an implementation detail of the signer.

For me, if we go with the "many small and specific messages"-design, then only approach 2 seems to scale nicely. Otherwise, you might imagine getting tons of user prompts for get_principals, get_icrc_1_subaccounts, get_icrc_7_subaccounts, ...
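Variant 2 can be sketched roughly as follows. This is only an illustration: the `Signer` class, the `PromptUser` callback, and the `promptCount` counter are hypothetical names and are not part of ICRC-25. The point is that the single prompt happens on the permission request, and later `get_principals` calls are answered from the stored selection.

```typescript
// Hypothetical sketch of variant 2; class, method, and callback names are
// assumptions for illustration and do not come from the ICRC-25 draft.
type Principal = string;

// Callback that shows the connection screen once and returns the identities
// the user chose to attach to the requested scope.
type PromptUser = (scope: string) => Principal[];

class Signer {
  private selected = new Map<string, Principal[]>();
  private promptUser: PromptUser;
  promptCount = 0;

  constructor(promptUser: PromptUser) {
    this.promptUser = promptUser;
  }

  // Permission request: the single user interaction. The user grants the
  // scope and immediately picks the identities bound to it for the session.
  requestPermission(scope: string): void {
    this.promptCount += 1;
    this.selected.set(scope, this.promptUser(scope));
  }

  // Later call: answered from the preselected identities, without a prompt.
  getPrincipals(): Principal[] {
    const ids = this.selected.get("get_principals");
    if (ids === undefined) {
      throw new Error("scope get_principals not granted");
    }
    return ids;
  }
}
```

With this shape, adding further `get_*` messages does not add prompts, as long as each one reads from a selection made during the permission request.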

What is your take on that @plitzenberger?

jsamol commented 1 year ago

I gave some thoughts to the concerns that were brought up yesterday and now to what you said @frederikrothenberger and I'm still not convinced we should split it 🤔

First of all, the scopes and whether the relying party can act on them can be imposed only in the signer. This means that once the relying party, assuming it got the get_permission scope granted first, gets the user's identities, there's nothing that stops it from continuing to use them even after the user has revoked that permission. Therefore, I would say that there would be scopes that are not revocable and would have to remain granted throughout the whole session lifetime (I would even argue here that their lifetime exceeds the session's lifetime, as "static" data, such as identity details, once shared remains shared).

I'm not sure yet whether we should treat revocable and non-revocable scopes differently on the granting level or whether they should just be properly documented in the standard specs, but it appears to me that permission-wise there's no benefit in having such non-revocable scopes as separate messages, because they will always be allowed to execute.

Secondly, let's talk about the user experience 😄 As a user, I expect such a standard to allow me to seamlessly connect a dapp with my wallet in a way that it almost feels like they are the same application and the dapp is an extension of my wallet. This means that, at least in most cases, I want to be able to see my account details and perform actions tailored to them in the dapp. This further means that in these cases the relying party will always send 2 messages, first to open the session and get permission and then to get the identity details. Now, depending on the transport used, this can involve network traffic and result in a potentially inconsistent and confusing state where, for example, the signer has received the permission request, the user selected the identities and approved the request, this went through, but the second request didn't and now the dapp says the user has connected but the selected identity details are missing. What I'm trying to say here is that opening a session, defining its contract (asking for and granting permission) and getting the user's identities seem intertwined and would therefore be better executed as a single atomic operation.

For me, the revocable scopes should remain separate messages but non-revocable ones could imply, and this is something that would apply to the standard extensions, that the requested data must be returned in the permission (open session) response. This would correctly suggest that its lifetime and validity match the session's lifetime and validity and give the relying party access to it the moment the user has agreed on sharing it.

frederikrothenberger commented 1 year ago

Thanks for the input @jsamol! 🙂

I think we have to untangle 3 things going on here:

What is a scope and what does revocation mean?
How do we want to structure the messages?
How are multiple consecutive messages processed?

1: Scopes and Revocation

First of all, the scopes and whether the relying party can act on them can be imposed only in the signer. This means that once the relying party, assuming it got the get_permission scope granted first, gets the user's identities, there's nothing that stops it from continuing to use them even after the user has revoked that permission.

I would really take care to only use scopes for permissions and not for information. So if we do make icrc_25_get_principals a scope, then the scope should have the following semantics:

when granted, the relying party has the permission to call the icrc_25_get_principals method. The relying party can do so repeatedly.
when revoked, the relying party can no longer call the icrc_25_get_principals method. In particular, the relying party cannot obtain a fresher version of that information.
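These semantics can be illustrated with a small sketch. The `ScopeRegistry` class and the error text are assumptions made for this example only. Note what the check does and does not cover: it gates future calls to the method, and it has no effect on data the relying party already received.

```typescript
// Hypothetical sketch; the class and the error text are illustrative
// assumptions, not part of the ICRC-25 draft.
class ScopeRegistry {
  private granted = new Set<string>();

  grant(scope: string): void {
    this.granted.add(scope);
  }

  revoke(scope: string): void {
    this.granted.delete(scope);
  }

  // A scope is a permission to call the method of the same name.
  // While granted, the method may be called repeatedly; once revoked,
  // the relying party can no longer obtain a fresher version of the data.
  call<T>(method: string, handler: () => T): T {
    if (!this.granted.has(method)) {
      throw new Error(`NOT_GRANTED: ${method}`);
    }
    return handler();
  }
}
```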

Introducing non-revocable scopes seems like a bad idea to me and will break with expectations of developers.

As an alternative, we can simply transfer the information about principals once when initiating the connection. I think in that case we should make that information a separate concept that is not a scope.

2: Message Structure

Essentially, here the question is whether we want one icrc25_initiate_connection message that bundles:

permissions / scopes
information about various things (unbounded because the standard needs to be extensible)

Or whether we want to split this into multiple messages:

one to ask for permissions / scopes
many subsequent messages to ask for specific things, initiate specific actions, etc. Examples:
1. icrc25_permission
2. icrc25_get_principals
3. icrc27_get_subaccounts

Note that the number of messages is independent of how they are processed. See next point.

3: Processing of Multiple Messages

What I'm trying to say here is that opening a session, defining its contract (asking for and granting permission) and getting the user's identities seem intertwined and would therefore be better executed as a single atomic operation.

We could have this, even with multiple messages, by using JSON-RPC message batches. I.e. send all of these messages (icrc25_permission, icrc25_get_principals, icrc27_get_subaccounts) and receive back the responses in a single batch. This would be a single atomic operation, where the later responses could also be NOT_GRANTED errors if the user never granted the scope for e.g. icrc27_get_subaccounts.

We need to be careful though, to specify clear semantics of batch processing. E.g.:

the messages are processed sequentially in order of message id
the messages need to be "batchable", i.e. not rely on the output of other messages you would like to process in the same batch. So for icrc27_get_subaccounts it would be better to not take a principal as an argument, but rather return all subaccounts of the principals that were granted access to in this ICRC-25 session.
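The batch semantics described above could look roughly like this on the signer side. Everything beyond the method names is an assumption for illustration: the `scopes` parameter shape, the numeric `NOT_GRANTED` code, the `userApproves` callback, and the empty placeholder results are not taken from the standard.

```typescript
// Hypothetical sketch of batch handling in a signer. The error code, the
// `scopes` parameter shape, and the placeholder results are assumptions.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { scopes?: string[] };
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

const NOT_GRANTED = 3000; // placeholder code, not taken from the standard

function processBatch(
  batch: RpcRequest[],
  userApproves: (scope: string) => boolean,
): RpcResponse[] {
  const granted = new Set<string>();
  // Process sequentially in order of message id, so a permission request
  // with a lower id takes effect before the calls that depend on it.
  const ordered = [...batch].sort((a, b) => a.id - b.id);
  return ordered.map((req): RpcResponse => {
    if (req.method === "icrc25_permission") {
      for (const scope of req.params?.scopes ?? []) {
        if (userApproves(scope)) granted.add(scope);
      }
      return { jsonrpc: "2.0", id: req.id, result: { scopes: Array.from(granted) } };
    }
    if (!granted.has(req.method)) {
      return {
        jsonrpc: "2.0",
        id: req.id,
        error: { code: NOT_GRANTED, message: "NOT_GRANTED" },
      };
    }
    // Batchable methods take no arguments derived from earlier responses,
    // so they can be answered from the session state alone.
    return { jsonrpc: "2.0", id: req.id, result: [] };
  });
}
```

Because no request depends on the output of another, the relying party can assemble the whole batch up front, and a declined scope simply surfaces as an error in the corresponding response.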

This way, at least to me, it seems that we can unify the desire for specific messages with the requirement of "few" interactions that make sense from a user experience perspective. What do you think, @jsamol?

jsamol commented 1 year ago

1: Scopes and Revocation

when revoked, the relying party can no longer call the icrc_25_get_principals method. In particular, the relying party cannot obtain a fresher version of that information.

So far we haven't specified any mechanism to change the identities during the session lifetime. The assumption was that a session gets opened for a set of identities initially chosen by the user, and to change it, the user has to open a new session from the relying party. This means that at the moment the standard doesn't anticipate that a method such as icrc25_get_principals could return different values throughout a single session's lifetime.

Having that in mind, I think it could give a false sense of security if permission to call such a method was revocable as the relying party keeps having access to data which the user thinks can no longer be accessible.

We could think of a way to let the signer change the identities while keeping the session alive, but, most probably, it would be a rabbit hole I don't think we'd like to go down. For example, if the user changes the identities, they should be made aware of the granted permission scopes and be allowed to revoke some, otherwise they could mistakenly expose their identities to scopes they normally wouldn't. In such a case I think closing a previous session and opening a new one is safer, cleaner and just easier for both the user and developer.

As an alternative, we can simply transfer the information about principals once when initiating the connection. I think in that case we should make that information a separate concept that is not a scope.

Yes, I don't mind coming up with a new concept name here. Maybe we should drop the scope and have method and data instead? The method would be the old scope, while the data would be a place where the relying party defines what I called the non-revocable scopes, i.e. the information it should receive at the beginning of the session and will remain valid for its lifetime.

3: Processing of Multiple Messages

Ah true, JSON-RPC batching sounds like a remedy to my network concerns 😄

2: Message Structure

To me the identity details simply belong with the permissions. In the end the idea is that the user selects the identities they want to share with the relying party and decides to what extent the relying party can use them further. Separating these two pieces of information feels, in my opinion, kind of artificial and redundant, almost as if we split the public key and the challenge 🤔 By identity details I mean not only the principals as defined by ICRC-25 but also any extension that could be defined later. This means that I'd rather see extensions to the ICRC-25 permission request than have separate methods to get additional details.

I do see the structural benefit of having these messages separated, though. I'm just not a fan of the idea of having messages that return incomplete information and lack context.

jsamol commented 1 year ago

Well, maybe if we stopped identifying the permission request with opening the session and had the session open request defined in the standard as a batch of the permission request call, ICRC-25 get principals/public keys and optional extensions for identity (I'm not sure if that was the intention from the beginning), I could be convinced. In the end we would get a complete response, just scattered between inner responses.

But again, it feels like forcefully splitting something that could have been the same message and exposing messages that don't necessarily make sense on their own (I'm looking at you, permission request, the icrcXX_get_XX methods for identity details could include scopes in their responses which at least would make them complete without any external context).

sea-snake commented 1 year ago

Looking at the multiple RPC messages that would be sent:

icrc25_initiate_connection
icrc25_permission
icrc25_get_principals
icrc27_get_subaccounts

icrc25_initiate_connection and icrc25_permission seem to overlap, and as far as I understand, the icrc25_initiate_connection message is an alternative approach to a batch of e.g. icrc25_permission and icrc25_get_principals.

I would lean towards using the RPC batch standard with multiple messages over having a message that does many things at once. I don't think we should consider the wallet connection in this spec as something stateful like a session that's started and closed; that's the concern of the transport layer, not of the messages sent within. Instead I would see the relying party and wallet as two actors that communicate with each other, where both keep a state indefinitely until one actor asks the other actor to dispose of this state. The transport layer is responsible for identifying the actors between both ends, e.g. session token, origin etc.

From that perspective I would say:

icrc25_request_permission
- Request permission for certain scopes (RPC methods)
- Returns scopes allowed
- Can be repeated to ask for additional permissions at a later point in time
- Isn't required to connect to the wallet, some RPC methods might not require a permissions scope
icrc25_get_identities
- I would probably call it identities over principals since it implies the purpose of the principal ids instead of data type.
- Requires an approved permission, not sure if this is something up to the wallet to decide or spec.
- Requests list of all identity principal ids that the user wants to share in the wallet.
- Returns identity principals and challenge signatures.
- Can be repeated to ask for up to date list.
- Wallet can return list of previously selected identities by user so no user interaction is required on the repeated request but this behavior is probably up to the wallet UX design to decide.
- A relying party could set an expiry for a previously sent challenge in its own data storage and use this repeated request to get a signature for a new challenge. Example use case: verify you still have access to this identity.
icrc27_get_icrc1_subaccounts
- Request list of ICRC-1 subaccounts for above identities
- Make account standard explicit in method name?
- Requires permission to identity scope
- Requires permission subaccounts scope (is subaccounts a sub scope of identity?)
- Can be repeated to ask for up to date list.
- Wallet can return list of previously selected subaccounts by user so no user interaction is required on the repeated request but this behavior is probably up to the wallet UX design to decide.

Overall I think behavior should probably be up to the wallet UX, e.g. whether to return a list of all subaccounts or let the user select the subaccounts.

To keep it possible to do all user interaction in a single batch, I would say that the spec does define the required permission scopes for the RPC methods. If a wallet decided to not want to require permission for e.g. getting identities, it can simply implement the permission as something that is automatically approved.

All methods can be repeated. Depending on the wallet state and UX, the wallet can decide whether user interaction is needed or whether e.g. a previous selection can be sent.

Whether the wallet connection was made just now or a week ago, with all permissions already given, doesn't matter. The methods behave the same; the response depends on the wallet state and UX.

As for data returned in earlier RPC calls after permissions have been revoked, this data has left the wallet and can no longer be revoked. But we should probably think about permission dependencies: you wouldn't want a relying party to make a canister call for an identity that has been revoked by the wallet, or for any identity if the whole identity scope has been revoked. So we should make sure to define errors for these scenarios.
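One way to picture these permission dependencies is a check the wallet runs before executing a canister call. This is a sketch under assumptions: the `WalletState` shape and the two error names are hypothetical, and the scope names are borrowed from the method names used in this thread.

```typescript
// Hypothetical sketch; the state shape and error names are assumptions
// and are not defined by ICRC-25.
interface WalletState {
  grantedScopes: Set<string>;
  sharedIdentities: Set<string>;
}

type CallError = "SCOPE_NOT_GRANTED" | "IDENTITY_NOT_SHARED";

// Returns null when the call may proceed, otherwise the reason it may not.
function checkCanisterCall(state: WalletState, sender: string): CallError | null {
  if (!state.grantedScopes.has("icrc25_canister_call")) {
    return "SCOPE_NOT_GRANTED";
  }
  // Dependency: a call is only valid for an identity that is still shared,
  // which in turn requires the identity scope itself to still be granted.
  if (
    !state.grantedScopes.has("icrc25_get_identities") ||
    !state.sharedIdentities.has(sender)
  ) {
    return "IDENTITY_NOT_SHARED";
  }
  return null;
}
```

Revoking a single identity and revoking the whole identity scope both end up in the same error here; a standard could equally choose to distinguish the two cases.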

frederikrothenberger commented 1 year ago

Thanks a lot for the feedback!

Ok, it seems that we are homing in on a solution:

Proposed changes to ICRC-25:

define messages
- icrc25_request_permission
- icrc25_revoke_permission
- icrc25_get_identities
- icrc25_canister_call
define batch semantics
- This lets a batch of [icrc25_request_permission, icrc25_get_identities] be a single interaction like the permission request is in the current ICRC-25 draft
Define scope to be the name of a JSON-RPC method. Granting a specific scope gives the RP permission to call that JSON-RPC method.
Mandate user approval for each scope at least once. However, a wallet is free to choose when to ask for user approval (and whether to do so repeatedly). I.e. a wallet may ask for user approval on icrc25_request_permission (to select the scopes) and on icrc25_get_identities (to select identities), or only prompt the user on icrc25_request_permission to select both scopes and identities and handle icrc25_get_identities without user approval.
- Exception: the call to icrc25_canister_call must always be approved by the user.
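From the relying party's side, the connect batch proposed above might be assembled as in the sketch below. The method names come from the proposal; the `scopes` parameter shape and the helper names `buildConnectBatch` and `responseFor` are assumptions. JSON-RPC 2.0 permits batch responses to arrive in any order, so the sketch matches them to requests by `id`.

```typescript
// Hypothetical relying-party sketch. Method names follow the proposal;
// parameter shapes and helper names are assumptions.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// One batch, one user interaction: ask for scopes and identities together.
function buildConnectBatch(scopes: string[]): RpcRequest[] {
  return [
    { jsonrpc: "2.0", id: 1, method: "icrc25_request_permission", params: { scopes } },
    { jsonrpc: "2.0", id: 2, method: "icrc25_get_identities" },
  ];
}

// Batch responses may be returned in any order, so look them up by id.
function responseFor(responses: RpcResponse[], id: number): RpcResponse | undefined {
  return responses.find((r) => r.id === id);
}
```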

Proposed changes to ICRC-27:

define messages:
- icrc27_get_icrc1_(sub)accounts (and maybe icrc27_(create|register)_icrc1_(sub)account to let relying parties request the creation of a new subaccount / let the signer know about the existence of a specific subaccount).
- maybe calling it icrc27_get_icrc1_accounts is cleaner, because it untangles it from icrc25_get_identities, since the response is then self-contained: i.e. each account is just the tuple (ledger canister id, principal, subaccount)
The batch of [icrc25_request_permission, icrc25_get_identities, icrc27_get_icrc1_subaccounts] is a single interaction like the extended permission request in the current ICRC-27 draft
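The self-contained account tuple mentioned above could be typed as in this sketch. The interface and field names are assumptions for illustration. The 32-byte length check reflects how ICRC-1 defines subaccounts, where an absent subaccount refers to the default one.

```typescript
// Hypothetical sketch; the interface and field names are assumptions.
// ICRC-1 subaccounts are 32-byte blobs; an absent subaccount means the
// default subaccount.
interface Icrc1AccountRef {
  ledgerCanisterId: string; // textual principal of the ledger canister
  owner: string; // textual principal of the account owner
  subaccount?: Uint8Array; // 32 bytes when present
}

// Each entry carries everything needed to address the account, so a
// response needs no context from an earlier identities response.
function isWellFormed(account: Icrc1AccountRef): boolean {
  return account.subaccount === undefined || account.subaccount.length === 32;
}
```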

Requires permission subaccounts scope (is subaccounts a sub scope of identity?)

I would really like to keep things simple for now and not have relations between scopes. I would leave it up to the signer how to handle revocation (i.e. the signer could bundle multiple scopes in the revocation UI / or simply just offer an "end session" button to the user).

I'll prepare a PR with the proposed changes to ICRC-25, so that we have a more concrete basis to continue the discussion.

frederikrothenberger commented 1 year ago

The changes have been merged in #79.
