dfinity / wg-identity-authentication

Repository of the Identity and Wallet Standards Working Group
https://wiki.internetcomputer.org/wiki/Identity_%26_Authentication
Apache License 2.0

add icrc-35 #118

Closed seniorjoinu closed 5 months ago

seniorjoinu commented 6 months ago

Read the rationale first, then read the specification itself. Reference implementation is available here.

Tracking issue for discussions is here.

seniorjoinu commented 6 months ago

The rationale is not intended for merging; it is here just so you can find it easily. Let me know once everybody is familiar with it and I'll remove it from the PR.

frederikrothenberger commented 6 months ago

Hi @seniorjoinu

Thanks a lot for this contribution! I've read the documents with great interest. 🙂

I've looked at the proposed solution and it seems to me that it has quite a few similarities with ICRC-25:

If we replace the word signer with service provider, ICRC-25 and ICRC-35 are very close.

Looking at your perfect scenario, this could absolutely be implemented as an ICRC-25 extension that defines the appropriate file methods. Note that generic canister calls have been split into ICRC-33.

What ICRC-25 does (after the restructuring) is to define a standard way of establishing a connection, managing permissions (which ICRC-35 will need as well) and defining a baseline requirement for the communication channel.

So my question is: what does ICRC-35 offer over ICRC-25 with a service specific extension (maybe apart from the term signer)?

Btw. I don't quite understand the criticism here regarding the permission messages. Could you elaborate why ICRC-21 is not sufficient? It offers exactly that: a human-readable and easily understandable, service-specific consent message.


Could you also elaborate a bit on the interoperability story? The perfect scenario gives a vision of a SC that has integrated with two specific SPs (Drive.ic and Dropbox.ic). This means, it is not solely a user-choice which SPs are allowed. I.e. if I use a third (different file storage solution, or even a self-hosted one) I cannot use it unless the SC specifically supports it.

To allow an SC to integrate with any user-provided SP the interactions need to be standardized. But the proposed interactions (by looking at the example) are very service specific. I.e. calling another service requires the SC to know the exact routes and parameters. It seems to me that having true interoperability between SCs and SPs is not possible with the proposed solution, or will require a huge amount of standardization...

This is why we took a more generic approach with ICRC-33: it allows calling services that the signer does not know and does not have a specific integration with.


I'm very happy that you are joining the discussion and I think we should try to unify the efforts. There seems to be enough overlap for that between ICRC-25 and ICRC-35.

As for the details of the spec, I have a few more questions:

seniorjoinu commented 6 months ago

Hey @frederikrothenberger, Thanks for taking the time and reading through the docs!

If we replace the word signer with service provider, ICRC-25 and ICRC-35 are very close.

I don't see that. I'm unable to read that from the docs of ICRC-25, since they are about sessions, scopes and permissions (and how they allow better interoperability), while ICRC-35 docs are about the browser-based communication channel, which comes before sessions, scopes and permissions (and the documents state that we actually don't need these things in order to achieve what we want). Is there another version of the ICRC-25 spec, which is not in master? Maybe I'm just looking in the wrong place.

I do see a lot of similarities between ICRC-35 and ICRC-29 though. But in that comparison, I believe, ICRC-35 offers a much deeper and better-described specification of a transport channel than ICRC-29.

So my question is: what does ICRC-35 offer over ICRC-25 with a service specific extension (maybe apart from the term signer)?

Considering what I mentioned above, they are two completely different approaches to the same problem. ICRC-25 offers interoperability through canister calls, while ICRC-35 offers interoperability through webpage calls. What I see as a major problem of ICRC-25, and why I propose this specification as an alternative, is that ICRC-25 allows dapps to access the user's identities from other dapps. In other words, it allows users to manually select identities they want to use to call a particular canister method. I recognize this as an approach similar to what is currently adopted by other Web3 platforms and wallets, such as MetaMask. And this is why I believe this approach is prone to signature-stealing attacks, since it simply inherits this problem from those wallets.

Yes, ICRC-25 does a much better job of preventing these kinds of attacks via sessions, scopes and permissions (compared to MetaMask, for example), but it does not eliminate them. No matter how hard we try to improve it, there will always be a theoretical possibility of executing such an attack successfully. After a lot of thought about why we can't eliminate them, I came to the conclusion that the key reason is the power it gives to an end user. With great power comes great responsibility. And for some fraction of people (technically and mentally prepared to always stay alert to possible scams) this is an appropriate solution, since they can handle the responsibility. But this is not an appropriate solution for mass adoption, because regular folks are not technically prepared and don't have the capacity to always stay focused while surfing online.

So, while some people might see the unlimited power offered to an end user of ICRC-25 as a useful feature, I see it as a security flaw. Hackers (scammers) attack people much more often than computer systems. This makes me think that giving so much unregulated power to regular folks will only do harm. Which means, that people should not be able to manually select identities to call canisters, and they don't have to know about any of that under-the-hood stuff at all.

This is why I also have doubts about the UX capabilities of ICRC-25 and related specs:

Btw. I don't quite understand the criticism here regarding the permission messages. Could you elaborate why ICRC-21 is not sufficient? It offers exactly that: a human-readable and easily understandable, service-specific consent message.

Under-the-hood stuff, when presented to the end user, breaks the abstraction. A broken abstraction raises questions, which overheat the brain. An overheated brain loses focus and gets scammed.

ICRC-35 offers a different integration mechanism, which preserves the abstraction, because it does not rely on canister APIs. Instead, it relies on webpage APIs - a new type of API, which can be used by one webapp to execute user-local actions on another webapp. ICRC-35 describes the protocol and mechanics for defining and utilizing such webpage APIs properly. These APIs do not require user authorization (signing a request), because they happen locally in-browser and only exist in the context of the current user. This means that these webpage APIs can be invoked without any assistance from a signer party.

To allow an SC to integrate with any user-provided SP the interactions need to be standardized. But the proposed interactions (by looking at the example) are very service specific. I.e. calling another service requires the SC to know the exact routes and parameters.

Webpage APIs hide all the internal complexities from their clients: if a business action requires 50 different canister calls (which can theoretically happen, but I'm obviously using this number to exaggerate), the developer on the other side is no longer required to learn this complexity - they can just execute a single webpage API and be sure that everything will go as expected. Webpage APIs solve the signature stealing vulnerability completely, because the user can now use a completely isolated set of identities for each website they communicate with (how II currently works). Moreover, ICRC-35 offers service providers much better control over the integration surface, allowing them to explicitly define APIs for allowed actions, while preventing disallowed actions (which might break user data, for example) from happening.
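To make the "single webpage API hides many internal calls" idea concrete, here is a minimal sketch of a service provider exposing explicit routes. The `WebpageApi` class, the `files/save` route and the internal step names are all hypothetical and are not taken from the ICRC-35 specification; the sketch only shows that a client calls one route, while routes the provider never defined are rejected.

```typescript
// Hypothetical sketch: none of these names come from the ICRC-35 spec.
type Handler = (payload: unknown) => unknown;

// A service provider registers the routes it is willing to expose.
class WebpageApi {
  private readonly routes = new Map<string, Handler>();

  define(route: string, handler: Handler): void {
    this.routes.set(route, handler);
  }

  // Clients can only reach explicitly defined routes.
  call(route: string, payload: unknown): unknown {
    const handler = this.routes.get(route);
    if (handler === undefined) {
      throw new Error(`Route not allowed: ${route}`);
    }
    return handler(payload);
  }
}

// Stand-in for the canister calls the provider performs internally.
const internalCalls: string[] = [];

const driveApi = new WebpageApi();
driveApi.define("files/save", (payload) => {
  // The client never sees these steps; it only calls "files/save".
  internalCalls.push("reserve_space", "upload_chunks", "commit_file");
  return { saved: true, name: String(payload) };
});
```

A client would call `driveApi.call("files/save", "cat.png")` and never learn about the three internal steps, while a call to any route the provider did not define throws.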

So, answering your original question once again - ICRC-35 offers you a completely different vision of the future than ICRC-25.

Could you also elaborate a bit on the interoperability story? The perfect scenario gives a vision of a SC that has integrated with two specific SPs (Drive.ic and Dropbox.ic). This means, it is not solely a user-choice which SPs are allowed. I.e. if I use a third (different file storage solution, or even a self-hosted one) I cannot use it unless the SC specifically supports it.

Yes, you're right. As was already mentioned, ICRC-35 limits the power of an end user, simultaneously limiting the responsibility. The perfect scenario is purely synthetic and is only there to illustrate the end goal of this work. You're raising real-world questions, which do not apply to this scenario. So let's infuse a little bit of real life into the scenario as well, in order to answer those questions.

Let's imagine there exist three real dapps deployed to the IC: Photoshop.ic, Dropbox.ic and Drive.ic. Photoshop.ic integrates with the other two the same way as described in the rationale doc. And let's imagine there is also a third data storage provider, which Photoshop.ic, for some reason, does not integrate with - Disk.ic. I see three main reasons for Photoshop.ic to not integrate with Disk.ic:

  1. (indifferent) They don't know about this SP yet. In that case the solution is simple. Photoshop.ic is IC DAO-governed software. If the DAO decides to integrate with this SP, the devs will execute. Users who have subscriptions to Disk.ic and who want to use it as their data provider should just create and vote on a proposal to do that.
  2. (malicious SP) They know about this SP, but they believe it is malicious. In that case, Photoshop.ic is doing a great job protecting their users from malicious software, by not letting them use it with their own software. I don't see a problem here. The Photoshop.ic team should just publish some kind of official report and in general be transparent about their decisions.
  3. (malicious SC) They know about this SP, but they have their corrupted motives to not integrate with it. This case is the most interesting one and it requires us to learn more about this new Web3 world. All of our dapps are web-based and DAO-governed. It means that their software is 100% open-source. It means that there is no technical know-how component in their success - every know-how gets published and re-used in hundreds of other dapps as soon as it appears. This means that their success is driven purely by the loyalty of their communities. It means that these dapps are motivated to always keep their communities happy - otherwise the communities will switch to a "clone" that offers better conditions. And if this Disk.ic is good and Photoshop.ic's users want it integrated, the team has to obey, to keep their users happy and their business successful. Otherwise, some other Better-Photoshop.ic will appear, which will copy the codebase, fulfilling all the wishes of the community and stealing the userbase, because it is very cheap to spin up a clone (especially in a world where you only have to deliver your own business logic and everything else is available through integrations).

So, answering your question. Even if such a situation happens, so there is something the user can't control directly, they can still control it indirectly, via other tools offered by the Web3.

You did not use JSON-RPC for the messages. Is there a specific reason for that?

Yes. JSON-RPC only allows JSON types: string, number, boolean, null, array and object. A low-level transport protocol like ICRC-35 should provide sufficient performance for real-time applications, so byte arrays (Uint8Array) are a must for transferring media files. Passing a byte array with JSON-RPC is only possible by first encoding it into a string or an array of numbers, which would slow down the communication because of the extra encoding/decoding steps, and would also rule out the Transferable objects optimization for postMessage. JSON-RPC does not offer anything special in return other than maturity. Also, for a low-level protocol such as ICRC-35 it is important to allow downstream subprotocols to define their own communication schemas freely. In other words, downstream clients of ICRC-35 can use JSON-RPC as their message format, while a JSON-RPC-based protocol would only allow a subset of JSON used downstream.
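The encoding cost mentioned above can be illustrated with a small sketch. The helper names `encodeForJson` and `decodeFromJson` are hypothetical, and the number-array encoding is just one of the two options named in the comment (the other being a string encoding such as base64). It only shows that the bytes must be converted in both directions and that the JSON text is several times larger than the raw data.

```typescript
// Illustrative only: helper names are hypothetical, not part of ICRC-35 or JSON-RPC.

// JSON has no binary type, so bytes must first be re-encoded (here: an array of numbers).
function encodeForJson(bytes: Uint8Array): string {
  return JSON.stringify({ payload: Array.from(bytes) });
}

// The receiver has to undo the encoding before it can use the bytes.
function decodeFromJson(text: string): Uint8Array {
  const parsed = JSON.parse(text) as { payload: number[] };
  return Uint8Array.from(parsed.payload);
}

// 1 KiB of sample "media" data, every byte set to 200.
const media = new Uint8Array(1024).fill(200);
const asJson = encodeForJson(media);
const decoded = decodeFromJson(asJson);

// Characters of JSON text needed per original byte.
const overhead = asJson.length / media.byteLength;

// With postMessage, the same bytes could be sent without re-encoding, and the
// underlying buffer could be transferred instead of copied, for example:
//   target.postMessage({ payload: media }, targetOrigin, [media.buffer]);
```

For this sample, each byte costs roughly four characters of JSON text, on top of the two conversion passes.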

I don't think the handshake is necessary. Internet Identity learns the origin from the post MessageEvent.origin. This should work in the context of ICRC-35 too, no?

The handshake serves two purposes.

First of all, it implements a best practice for connection establishment. A parent window spawns a child, and the child has to send the first message to signal its readiness for communication, because the parent does not know when the child is loaded and ready. Not everyone realizes that, and I thought it would be nice if the standard did that out of the box. So, if Internet Identity also sends a ready message to its parent window once loaded, you can say that it also implicitly implements a handshake. And judging by ICRC-29 this is exactly the case.

Another purpose of the handshake is to establish a secure connection. I assume that by saying this:

Internet Identity learns the origin from the post MessageEvent.origin

You meant that Internet Identity learns the origin of the parent window from the first received valid MessageEvent. So, in theory, another window could send a message to II before the parent.

ICRC-35's handshake disables this theoretical possibility - the child window tells the parent window a secret directly (by sending a message to window.opener). Then the parent window has to echo this secret back, and the child window only learns the parent's origin from this echo-message. So no other window could guess this secret and only the parent is able to communicate with the child. I do realize, that this is an unrealistic attack vector for all the major browsers, but you can't be too careful with security and such a solution adds almost no overhead, because message passing is nearly instant.
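The secret-echo handshake described above can be sketched as a small state machine, kept free of browser APIs so the logic is easy to follow. The message shape, the `ChildHandshake` class and the `echo` function are hypothetical names rather than the spec's definitions, and the secret is assumed to be generated from a cryptographically secure source by the caller.

```typescript
// Hypothetical message shape and class names; not taken from the ICRC-35 spec.
type HandshakeMessage = {
  kind: "handshake-init" | "handshake-echo";
  secret: string;
};

// Child-window side of the handshake.
class ChildHandshake {
  private readonly secret: string;
  private parentOrigin: string | null = null;

  // The secret is assumed to come from a cryptographically secure source.
  constructor(secret: string) {
    this.secret = secret;
  }

  // First message of the connection, sent directly to window.opener.
  init(): HandshakeMessage {
    return { kind: "handshake-init", secret: this.secret };
  }

  // Called for each incoming message together with its MessageEvent.origin.
  // Returns true when the message may be processed.
  accept(origin: string, message: HandshakeMessage): boolean {
    if (this.parentOrigin !== null) {
      // After the handshake, only the learned origin is accepted.
      return origin === this.parentOrigin;
    }
    if (message.kind === "handshake-echo" && message.secret === this.secret) {
      this.parentOrigin = origin; // learned only from a correct echo
      return true;
    }
    return false;
  }

  peer(): string | null {
    return this.parentOrigin;
  }
}

// Parent-window side: echo the received secret back to the child.
function echo(init: HandshakeMessage): HandshakeMessage {
  return { kind: "handshake-echo", secret: init.secret };
}
```

In a browser, `accept` would be called from a `message` event listener with `event.origin` and `event.data`.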

  • The fire-and-forget model does not have a route. Wouldn't it be easier for payload parsing if it had one similar to the request-response model?

You're right. I wanted to be very careful with the fire-and-forget model, since, I imagine, it would generally be used as a base for alternative models (for example, to pass streamlined data). So I tried to make it as low-level as possible, so as not to accidentally restrict some use cases. We should investigate this question further - maybe it would be a good idea to do what you say and add a route there as well.

  • What is the motivation for the ping pong messages?

The ping-pong game is used to "keep the connection alive". This is my fault, I'm sorry for that - you would have understood it better if I hadn't forgotten to make the reference implementation public.

Consider this scenario. A user clicks on the "Login with Internet Identity" button, which opens the II window. The handshake happens and right after that the user closes the original window, which they wanted to log in to with II. If that window didn't send any "Connection Closed" message, the II window has no way to learn that there is no window to communicate back to. This is where the ping-pong game is useful - by playing this game, the II window will eventually learn that the other window is not responsive and can process this information algorithmically, rendering something user-friendly.

This example is not very illustrative, because nothing bad would happen if a user closed the original window and went through the whole authorization flow with II - the browser runtime will replace the link to the parent window with null and this situation can be handled by simply checking this link. But you can imagine a similar situation in a generic scenario, where one window gets closed, but the other window waits for more messages from that window before it can continue (for example, it sent a request and awaits the response). In that case, there is no way for that promise to resolve, which might lock the business logic forever. The ping-pong game prevents that.
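A minimal sketch of the liveness check behind the ping-pong game follows. The `PeerLiveness` class, the `checkPending` function and the timeout value are hypothetical; the reference implementation may work differently. Timestamps are passed in explicitly so the logic does not depend on timers.

```typescript
// Hypothetical names and timeout values; ICRC-35 may define these differently.
class PeerLiveness {
  private lastPongMs: number;
  private readonly timeoutMs: number;

  constructor(connectedAtMs: number, timeoutMs: number) {
    this.lastPongMs = connectedAtMs;
    this.timeoutMs = timeoutMs;
  }

  // Call whenever a pong arrives from the peer.
  onPong(nowMs: number): void {
    this.lastPongMs = nowMs;
  }

  // The peer counts as alive while the last pong is recent enough.
  isAlive(nowMs: number): boolean {
    return nowMs - this.lastPongMs <= this.timeoutMs;
  }
}

type PendingState = "waiting" | "peer-unresponsive";

// Decide what to do with a request that is still awaiting its response.
function checkPending(liveness: PeerLiveness, nowMs: number): PendingState {
  return liveness.isAlive(nowMs) ? "waiting" : "peer-unresponsive";
}
```

When `checkPending` reports `"peer-unresponsive"`, the waiting side can reject the pending promise and show the user something friendly instead of hanging forever.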

  • The link to the reference implementation gives me a 404. Probably the repo is still set to private?

Yes. Sorry once again. This is fixed now.

Thank you once again for reading through the documentation. I can really see that you've dived deep into it and it warms my heart. If you have any additional questions, feel free to ask - I'll do my best to answer them as well as I can. If you want to have a Zoom call, we can also arrange that.

frederikrothenberger commented 6 months ago

Hi @seniorjoinu

If we replace the word signer with service provider, ICRC-25 and ICRC-35 are very close.

I don't see that. I'm unable to read that from the docs of ICRC-25, since they are about sessions, scopes and permissions (and how they allow better interoperability), while ICRC-35 docs are about the browser-based communication channel, which comes before sessions, scopes and permissions (and the documents state that we actually don't need these things in order to achieve what we want). Is there another version of the ICRC-25 spec, which is not in master? Maybe I'm just looking in the wrong place.

Maybe I should have clarified. What I meant (on a very high level) is that ICRC-25 using ICRC-29 for transport and ICRC-35 both describe communication between two websites using postMessages.

The interoperability through canister interface is only introduced in ICRC-33. We could as well add an ICRC-25 extension that builds on your proposed interoperability model and defines messages based on routes and an interface distinct from any canister API.

As such, we should try to align ICRC-25 (with ICRC-29) and ICRC-35 so that they are consistent.


Yes, ICRC-25 does a much better job of preventing these kinds of attacks via sessions, scopes and permissions (compared to MetaMask, for example), but it does not eliminate them. No matter how hard we try to improve it, there will always be a theoretical possibility of executing such an attack successfully. After a lot of thought about why we can't eliminate them, I came to the conclusion that the key reason is the power it gives to an end user. With great power comes great responsibility. And for some fraction of people (technically and mentally prepared to always stay alert to possible scams) this is an appropriate solution, since they can handle the responsibility. But this is not an appropriate solution for mass adoption, because regular folks are not technically prepared and don't have the capacity to always stay focused while surfing online.

I think we need to recognize that there is a trade-off here. If we take away too much power from the user, we stifle innovation and make it hard for developers to build new, exciting and composable applications.

We tried finding a good balance between openness, self-sovereignty and security in the ICRC-25 standard. But of course we can still try to improve further. I think this is a good and important discussion to have.

One crucial thing to recognize is that ICRC-25 does not force relying parties to open up to any signer. The standardization just makes it possible. Relying parties can still choose to only accept requests from signers they trust. This is a trade-off they have to make.

In my opinion reputable applications should guide the users regarding the signer choice. There should be recommended signers and a warning if the user chooses a signer that is not well known. We should also consider the possibility of a signer marketplace. This would allow users to choose signers based on their reputation and the services they offer. But ultimately, the user should be able to choose any signer they want. Otherwise, we will remain in a very locked-in ecosystem.

Just as an example, currently there is no way of using MSQ controlled neurons to vote in the NNS dapp. Wouldn't it be nice if users could simply attach MSQ to the NNS dapp and bring in the assets they have there? It would certainly make adoption easier for MSQ.

  1. (malicious SC) They know about this SP, but they have their corrupted motives to not integrate with it. This case is the most interesting one and it requires us to learn more about this new Web3 world. All of our dapps are web-based and DAO-governed. It means that their software is 100% open-source. It means that there is no technical know-how component in their success - every know-how gets published and re-used in hundreds of other dapps as soon as it appears. This means that their success is driven purely by the loyalty of their communities. It means that these dapps are motivated to always keep their communities happy - otherwise the communities will switch to a "clone" that offers better conditions. And if this Disk.ic is good and Photoshop.ic's users want it integrated, the team has to obey, to keep their users happy and their business successful. Otherwise, some other Better-Photoshop.ic will appear, which will copy the codebase, fulfilling all the wishes of the community and stealing the userbase, because it is very cheap to spin up a clone (especially in a world where you only have to deliver your own business logic and everything else is available through integrations).

I think this represents a very high barrier to entry. Also, there will definitely be widely used dapps that are neither open-source nor decentralized. Just looking at the ecosystem right now, there are many examples of this.

But I think whether the choice of signer / service provider is up to the user or not is independent of the interoperability model. I'd like to separate these two discussions. What do you think?


This is why I also have doubts about the UX capabilities of ICRC-25 and related specs:

Btw. I don't quite understand the criticism here regarding the permission messages. Could you elaborate why ICRC-21 is not sufficient? It offers exactly that: a human-readable and easily understandable, service-specific consent message.

Under-the-hood stuff, when presented to the end user, breaks the abstraction. A broken abstraction raises questions, which overheat the brain. An overheated brain loses focus and gets scammed.

While I agree with that statement, I still don't follow why you think this applies to ICRC-21.

ICRC-21 specifically does not expose any under-the-hood stuff to the user. It is explicitly designed to provide a human-readable and easily understandable, service-specific consent message for a user-tangible action. Canisters implementing ICRC-21 do have the explicit option of not providing a consent message for canister calls that should not be called by end-users through signers. Reputable signers should not allow calls to be made without a consent message.


Internet Identity learns the origin from the post MessageEvent.origin

You meant that Internet Identity learns the origin of the parent window from the first received valid MessageEvent. So, in theory, another window could send a message to II before the parent.

How would that work? Only the parent window has a handle to the II window.


In other words, downstream clients of ICRC-35 can use JSON-RPC as their message format, while a JSON-RPC-based protocol would only allow a subset of JSON used downstream.

That's a good point. We did not consider that yet in the context of ICRC-25. I will bring it up in the next session. @plitzenberger: What is your take on this?


To summarize, apart from the technical details, it seems, we have two bigger questions to discuss in the next working group session:

The answer to these questions is probably more nuanced than a simple decision between ICRC-25 and ICRC-35. I can imagine that there are use-cases that benefit from bespoke integrations with service providers. But I also think that there are many use-cases that can be covered by a generic canister based interface.

seniorjoinu commented 6 months ago

Hey @frederikrothenberger

Just as an example, currently there is no way of using MSQ controlled neurons to vote in the NNS dapp. Wouldn't it be nice if users could simply attach MSQ to the NNS dapp and bring in the assets they have there? It would certainly make adoption easier for MSQ.

Yes, it would be nice. But this is completely possible right now. All we have to do is add a "Login With MSQ" button to the NNS dapp (using the client library MSQ provides). And this responsibility lies with Dfinity - if Dfinity finds MSQ appropriate as an additional authentication provider, they could do this, allowing MSQ users to access NNS. So, it is not the users' call, it is Dfinity's call.

And I believe this distribution of responsibility is good for everyone. Because if Dfinity does that, it would signal to all the NNS users that Dfinity believes that MSQ is a trustworthy piece of software. And Dfinity would only do that once they have concluded some kind of review themselves.

If by saying that you mean: "But as the MSQ developer, don't you want your users to be able to stake NNS neurons, without you having to make arrangements with Dfinity?". No, I don't. Because I respect Dfinity's intention to protect their users from bad software. I mean, MSQ is good, but guys from Dfinity may have their own opinion about it and they have to be able to not let their users use MSQ to interact with NNS, if they don't want to.

But I think whether the choice of signer / service provider is up to the user or not is independent of the interoperability model. I'd like to separate these two discussions. What do you think?

Agree.

While I agree with that statement, I still don't follow why you think this applies to ICRC-21.

Just as I noted in the original text: My belief is that ICRC-25 Consent Messages won't help with that. Imagine every dapp on the IC implements these consent messages. Users see these pop-ups 100 times a day every day, because on the IC people sign requests much more frequently. People tend to tire from this kind of stuff very quickly. After some time these messages will be perceived as white noise and users will ask for a 'Disable warnings checkbox' feature from their wallet providers.

Elaborating on that, the problem is not in the content of consent messages - that part I'm completely happy with.

It was kinda hard to read that from the ICRC-21 docs, but it seems like a canister, in order to be interoperable, has to implement consent messages for every UPDATE function it has. I believe that will cause a lot of problems.

ICRC-21 treats IC as if it were Ethereum - a network with very infrequent (relative to Web2) transactions, while one of IC's main promises is high performance and infinite throughput. On IC a user may send a lot of transactions (update msgs) while visiting a dapp, without doing anything important. For example, HotOrNot counts views for each video. This can be optimized, but in a very naive implementation this would mean that each time a user sees a video, they have to send an UPDATE call. If each such call is accompanied by a consent message, that would be unpleasant.

How would that work? Only the parent window has a handle to the II window.

You're right. I didn't know sending the handle to other browser windows is disabled. I will remove secret-related logic from the handshake. Thank you!

As such, we should try to align ICRC-25 (with ICRC-29) and ICRC-35 so that they are consistent.

The answer to these questions is probably more nuanced than a simple decision between ICRC-25 and ICRC-35. I can imagine that there are use-cases that benefit from bespoke integrations with service providers. But I also think that there are many use-cases that can be covered by a generic canister based interface.

At this moment, the best way to align these two visions, as I see it, would be to merge ICRC-29 with ICRC-35. For signer-related standards, this merged standard would serve as a transport layer. But for the others it would keep the door for webpage API based integrations open.

I could close this PR and re-open it, replacing all 'ICRC-35' entries with 'ICRC-29'. I could also release the '35' number, so others could re-use it. Let me know, what you think.

frederikrothenberger commented 6 months ago

Hi @seniorjoinu

If by saying that you mean: "But as the MSQ developer, don't you want your users to be able to stake NNS neurons, without you having to make arrangements with Dfinity?". No, I don't. Because I respect Dfinity's intention to protect their users from bad software. I mean, MSQ is good, but guys from Dfinity may have their own opinion about it and they have to be able to not let their users use MSQ to interact with NNS, if they don't want to.

That was what I meant. But I respect that stance. 🙂

ICRC-21 treats IC as if it were Ethereum - a network with very infrequent (relative to Web2) transactions, while one of IC's main promises is high performance and infinite throughput. On IC a user may send a lot of transactions (update msgs) while visiting a dapp, without doing anything important. For example, HotOrNot counts views for each video. This can be optimized, but in a very naive implementation this would mean that each time a user sees a video, they have to send an UPDATE call. If each such call is accompanied by a consent message, that would be unpleasant.

Ah, now I understand the issue. So the intention never was for ICRC-25 to be the only mechanism for sending transactions. Rather, it would be an additional mechanism with extra security (for high value transactions only).

So a dapp would authenticate with some IDP and then connect the signer (or later, once supported by https://github.com/dfinity/wg-identity-authentication/blob/main/topics/icrc_34_get_delegation.md authenticate through the signer). Either way, the dapp would end up in a state where it has

A dapp should do the dapp internal updates / maintain session state through the delegation identity, without asking for consent. Only when the dapp wants to tap into shared assets (e.g. the ICP balance, modify ownership of a signer controlled canister, disburse maturity from a signer controlled neuron, etc.), there would be a user consent pop-up.
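The split described above could be sketched as a simple routing decision on the dapp side. Everything here is an assumption made for illustration: the `choosePath` function, the `sharedAssetMethods` set and its contents, and the example canister names are hypothetical and not defined by ICRC-25; a real dapp would decide per call which methods touch shared assets.

```typescript
// Hypothetical sketch: the types, names and the method list are illustrative only.
type CanisterCall = { canisterId: string; method: string };
type CallPath = "delegation-identity" | "signer-with-consent";

// Methods assumed (for this example) to touch shared, high-value assets.
const sharedAssetMethods = new Set<string>(["icrc1_transfer", "disburse_maturity"]);

// Dapp-internal updates go through the delegation identity without a prompt;
// calls touching shared assets go through the signer, which asks for consent.
function choosePath(call: CanisterCall): CallPath {
  return sharedAssetMethods.has(call.method)
    ? "signer-with-consent"
    : "delegation-identity";
}
```

Under these assumptions, a view counter update would stay on the delegation path, while a token transfer would trigger the consent pop-up.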

A fair criticism of that model is that it may not be straightforward to split a dapp into "low value & high transaction volume" and "high value & low transaction volume" parts. However, even just splitting out the obvious parts (e.g. token transactions) would be a big improvement: tokens could no longer be stolen if the dapp is compromised.

As the model matures, I think there will be more examples and best practices for devs to follow on how to split the transactions between the delegation identity and the signer identity.

At this moment, the best way to align these two visions, as I see it, would be to merge ICRC-29 with ICRC-35. For signer-related standards, this merged standard would serve as a transport layer. But for the others it would keep the door for webpage API based integrations open.

Currently, ICRC-29 is laser focused on the transport mechanism only (window post messages). Message content is out of scope and defined in ICRC-25 and its extensions. I like that model, because it allows swapping the transport mechanism independently of the message content.

Is there a reason why ICRC-35 is scoped to window post messages, rather than a more abstract notion of a transport mechanism? There might be value in having other transport channels available for ICRC-35 too, no? For example app switch on mobile?

Looking forward to discussing this further in tomorrow's working group session. I hope you'll attend!

seniorjoinu commented 6 months ago

Hi @frederikrothenberger

A dapp should do the dapp-internal updates / maintain session state through the delegation identity, without asking for consent. Only when the dapp wants to tap into shared assets (e.g. the ICP balance, modify ownership of a signer-controlled canister, disburse maturity from a signer-controlled neuron, etc.) would there be a user consent pop-up.

First of all, thank you for explaining this. As a newcomer, I found it hard to get this from the docs. I would suggest including this information, about how all the signer-related standards should interact with each other in practice, in the overview documentation. I mean a section written in very simple words, so that anybody could read it and understand the main idea. It should include details that might look obvious to WG members but are not obvious to outsiders, such as: Does that mean that all the other standards also only operate with these signer identities? Are signer identities scoped or shared across all dapps?

So, basically, the idea is for dapps (SPs) to prepare an integration surface (canister API) in advance. Such a surface should be well documented, covered with consent message endpoints, and only accessible through a signer identity. Each dapp has its own implementation details, so the API of each SP may differ, even if they provide very similar services. These APIs can only converge to a single one if they are standardized. Other dapps (SCs) would have to study the API of each SP and integrate with each of them manually. For that, they would request an identity from a signer that is responsible for making calls to each SP and, with the user's blessing, use that identity to interact with those SPs.

Is this correct?

If that is correct, then how is ICRC-33 different from ICRC-35 in terms of generalization:

To allow an SC to integrate with any user-provided SP the interactions need to be standardized. But the proposed interactions (by looking at the example) are very service specific. I.e. calling another service requires the SC to know the exact routes and parameters. It seems to me that having true interoperability between SCs and SPs is not possible with the proposed solution, or will require a huge amount of standardization...

This is why we took a more generic approach with ICRC-33: it allows calling services that the signer does not know and does not have a specific integration with.

In order for one service to integrate with another, it would still need to somehow invoke `icrc33_call_canister` with the right arguments. Revisiting your older comment, I've found this:

Looking at your perfect scenario, this could absolutely be implemented as an ICRC-25 extension that defines the appropriate file methods. Note that generic canister calls have been split into ICRC-33.

This makes me think that standardization of APIs is the proposed solution for generalization. But then why did you say that standardization is a bad idea in the context of ICRC-35?

It seems to me that having true interoperability between SCs and SPs is not possible with the proposed solution, or will require a huge amount of standardization...

Webpage APIs can also be standardized, so that all authentication providers, for example, would share the same APIs. Moreover, these APIs can be made part of an asset canister and so become part of the dapp's DAO-governed perimeter, which makes it hard to change them without notice. Just describe them in a file and make a standard that only allows calling the APIs listed in that file, the same way candid files work.
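A minimal sketch of that idea, assuming the file is a plain list of route names served by the SP: the manifest format and the names `ApiManifest`, `isAllowedRoute` and `prepareCall` are hypothetical and are not defined by ICRC-35.

```typescript
// Hypothetical sketch: ICRC-35 does not define this manifest format.
interface ApiManifest {
  routes: string[]; // webpage API routes the SP declares as callable
}

interface PreparedCall {
  route: string;
  payload: unknown;
}

// A route is callable only if the manifest lists it.
function isAllowedRoute(manifest: ApiManifest, route: string): boolean {
  return manifest.routes.indexOf(route) !== -1;
}

// Refuses to build a call for anything the manifest does not declare.
function prepareCall(manifest: ApiManifest, route: string, payload: unknown): PreparedCall {
  if (!isAllowedRoute(manifest, route)) {
    throw new Error("route not declared in manifest: " + route);
  }
  return { route: route, payload: payload };
}
```

Because the manifest would live next to the other assets, a change to the declared routes would go through the same governance as any other asset update.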

(By the way, about being able to do this UX flow in ICRC-25: how would that even work? Websites integrate with canister calls in ICRC-25, so the SC would have to render all of the integration-guiding screens itself, while in ICRC-35 each website renders its own screens. That is better, because each website knows its own implementation details better than anyone else and doesn't have to document all of them.)

I see fewer and fewer pros in ICRC-25 and the others. With each newly revealed detail, it looks more and more like ICRC-35 with extra steps and with unclear gain from those steps.

Currently, ICRC-29 is laser-focused on the transport mechanism only (window post messages). Message content is out of scope and defined in ICRC-25 and its extensions. I like that model, because it allows swapping the transport mechanism independently of the message content.

What's your take on aligning ICRC-35 with others?

Is there a reason why ICRC-35 is scoped to window post messages, rather than a more abstract notion of a transport mechanism? There might be value in having other transport channels available for ICRC-35 too, no? For example app switch on mobile?

The deal with ICRC-35 is that it is not about message content or how messages are passed between parties. It is about postMessage and webpage APIs. It doesn't make sense with a different transport channel, because the channel is the interoperability framework.

Moreover, you can integrate a lot more than just websites with it. For example, MSQ is a MetaMask extension (an extension of another browser extension), but you can still integrate with it via ICRC-35, because MSQ has a webpage. If your mobile app also has a webpage, you can integrate with it easily: you would just need an additional channel to pass data between that webpage and the mobile app (for example, deep linking). Having postMessage as the only transport channel makes the protocol much more universal than it would be if it allowed swapping channels.

Looking forward to discussing this further in tomorrow's working group session. I hope you'll attend!

I'm looking forward to that as well!

seniorjoinu commented 5 months ago

@frederikrothenberger Thanks a lot! I'm going to edit this PR so it aligns better with the file structure of the repo. Plus, I need to modify the Handshake phase, as we discussed.

I'm going to do it before the next WG call.

seniorjoinu commented 5 months ago

@frederikrothenberger I've made all the updates I wanted. Please confirm everything is fine so that I can merge this PR.

Thanks!

seniorjoinu commented 5 months ago

@frederikrothenberger Oh, I thought I could push. But: "Only those with [write access](https://docs.github.com/articles/what-are-the-different-access-permissions) to this repository can merge pull requests."