penumbra-zone / penumbra

Penumbra is a fully private proof-of-stake network and decentralized exchange for the Cosmos ecosystem.
https://penumbra.zone
Apache License 2.0

Architecture for Web Wallet/Interfaces #1373

Closed hdevalence closed 1 year ago

hdevalence commented 2 years ago

This issue is about designing the software architecture that will enable web-based wallets or other applications that interact with Penumbra.

Goals

We want to be able to support a wide variety of interfaces to Penumbra.

While it's important that there be at least one first-class wallet experience at launch, Penumbra's capabilities are multifaceted, and users will probably be best served by specialized interfaces: e.g., one for basic transfers or swaps, one for governance, one for power users to manage liquidity or examine the liquidity graph, etc.

Supporting a wide variety of interfaces is also important for decentralization: no one entity should "own" the Penumbra userbase via control of a single frontend, or be at risk of becoming a chokepoint for control of those users. This also means that it should be possible to build frontend interfaces to Penumbra that do not require custom backend infrastructure beyond an ordinary pd full node.

We want those interfaces to be as easy to build as interfaces to transparent chains.

Historically, interfaces to shielded chains have been more difficult to build than interfaces to transparent chains, because they require the application developer to manage synchronization of users' private state, whereas on a transparent chain user state is accessible via RPC. Penumbra is about privacy without compromise, so we want to make it as easy to build those interfaces for Penumbra as it is for a transparent chain.

We want our users to have security when interacting with those interfaces.

In order for users to benefit from the availability of a wide variety of third-party interfaces, users need to be confident they can use them without risk of losing funds or being hacked. We need to ensure that only the user can authorize a transaction, and that the user can understand exactly what actions they're authorizing when they do so.

Structure

We've already been prototyping a modular client architecture in pcli since April, so we have a good idea of how a Penumbra client decomposes into components with different levels of capability, and how those components interact with each other and with the network:

          ╭     ┌───────┐   custody
  spending│     │custody│   protocol
capability│     │service│◀─────────────────┐
          ╰     └───────┘                  │
                                           │
                                           │
                                           ▼
          ╭     ┌───────┐   view       ┌───────┐
   viewing│     │view   │   protocol   │wallet │
capability│     │service│◀────────────▶│ logic │
          ╰     └───────┘              └───────┘
                  ▲                      ▲ │
                  │                      │ │
                  ├──────────────────────┘ │
                  │client protocol         │tx
                  │(oblivious/specific)    │broadcast
                  │                        │                         .───.
                  │┌───────────────────────┘                       ,'     `.
          ╭   ┌───┼┼──────────────────────────────┐           .───;         :
    public│   │   ││             Penumbra Fullnode│          ;              │
     chain│   │   ││grpc/grpc-web                 │        .─┤              ├──.
      data│   │   ▼▼                              │      ,'                     `.
          │   │ ┌────┐ tm rpc proxy ┌──────────┐  │     ;               Penumbra  :
          │   │ │    │◀────────────▶│          │  │     :  ┌──────────▶ Network   ;
          │   │ │ pd │◀────────────▶│tendermint│◀─┼────────┘                     ╱
          │   │ └────┘   abci app   └──────────┘  │       `.     `.     `.     ,'
          ╰   └───────────────────────────────────┘         `───'  `───'  `───'

The client decomposes into three components: the custody service, which holds spending keys and is responsible for transaction authorization; the view service, which holds viewing keys and is responsible for syncing the private chain state; and the wallet logic itself, which queries the other two services and the full node over the protocols shown in the diagram.

(Later, we plan to simplify this in #1232, allowing clients to use gRPC to submit transactions, so they only have to communicate with a single pd endpoint).
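To make the split of capabilities concrete, here is a minimal TypeScript sketch of the three components and one "send" flow through them. Every name in it (TransactionPlan's fields, AuthorizationData, CustodyService, ViewService, FullNode, send, makeCustody) is hypothetical and chosen for illustration; none of it is Penumbra's actual protocol definition. The calls are synchronous and the fee is charged in the sent asset only to keep the sketch short.

```typescript
// All names here are hypothetical; they are not Penumbra's real protocol types.
interface TransactionPlan {
  actions: string[]; // human-readable descriptions of each action
  fee: number;
}

interface AuthorizationData {
  planDigest: string;
  signature: string;
}

// Spending capability: the only component that ever touches spend keys.
interface CustodyService {
  authorize(plan: TransactionPlan): AuthorizationData | null; // null means the user declined
}

// Viewing capability: knows the user's private state but cannot spend.
interface ViewService {
  balance(asset: string): number;
}

// Public chain data: only transaction broadcast is modelled here.
interface FullNode {
  broadcast(plan: TransactionPlan, auth: AuthorizationData): string; // returns a tx id
}

// Wallet logic: holds no spend key. It plans, asks for authorization, then broadcasts.
function send(
  view: ViewService,
  custody: CustodyService,
  node: FullNode,
  asset: string,
  amount: number,
  to: string,
): string {
  const fee = 1; // assumption: a flat fee paid in the same asset
  if (view.balance(asset) < amount + fee) {
    throw new Error("insufficient funds");
  }
  const plan: TransactionPlan = { actions: [`send ${amount} ${asset} to ${to}`], fee };
  const auth = custody.authorize(plan);
  if (auth === null) {
    throw new Error("user declined authorization");
  }
  return node.broadcast(plan, auth);
}

// In-memory stand-ins so the sketch runs on its own.
const digest = (plan: TransactionPlan): string => JSON.stringify(plan);

function makeCustody(approve: (summary: string[]) => boolean): CustodyService {
  return {
    authorize(plan) {
      // The custody side, not the wallet logic, builds what the user is shown.
      const summary = [...plan.actions, `fee: ${plan.fee}`];
      if (!approve(summary)) return null;
      return { planDigest: digest(plan), signature: "placeholder-signature" };
    },
  };
}

const demoView: ViewService = { balance: (asset) => (asset === "penumbra" ? 10 : 0) };
const broadcastLog: string[] = [];
const demoNode: FullNode = {
  broadcast(plan, auth) {
    if (auth.planDigest !== digest(plan)) {
      throw new Error("authorization does not match plan");
    }
    broadcastLog.push(auth.planDigest);
    return `tx-${broadcastLog.length}`;
  },
};
```

The point of the shape is that send never sees a key: it can be handed to untrusted code, and the worst it can do is ask the custody service for an authorization the user then declines.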

How do these components fit into a web interface, given the goals above?

The first point requires that the custody service be implemented in a browser extension. Then spending keys would stay inside the extension, and the extension could display its own UI to the user to authorize signing, showing the details of the TransactionPlan. The second point suggests that it would be better if that extension also provided the view service. Then the extension could do scanning and synchronization once, save the resulting data, and allow web content to query it.
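The "scan once, save the result, let web content query it" idea can be sketched as follows. This is an illustration under stated assumptions, not the real view service: CompactBlock, NotePayload, Note, TrialDecrypt and ExtensionViewService are hypothetical names, trial decryption is replaced by a toy string parser, blocks are assumed to arrive in ascending height order, and spent notes are not tracked.

```typescript
// Hypothetical shapes; real Penumbra blocks and notes look different.
interface NotePayload { ciphertext: string }
interface CompactBlock { height: number; payloads: NotePayload[] }
interface Note { asset: string; amount: number }

// Stand-in for trial decryption with a viewing key:
// returns a note if the payload belongs to this user, otherwise null.
type TrialDecrypt = (payload: NotePayload) => Note | null;

class ExtensionViewService {
  private notes: Note[] = [];
  private syncedHeight = 0;
  private tryDecrypt: TrialDecrypt;

  constructor(tryDecrypt: TrialDecrypt) {
    this.tryDecrypt = tryDecrypt;
  }

  // Scan only blocks above the saved height, so the work is done once
  // in the extension rather than once per web page.
  sync(blocks: CompactBlock[]): void {
    for (const block of blocks) {
      if (block.height <= this.syncedHeight) continue;
      for (const payload of block.payloads) {
        const note = this.tryDecrypt(payload);
        if (note !== null) this.notes.push(note);
      }
      this.syncedHeight = block.height;
    }
  }

  height(): number {
    return this.syncedHeight;
  }

  // Answered from saved data: no rescan is needed to serve a query.
  balance(asset: string): number {
    return this.notes
      .filter((n) => n.asset === asset)
      .reduce((sum, n) => sum + n.amount, 0);
  }
}

// Toy "decryption": payloads look like "owner:asset:amount".
const demoDecrypt: TrialDecrypt = (payload) => {
  const [owner, asset, amount] = payload.ciphertext.split(":");
  if (owner !== "mine" || asset === undefined || amount === undefined) return null;
  return { asset, amount: Number(amount) };
};
```

A page that calls balance gets an answer immediately, and calling sync again with blocks that were already scanned changes nothing.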

The resulting architecture looks like this:

              ┌───────────┐
              │ Extension │
          ╭   │ ┌───────┐ │ custody
  spending│   │ │custody│ │ protocol
capability│   │ │service│◀┼────────────────┐
          ╰   │ └───────┘ │                │
              │           │                │
              │           │          ┌─────┼───────────────────────┐
              │           │          │     ▼            Web Content│
          ╭   │ ┌───────┐ │ view     │ ┌───────┐                   │
   viewing│   │ │view   │ │ protocol │ │wallet │                   │
capability│   │ │service│◀┼──────────┼▶│ logic │                   │
          ╰   │ └───────┘ │          │ └───────┘                   │
              │   ▲       │          │   ▲ │                       │
              └───┼───────┘          └───┼─┼───────────────────────┘
                  ├──────────────────────┘ │
                  │client protocol         │tx
                  │(oblivious/specific)    │broadcast
                  │                        │                         .───.
                  │┌───────────────────────┘                       ,'     `.
          ╭   ┌───┼┼──────────────────────────────┐           .───;         :
    public│   │   ││             Penumbra Fullnode│          ;              │
     chain│   │   ││grpc/grpc-web                 │        .─┤              ├──.
      data│   │   ▼▼                              │      ,'                     `.
          │   │ ┌────┐ tm rpc proxy ┌──────────┐  │     ;               Penumbra  :
          │   │ │    │◀────────────▶│          │  │     :  ┌──────────▶ Network   ;
          │   │ │ pd │◀────────────▶│tendermint│◀─┼────────┘                     ╱
          │   │ └────┘   abci app   └──────────┘  │       `.     `.     `.     ,'
          ╰   └───────────────────────────────────┘         `───'  `───'  `───'

Design Questions

hdevalence commented 2 years ago

Having both the view service and the custody service in the same browser extension could let us build a Penumbra-specific version of the "connect wallet" flow users in other ecosystems are accustomed to:

Then, for instance, a DEX dashboard or a block explorer could request access to the user's view service, allowing it to query their private state.

In the future, we might also want to explore how invasive a change it would be to build the wallet logic without access to any of the viewing keys (FVK/IVK/OVK). Then we would not need to give viewing keys to the web content at all, and users could more meaningfully "revoke" access. In the current threat model, the web content has access to viewing keys, so if it were malicious, it could compromise privacy.
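One way to picture a "connect wallet" flow with revocation is a per-origin grant table kept inside the extension. The sketch below is an assumption about how such a table could look; Capability, ConnectionRegistry and handleViewRequest are hypothetical names, not an existing Penumbra or browser API. It only models the case where web content queries the view service through the extension. Revoking a grant stops future queries, but it cannot recall data already returned, or viewing keys if those were ever handed out.

```typescript
type Capability = "view" | "custody";

class ConnectionRegistry {
  private grants = new Map<string, Set<Capability>>();

  // Meant to be called only after the user approves a connect prompt
  // shown by the extension itself (the prompt is not modelled here).
  grant(origin: string, capability: Capability): void {
    const caps = this.grants.get(origin) ?? new Set<Capability>();
    caps.add(capability);
    this.grants.set(origin, caps);
  }

  // Drops every capability previously granted to the origin.
  revoke(origin: string): void {
    this.grants.delete(origin);
  }

  isAllowed(origin: string, capability: Capability): boolean {
    return this.grants.get(origin)?.has(capability) ?? false;
  }
}

// Gate a view query on the caller's origin.
// `query` stands in for any call into the view service.
function handleViewRequest<T>(
  registry: ConnectionRegistry,
  origin: string,
  query: () => T,
): T {
  if (!registry.isAllowed(origin, "view")) {
    throw new Error(`origin ${origin} is not connected`);
  }
  return query();
}
```

For example, after registry.grant("https://dex.example", "view") a dashboard at that origin can query balances, while a request from any other origin, or from the same origin after revoke, is rejected.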

mikedotexe commented 2 years ago

This past Saturday, a few interchain folks met up and we tossed around ideas about wallet architecture.

Notice how the 1Password browser extension causes OS X to pop up a prompt:

[Screen recording: Kapture 2022-09-08 at 00 07 38]

I believe they're using "deep linking", meaning a custom protocol (URL scheme) registered to the desktop app. A good example of this is Discord. In a browser's URL bar you can open the Discord web app with https://discord.com/channels/@me/123/456 but if you replace the https with discord, it will open in the desktop app.
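The scheme swap described above is a plain string rewrite. The helper below (withScheme is a hypothetical name) shows only that rewrite; registering the scheme with the operating system so that it launches a desktop app is platform specific and not shown, and whether a given app accepts the rewritten URL depends on that app.

```typescript
// Swap the scheme of an absolute URL, leaving everything after "://" untouched.
function withScheme(url: string, scheme: string): string {
  const separator = url.indexOf("://");
  if (separator < 0) {
    throw new Error(`not an absolute URL: ${url}`);
  }
  return `${scheme}${url.slice(separator)}`;
}

const webUrl = "https://discord.com/channels/@me/123/456";
const appUrl = withScheme(webUrl, "discord");
// appUrl is "discord://discord.com/channels/@me/123/456"
```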

Seems like 1Password has a good model to follow. A benefit of having a desktop app is that you can finally utilize Keychain Access, or whatever the operating system's security is. (This is probably where we should have been storing secrets this whole time, instead of in browser extensions.)

So how hard is it to create such a desktop app? Well, there's this thing called Tauri that is basically Electron (JavaScript » Desktop app) in Rust. https://tauri.app

You can see in their documentation they recommend a frontend build tool called Vite, but I imagine you could write straight HTML + CSS and avoid using any npm packages, avoiding that supply chain attack surface. The backend is Rust, and I think we could use this crate to talk to the OS's security. https://crates.io/crates/keytar

Making a desktop wallet might not be terribly difficult if there are existing crates for signing, etc.

However, it looks like deep links are still in progress for Tauri, though there are recent comments showing movement. https://github.com/tauri-apps/tauri/issues/323

In the meantime, since we probably want to use Rust if we can, there's this website showing the state of Rust GUI, https://www.areweguiyet.com, and it shows that core-foundation has perhaps a lot of usage, judging by the download numbers. (Of course, this is assuming building only for OS X users.)

We could keep all the wallet logic in Rust even though the skin would be done in Swift. So that's an option while we wait for (or help sponsor?) the Tauri folks' work on the deep link feature.

hdevalence commented 1 year ago

Closing this out as a completed design iteration.