hdevalence commented 2 years ago

This issue is about designing the software architecture that will enable web-based wallets or other applications that interact with Penumbra.

Goals

We want to be able to support a wide variety of interfaces to Penumbra.

While it's important that there be at least one first-class wallet experience at launch, Penumbra's capabilities are multifaceted, and users will probably be best served by specialized interfaces: e.g., one for basic transfers or swaps, one for governance, one for power users to manage liquidity or examine the liquidity graph, etc.

Supporting a wide variety of interfaces is also important for decentralization: no one entity should "own" the Penumbra userbase via control of a single frontend, or be at risk of becoming a chokepoint for control of those users. This also means that it should be possible to build frontend interfaces to Penumbra that do not require custom backend infrastructure beyond an ordinary pd full node.

We want those interfaces to be as easy to build as interfaces to transparent chains.

Historically, interfaces to shielded chains have been more difficult to build than interfaces to transparent chains, because they require the application developer to manage synchronization of users' private state, whereas on a transparent chain user state is accessible via RPC. Penumbra is about privacy without compromise, so we want to make it as easy to build those interfaces for Penumbra as it is for a transparent chain.

We want our users to have security when interacting with those interfaces.

In order for users to benefit from the availability of a wide variety of third-party interfaces, users need to be confident they can use them without risk of losing funds or being hacked. We need to ensure that only the user can authorize a transaction, and that they can understand exactly what actions they're authorizing when they do so.

Structure

We've already been prototyping a modular client architecture in pcli since April, so we have a good idea of how a Penumbra client decomposes into components with different levels of capability, and how those components interact with each other and with the network:

          ╭     ┌───────┐   custody
  spending│     │custody│   protocol
capability│     │service│◀─────────────────┐
          ╰     └───────┘                  │
                                           │
                                           │
                                           ▼
          ╭     ┌───────┐   view       ┌───────┐
   viewing│     │view   │   protocol   │wallet │
capability│     │service│◀────────────▶│ logic │
          ╰     └───────┘              └───────┘
                  ▲                      ▲ │
                  │                      │ │
                  ├──────────────────────┘ │
                  │client protocol         │tx
                  │(oblivious/specific)    │broadcast
                  │                        │                         .───.
                  │┌───────────────────────┘                       ,'     `.
          ╭   ┌───┼┼──────────────────────────────┐           .───;         :
    public│   │   ││             Penumbra Fullnode│          ;              │
     chain│   │   ││grpc/grpc-web                 │        .─┤              ├──.
      data│   │   ▼▼                              │      ,'                     `.
          │   │ ┌────┐ tm rpc proxy ┌──────────┐  │     ;               Penumbra  :
          │   │ │    │◀────────────▶│          │  │     :  ┌──────────▶ Network   ;
          │   │ │ pd │◀────────────▶│tendermint│◀─┼────────┘                     ╱
          │   │ └────┘   abci app   └──────────┘  │       `.     `.     `.     ,'
          ╰   └───────────────────────────────────┘         `───'  `───'  `───'

The client decomposes into three components: the custody service, which holds spending keys and is responsible for transaction authorization, the view service, which holds viewing keys and is responsible for syncing the private chain state, and the wallet logic itself, which queries:

the view service (using the gRPC view protocol) to learn about state like account balances;
the custody service (using the gRPC custody protocol) with requests to authorize a planned transaction;
a pd full node (using the oblivious or specific gRPC protocols) for public chain state, or a gRPC proxy to Tendermint for transaction submission;
~~a tendermint instance (using Tendermint's JSON-RPC endpoint) to broadcast transactions.~~

~~(Later, we plan to simplify this in #1232, allowing clients to use gRPC to submit transactions, so they only have to communicate with a single pd endpoint).~~
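
As a rough illustration of how the wallet logic ties the three interfaces together, here is a minimal TypeScript sketch. The interface names, method names, and message fields (`ViewService.balance`, `CustodyService.authorize`, `FullNode.broadcast`, the shape of `TransactionPlan`) are assumptions made up for this example and do not mirror the actual gRPC definitions; only the split of responsibilities follows the description above.

```typescript
// Hypothetical, simplified message shapes (not the real protobuf types).
interface TransactionPlan {
  actions: string[];
  fee: number;
}
interface AuthorizationData {
  signatures: string[];
}
interface Transaction {
  plan: TransactionPlan;
  auth: AuthorizationData;
}

// Holds viewing keys; answers questions about private state.
interface ViewService {
  balance(asset: string): Promise<number>;
}
// Holds spending keys; the only component that can authorize a plan.
interface CustodyService {
  authorize(plan: TransactionPlan): Promise<AuthorizationData>;
}
// Public chain data and transaction submission.
interface FullNode {
  broadcast(tx: Transaction): Promise<string>;
}

// Wallet logic: holds no spending keys, only orchestrates the other components.
async function send(
  view: ViewService,
  custody: CustodyService,
  node: FullNode,
  asset: string,
  amount: number,
): Promise<string> {
  const available = await view.balance(asset);
  if (available < amount) {
    throw new Error(`insufficient ${asset}: have ${available}, need ${amount}`);
  }
  const plan: TransactionPlan = { actions: [`send ${amount} ${asset}`], fee: 0 };
  const auth = await custody.authorize(plan);
  return node.broadcast({ plan, auth });
}
```

Because `send` only sees the three interfaces, the same wallet logic could in principle run against in-process services (as in pcli) or against services living somewhere else.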

How do these components fit into a web interface, given the goals above?

We need to isolate the spending keys, and thus the custody service, from any web content. This is important for two reasons: first, compromised or malicious web content should not be able to access spending keys, and second, the signing interface needs to display details on a secure path, so that compromised or malicious web content cannot phish users by pretending to sign one transaction while actually signing a different one.
The scanning and synchronization performed by the view service should ideally be done once per device, rather than being duplicated across every interface.

The first point requires that the custody service be implemented in a browser extension. Then spending keys would stay inside the extension, and the extension could display its own UI to the user to authorize signing, showing the details of the TransactionPlan. The second point suggests that it would be better if that extension also provided the view service. Then the extension could do scanning and synchronization once, save the resulting data, and allow web content to query it.
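
To make the "secure path" idea concrete, here is a small TypeScript sketch of a custody service that only signs what it has itself displayed to the user. Everything here is hypothetical: `ExtensionCustody`, `describePlan`, the `Prompt` callback, and the simplified `TransactionPlan` shape are invented for illustration, and `placeholderSign` is a toy checksum standing in for real signing, not cryptography.

```typescript
// Hypothetical, simplified plan shape (not the real TransactionPlan).
interface TransactionPlan {
  actions: string[];
  fee: number;
}

// Renders the plan for the extension's own approval UI.
function describePlan(plan: TransactionPlan): string[] {
  return [...plan.actions, `fee: ${plan.fee}`];
}

// Stand-in for the extension's approval popup; resolves to the user's choice.
type Prompt = (lines: string[]) => Promise<boolean>;

// NOT a real signature: a toy rolling checksum used only as a placeholder.
function placeholderSign(key: string, plan: TransactionPlan): string {
  const input = key + JSON.stringify(plan);
  let h = 0;
  for (let i = 0; i < input.length; i++) {
    h = (h * 31 + input.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

class ExtensionCustody {
  // The spending key is private and has no accessor, so callers
  // (that is, web content) can request authorization but never read the key.
  constructor(
    private readonly spendKey: string,
    private readonly prompt: Prompt,
  ) {}

  async authorize(plan: TransactionPlan): Promise<string> {
    // The description is derived from the exact plan that will be signed,
    // so the caller cannot show one transaction and sign another.
    const approved = await this.prompt(describePlan(plan));
    if (!approved) {
      throw new Error("user rejected transaction");
    }
    return placeholderSign(this.spendKey, plan);
  }
}
```

The point of the design is that the web content supplies only the plan; the text shown to the user and the signature are both computed inside the extension from that same plan.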

The resulting architecture looks like this:

              ┌───────────┐
              │ Extension │
          ╭   │ ┌───────┐ │ custody
  spending│   │ │custody│ │ protocol
capability│   │ │service│◀┼────────────────┐
          ╰   │ └───────┘ │                │
              │           │                │
              │           │          ┌─────┼───────────────────────┐
              │           │          │     ▼            Web Content│
          ╭   │ ┌───────┐ │ view     │ ┌───────┐                   │
   viewing│   │ │view   │ │ protocol │ │wallet │                   │
capability│   │ │service│◀┼──────────┼▶│ logic │                   │
          ╰   │ └───────┘ │          │ └───────┘                   │
              │   ▲       │          │   ▲ │                       │
              └───┼───────┘          └───┼─┼───────────────────────┘
                  ├──────────────────────┘ │
                  │client protocol         │tx
                  │(oblivious/specific)    │broadcast
                  │                        │                         .───.
                  │┌───────────────────────┘                       ,'     `.
          ╭   ┌───┼┼──────────────────────────────┐           .───;         :
    public│   │   ││             Penumbra Fullnode│          ;              │
     chain│   │   ││grpc/grpc-web                 │        .─┤              ├──.
      data│   │   ▼▼                              │      ,'                     `.
          │   │ ┌────┐ tm rpc proxy ┌──────────┐  │     ;               Penumbra  :
          │   │ │    │◀────────────▶│          │  │     :  ┌──────────▶ Network   ;
          │   │ │ pd │◀────────────▶│tendermint│◀─┼────────┘                     ╱
          │   │ └────┘   abci app   └──────────┘  │       `.     `.     `.     ,'
          ╰   └───────────────────────────────────┘         `───'  `───'  `───'

Design Questions

[x] How does the web content, or the browser extension, speak to a pd full node? What is required to use gRPC from inside of a browser, and what changes do we need to make on the pd side to enable web pages to make gRPC queries?
- gRPC can't be used directly from inside of web content, because it requires HTTP/2 features that browsers do not expose. grpc-web is a workaround that uses a special proxy. tonic-web is the Tonic library for grpc-web support. Can we make the pd gRPC endpoints speak both normal gRPC as well as grpc-web simultaneously?
- Yes: #1400 adds native support for grpc-web to pd.
[x] How does web content communicate with the browser extension? Can we use the existing gRPC view and custody protocols as-is somehow, or do we need to define a "browser version" of those protocols?
- We need to define a browser version of those protocols, but we may be able to reuse message types.
[x] How does the browser extension store state? In the native-Rust view service implementation, we use SQLite to store state. Do we need to use a browser storage API instead?
- Yes, we need to use the browser's native storage APIs.
[x] How much code reuse is possible between the existing native-Rust view service and the browser extension? It's possible to compile the Rust code that does cryptography or data structures to WASM, but it's probably not (?) possible to compile all of Tokio, so the extension might need to replicate the sync logic in TypeScript.
- Yes, the extension needs to replicate the sync logic.
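
For readers unfamiliar with grpc-web, mentioned in the first question above: since browsers do not expose HTTP/2 trailers, grpc-web carries them inside the response body, using the usual gRPC length-prefixed framing (one flag byte, a 4-byte big-endian length, then the payload) and setting the most significant bit of the flag byte on the trailers frame. The TypeScript below is a minimal sketch of just that framing, for illustration; the function names are made up here, and a real client would rely on an existing grpc-web library rather than hand-rolled code.

```typescript
// Flag bit that marks a frame as carrying trailers instead of a message.
const TRAILER_FLAG = 0x80;

interface Frame {
  isTrailer: boolean;
  payload: Uint8Array;
}

// Wraps a payload in the 5-byte prefix: flags (1 byte) + length (4 bytes, big-endian).
function encodeFrame(payload: Uint8Array, isTrailer = false): Uint8Array {
  const frame = new Uint8Array(5 + payload.length);
  frame[0] = isTrailer ? TRAILER_FLAG : 0;
  new DataView(frame.buffer).setUint32(1, payload.length, false);
  frame.set(payload, 5);
  return frame;
}

// Splits a complete response body into its frames.
function decodeFrames(body: Uint8Array): Frame[] {
  const frames: Frame[] = [];
  const view = new DataView(body.buffer, body.byteOffset, body.byteLength);
  let offset = 0;
  while (offset + 5 <= body.length) {
    const flags = view.getUint8(offset);
    const length = view.getUint32(offset + 1, false);
    const start = offset + 5;
    if (start + length > body.length) {
      throw new Error("truncated frame");
    }
    frames.push({
      isTrailer: (flags & TRAILER_FLAG) !== 0,
      payload: body.subarray(start, start + length),
    });
    offset = start + length;
  }
  if (offset !== body.length) {
    throw new Error("trailing bytes after last frame");
  }
  return frames;
}
```

This sketch ignores compression (the low bit of the flag byte) and streaming reads, both of which a real client has to handle.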

hdevalence commented 2 years ago

Having both the view service and the custody service in the same browser extension could let us build a Penumbra-specific version of the "connect wallet" flow users in other ecosystems are accustomed to:

access to the view service could be granted by the user with a "Connect Wallet (Read-Only)" prompt;
access to the custody service would go through the extension's authorization UI, as normal.

Then, for instance, a DEX dashboard or a block explorer could request access to the user's view service, allowing it to query their private state.
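
One way to picture this flow is a per-origin grant table inside the extension. The TypeScript sketch below is purely illustrative: `ConnectionRegistry`, its methods, and the decision strings are hypothetical names, not part of any existing Penumbra extension.

```typescript
type Capability = "view" | "custody";
type Decision = "allow" | "prompt-connect" | "prompt-authorize";

class ConnectionRegistry {
  // Origins the user has connected via "Connect Wallet (Read-Only)".
  private readonly viewGrants = new Set<string>();

  // Call only after the user accepts the read-only connect prompt.
  grantView(origin: string): void {
    this.viewGrants.add(origin);
  }

  // Stops future view queries from this origin.
  revokeView(origin: string): void {
    this.viewGrants.delete(origin);
  }

  // Decides how the extension should treat an incoming request.
  check(origin: string, capability: Capability): Decision {
    if (capability === "custody") {
      // Custody is never pre-granted: every request goes through
      // the extension's authorization UI.
      return "prompt-authorize";
    }
    return this.viewGrants.has(origin) ? "allow" : "prompt-connect";
  }
}
```

For example, after `grantView("https://dex.example")`, a view query from that origin would be allowed while an authorization request from the same origin would still open the approval UI. Note that `revokeView` only blocks further queries; it cannot take back data the origin has already received.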

In the future, we might also want to explore how invasive of a change it would be to build the wallet logic without access to any of the viewing keys (FVK/IVK/OVK), so that we could not even give viewing keys to the web content, and allow users to more meaningfully "revoke" access (whereas in the current threat model, the web content has access to viewing keys, so if it were malicious, it could compromise privacy).

mikedotexe commented 2 years ago

This past Saturday, a few interchain folks met up and we tossed around ideas about wallet architecture.

Notice how the 1Password browser extension causes OS X to pop up a prompt:

[Screen recording: Kapture 2022-09-08 at 00 07 38]

I believe they're using "deep linking", meaning a custom protocol (URL scheme) registered to the desktop app. A good example of this is Discord. In a browser's URL bar you can open the Discord web app with https://discord.com/channels/@me/123/456, but if you replace the https with discord, it will open in the desktop app.

Seems like 1Password has a good model to follow. A benefit of having a desktop app is that you can finally utilize Keychain Access, or whatever the operating system's security is. (This is probably where we should have been storing secrets this whole time, instead of in browser extensions.)

So how hard is it to create such a desktop app? Well, there's this thing called Tauri that is basically Electron (JavaScript » Desktop app) in Rust. https://tauri.app

You can see in their documentation they recommend a frontend framework called Vite, but I imagine you could write straight HTML + CSS and avoid using any npm packages, avoiding that supply chain attack. The backend is Rust, and I think we could use this crate to talk to the OS's security. https://crates.io/crates/keytar

Making a desktop wallet might not be terribly difficult if there are existing crates for signing, etc.

However, it looks like deep links are still in progress for Tauri, though there are recent comments showing movement. https://github.com/tauri-apps/tauri/issues/323

In the meantime, since we probably want to use Rust if we can, there's this website showing the state of Rust GUI: https://www.areweguiyet.com. It shows that core-foundation seems to have a lot of usage, judging by the download numbers. (Of course, this is assuming building only for OS X users.)

We could keep all the wallet logic in Rust even though the skin would be done in Swift. So that's an option while we wait for (or help sponsor?) the Tauri folks' work on the deep link feature.

hdevalence commented 1 year ago

Closing this out as a completed design iteration.

penumbra-zone / penumbra

Architecture for Web Wallet/Interfaces #1373