boltlabs-inc / zeekoe

Zero-knowledge layer-2 payment channels
MIT License

Build customerd architecture with an RPC interface #347

Open marsella opened 2 years ago

marsella commented 2 years ago

Motivation

The customer architecture was originally created to be completely ephemeral: it would execute an operation (like a payment) and then cease to exist. However, this was an incorrect assumption. The customer needs a chain watcher to be running at any time that it has an open channel, to monitor for closing behaviors.

We added a long-running watcher process, but this is not tightly linked to other customer processes. For example, it's possible to create a new channel without first starting the watcher, which is a protocol violation and can result in loss of funds #241. Instead, there should be a customer daemon, without which no ephemeral customer operations can be executed.

Goal

customerd is a long-running daemon process that runs in the background. It holds a database connection, maintains a chain watcher, and executes a queue of customer operations as they are requested.

Customer operations are initiated on the command line. They cannot run if the customerd is not initialized. They communicate with customerd using an RPC protocol.

The RPC protocol itself is a specification that describes how other processes can communicate with the daemon. Each current command line operation will have a corresponding well-defined request. These will need to be documented, probably in zkchannels-spec. There will be an additional command to kill the daemon and shut down the chain watcher.

Advantages of this approach include

Existing work

There's a (commented-out) daemon architecture in the customer watcher. This was designed to be a ping-only daemon that would check the chain when it got a ping from another function. The ping interaction is a Dialectic protocol. It sets up a Server, broadcasts on localhost, and refreshes the watcher on receiving a request.

I think the reusable part of this architecture is the Server. It was refactored in #146 to remove the TLS requirement, so requests can come in on the local network (the network options are now either TLS or TCP).

The server is parameterized by a dialectic protocol.

Next steps

marsella commented 2 years ago

TCP or TCP+TLS?

TCP is a transport protocol that describes how to move information from point A to point B. TLS is an encryption protocol layered on top of it. There's no impediment to using TLS with JSON-RPC. However, TLS requires certificates that the client trusts, which in practice usually means certificates signed by a certificate authority.

Most processes would be accessing the daemon from the same machine, so it's a bit of a weird trust model to require external validation of your local daemon -- if an attacker can impersonate and run a daemon on your local machine, you probably have bigger problems.

The tradeoffs are less clear to me for processes that access the daemon from a different machine on the same LAN, or for a scenario where we configure the daemon to be accessible from outside the LAN. I am willing to make the assumption that such access should not be possible, and use unencrypted communication (TCP only) for the daemon for now.
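One way to back up the "no access from other machines" assumption is to bind the listener to the loopback interface only, so the OS refuses remote connections. This is a minimal sketch using only the Rust standard library; `bind_local` is a hypothetical helper, not part of the existing Server code.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Bind a plain TCP listener on 127.0.0.1 only.
/// Passing port 0 asks the OS to pick any free port (an assumption made
/// for this sketch; a real daemon would read the port from configuration).
fn bind_local(port: u16) -> std::io::Result<TcpListener> {
    TcpListener::bind(SocketAddr::from((Ipv4Addr::LOCALHOST, port)))
}
```

Binding to loopback does not protect against other users or processes on the same machine; it only rules out the LAN and external cases discussed above.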

marsella commented 2 years ago

JSON-RPC Dialectic protocol

The JSON-RPC spec is straightforward: the client must send a Request. The server must respond with a Response, or with nothing if the Request is a notification type.

Then the dialectic protocol is probably going to be

  1. Customer chooses from two options (Request or Notification)
  2. If Notification, the customer sends a Notification
  3. If Request, the customer sends a Request and receives a Response
  4. Close the connection
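The steps above can be sketched without Dialectic as a plain function over the client's choice. This is only an illustration of the control flow; `ClientMessage`, `Response`, and `serve_one` are hypothetical names, and the method is represented as a bare string rather than real JSON.

```rust
/// The two kinds of message a client may choose to send (step 1).
#[derive(Debug, Clone, PartialEq)]
enum ClientMessage {
    /// Carries an `id`; the server must answer with a Response echoing it.
    Request { id: u64, method: String },
    /// Carries no `id`; the server must not answer.
    Notification { method: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Response {
    id: u64,
    body: String,
}

/// Steps 2 and 3: run the method, and reply only if the message was a Request.
/// Step 4 (closing the connection) is left to the caller.
fn serve_one(msg: ClientMessage, dispatch: impl Fn(&str) -> String) -> Option<Response> {
    match msg {
        ClientMessage::Request { id, method } => Some(Response {
            id,
            body: dispatch(&method),
        }),
        ClientMessage::Notification { method } => {
            // The method still runs, but its output is discarded.
            dispatch(&method);
            None
        }
    }
}
```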

The response will either have a result or an error, but I think it makes sense to encode this locally (that is, parse a Response type to get a Result<RpcResult, RpcError>) rather than as a second choice in the RPC network protocol.
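A rough sketch of that local encoding, assuming the wire-level Response has already been deserialized into optional `result` and `error` fields. The type layouts and the `into_result` method are hypothetical; using code -32603 (the JSON-RPC 2.0 "Internal error" code) for a malformed response is also just an assumption for this example.

```rust
#[derive(Debug, Clone, PartialEq)]
struct RpcResult(String);

#[derive(Debug, Clone, PartialEq)]
struct RpcError {
    code: i64,
    message: String,
}

/// Wire-level response. JSON-RPC 2.0 expects exactly one of
/// `result` or `error` to be present.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    id: u64,
    result: Option<String>,
    error: Option<RpcError>,
}

impl Response {
    /// Collapse the two optional fields into a single `Result`, so that
    /// success vs. failure never has to be a choice in the network protocol.
    fn into_result(self) -> Result<RpcResult, RpcError> {
        match (self.result, self.error) {
            (Some(value), None) => Ok(RpcResult(value)),
            (None, Some(err)) => Err(err),
            // Both or neither present: treat as a malformed response.
            _ => Err(RpcError {
                code: -32603,
                message: "response must contain exactly one of result or error".to_string(),
            }),
        }
    }
}
```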

I am not sure that this will be compatible with standard (non-Dialectic) RPC servers. The response-parsing is, but the initial choice is not. I think this issue has already been raised in Dialectic. However, I am willing to make this compromise in order to not re-write the server from scratch. It may be the case that our spec will not use any Notifications, in which case we can be fully compatible.

marsella commented 2 years ago

Queuing protocols

With the current chain-watching infrastructure, we want the daemon to check the chain once a minute, and we want it to process other requests sent in via RPC. Eventually, the chain watcher should get updated to a notification service, where the daemon receives push notifications and reacts to them as they arrive.

In general, requests cannot be parallelized (e.g. you can only execute one payment at a time). So the request queue should just be a normal queue, maybe holding futures, and then we can await them in order. When requests come in, they go at the back of the queue. When chain-watching steps come in, they go at the front. This could still cause chain-watching steps to get highly delayed, though, if e.g. the previous thing on the queue involves posting to chain and it takes 20 minutes.
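The single-queue idea described above could look roughly like this, using a `VecDeque` from the standard library. `Job` and `JobQueue` are hypothetical names, and the jobs are plain values rather than futures to keep the sketch short.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
enum Job {
    /// A periodic chain-watching step.
    CheckChain,
    /// A customer operation requested over RPC, named by a string here.
    Rpc(String),
}

#[derive(Default)]
struct JobQueue {
    jobs: VecDeque<Job>,
}

impl JobQueue {
    /// Chain-watching steps jump to the front; RPC requests wait at the back.
    fn submit(&mut self, job: Job) {
        if matches!(job, Job::CheckChain) {
            self.jobs.push_front(job);
        } else {
            self.jobs.push_back(job);
        }
    }

    /// Jobs are executed one at a time, always from the front.
    fn next(&mut self) -> Option<Job> {
        self.jobs.pop_front()
    }
}
```

Note that jumping the queue only affects jobs that have not started yet. A job that is already running is not interrupted, which is exactly the delay problem described above.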

Instead, the queue should just run as a separate task, like the merchant server. In this architecture, we spawn two tasks, one with a running server and one with a looping polling service. Both of these are expected to run forever, unless they encounter an error. If one of them raises a fatal error or if the server gets a "kill" request, they both shut down (in particular, the function ends without waiting for the uncompleted task to terminate).
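As a rough illustration of the "first task to stop ends everything" behavior, here is a sketch that swaps in standard-library threads and a channel in place of async tasks. `Exit` and `run_until_first_exit` are hypothetical names; the real daemon would presumably use its async runtime's equivalent.

```rust
use std::sync::mpsc;
use std::thread;

/// Which task stopped first, with a short reason.
#[derive(Debug, PartialEq)]
enum Exit {
    Server(String),
    Watcher(String),
}

/// Spawn the server and the polling watcher, then return as soon as either
/// one stops. The other thread is not joined, mirroring "the function ends
/// without waiting for the uncompleted task to terminate".
fn run_until_first_exit<S, W>(server: S, watcher: W) -> Exit
where
    S: FnOnce() -> String + Send + 'static,
    W: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let watcher_tx = tx.clone();
    thread::spawn(move || {
        // Ignore send errors: the receiver may already be gone.
        let _ = tx.send(Exit::Server(server()));
    });
    thread::spawn(move || {
        let _ = watcher_tx.send(Exit::Watcher(watcher()));
    });
    // Blocks until the first task reports; fails only if both tasks
    // panicked before sending anything.
    rx.recv().expect("both tasks ended without reporting")
}
```

With threads, the leftover task keeps running until the process exits; an async implementation could instead drop or abort the unfinished future.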

With this architecture, we don't need any kind of special queue algorithm. It's just a standard FIFO queue.

marsella commented 2 years ago

Server

Upon further reflection, I think we can't use our Server code until Dialectic fixes the self-documenting choice issue. We need to be able to reject Notifications (even if they aren't allowed in our protocol) by closing the channel, but Dialectic won't allow that without another choice message. A normal RPC client won't know what to do with that choice.

Next step: determine what the server in json-rpc provides. Look at other options for Rust JSON-RPC libraries and see what they provide. Edit this comment.