brave / star-randsrv

Go wrapper service for the STAR randomness server.
Mozilla Public License 2.0

Support key persistence #175

Open rillian opened 1 year ago

rillian commented 1 year ago

For some applications, it would be helpful to persist the OPRF key across restarts, or to clone it among a cluster of instances. Implementing this is somewhat sensitive, since the whole point of the PPOPRF is to keep the private key private. Currently the ppoprf crate doesn't expose the private key.

I suggest the following design:

Add a new command-line switch so star-randsrv --generate-key will create a ppoprf::Server and dump the private key to stdout, then terminate. At startup, look for a STAR_RANDSRV_PRIVATE_KEY env variable, and if set, use that key to construct the OPRFServer state instead of a random one.

Terminating the application after generating the key separates that step from normal invocation, making it easier to keep the key material out of logs. Likewise with reading an existing key from the environment, rather than a command-line argument.

The shared key will be unpunctured. Passing the correct epoch synchronization arguments will take care of puncturing no-longer-valid epochs, just as with a random key.
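
As a concrete sketch of that startup flow (the OprfServer type below is a hypothetical stand-in for the real ppoprf wrapper, and hex encoding is an assumed serialization, not a settled format):

use std::env;
use std::process;

// Hypothetical stand-in for the ppoprf server wrapper; `new`,
// `from_key_bytes`, and `key_bytes` model whatever the crate ends up
// exposing. They are not its current API.
struct OprfServer;

impl OprfServer {
  fn new() -> Self {
    OprfServer
  }
  fn from_key_bytes(_key: &[u8]) -> Self {
    OprfServer
  }
  fn key_bytes(&self) -> Vec<u8> {
    vec![0u8; 32]
  }
}

fn main() {
  // --generate-key mode: dump the key to stdout and exit, so key
  // material never mixes with normal service logs.
  if env::args().any(|arg| arg == "--generate-key") {
    let server = OprfServer::new();
    for byte in server.key_bytes() {
      print!("{byte:02x}");
    }
    println!();
    process::exit(0);
  }

  // Normal startup: reuse a persisted key from the environment if
  // present, otherwise fall back to a fresh random key as today.
  let _server = match env::var("STAR_RANDSRV_PRIVATE_KEY") {
    Ok(hex) => {
      // Assumes an even-length hex string.
      let bytes: Vec<u8> = (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).expect("valid hex"))
        .collect();
      OprfServer::from_key_bytes(&bytes)
    }
    Err(_) => OprfServer::new(),
  };
  // ... start the HTTP service with `_server` ...
}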

To implement this, we will also need to extend the ppoprf crate with something like the following interface:

pub struct ServerPrivateKey(RistrettoScalar);

impl Server {
  pub fn get_private_key(&self) -> ServerPrivateKey {
    ...
  }
  pub fn from_private_key(key: &ServerPrivateKey) -> Self {
    ...
  }
}
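
With that in place, cloning state across a cluster is a round trip through the new type (a sketch; Server::new taking the list of epoch metadata tags is an assumption about the current crate):

// One metadata tag per epoch; 256 epochs as in star-randsrv today.
let epochs: Vec<u8> = (0..=255).collect();
let primary = Server::new(epochs).unwrap();    // fresh random key (assumed ctor)
let key = primary.get_private_key();           // proposed accessor
let replica = Server::from_private_key(&key);  // proposed constructor
// `replica` now evaluates the PPOPRF identically to `primary`.
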
DJAndries commented 1 year ago

Add a new command-line switch so star-randsrv --generate-key will create a ppoprf::Server and dump the private key to stdout, then terminate. At startup, look for a STAR_RANDSRV_PRIVATE_KEY env variable, and if set, use that key to construct the OPRFServer state instead of a random one.

Any reason why we can't have a private endpoint like /generate or something similar?

Terminating the application after generating the key separates the step from normal invocation, making it easier to keep the key material out of logs.

Is this why? Is there a reason why we can't simply omit the key information from the logs? When using the info log level, I don't see response payloads in the logs. Even if it did emit key information, would it matter if we're running the server in an enclave, where we can't see the logs anyway?

rillian commented 1 year ago

It's not just logs. Having a private HTTP server is error-prone; one routing mistake and you're publishing your private keys on the internet. Writing keys to stdout and reading them from the environment is safer.

@kdenhartog do you have a preference on this design element?

DJAndries commented 1 year ago

Having a private HTTP server is error-prone; one routing mistake and you're publishing your private keys on the internet.

What if we had a separate listener on some other port? That way we can have two separate routers to greatly reduce the chance of that happening.
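
Something like this with axum's 0.6-style API (routes, ports, and handlers are invented for illustration; the real private listener would presumably bind to loopback or an internal interface only):

use axum::routing::{get, put};
use axum::Router;
use std::net::SocketAddr;

#[tokio::main]
async fn main() {
  // Two fully separate routers: a route added to one can never be
  // reached through the other's listener.
  let public = Router::new().route("/randomness", get(|| async { "public" }));
  let private = Router::new().route("/key", put(|| async { "private" }));

  let public_addr = SocketAddr::from(([0, 0, 0, 0], 8080));
  // The private router binds to loopback only, on a different port.
  let private_addr = SocketAddr::from(([127, 0, 0, 1], 8081));

  let public_srv = axum::Server::bind(&public_addr).serve(public.into_make_service());
  let private_srv = axum::Server::bind(&private_addr).serve(private.into_make_service());

  // Run both listeners concurrently; an error on either tears down the service.
  let _ = tokio::try_join!(public_srv, private_srv);
}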

DJAndries commented 1 year ago

Just reviewed the key sync doc. Couldn't we just generate the key and send it to /enclave/state in nitriding, all in a single process?
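
If that pans out, it could be as small as one request from the same process (a sketch with reqwest; the nitriding internal port is a placeholder, and the payload format would need to follow whatever /enclave/state actually expects):

// Hand freshly generated key material to nitriding's internal state
// endpoint so it can sync it to other enclaves. The port and payload
// format here are placeholders, not nitriding's documented values.
fn push_key_to_nitriding(key_bytes: &[u8]) -> Result<(), reqwest::Error> {
  let client = reqwest::blocking::Client::new();
  client
    .put("http://127.0.0.1:8082/enclave/state")
    .body(key_bytes.to_vec())
    .send()?
    .error_for_status()?;
  Ok(())
}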

kdenhartog commented 1 year ago

I prefer ENV vars on startup over private endpoints because the only time we need to set this key is at start; from there, the service can perform the rotation itself. So, in theory, we reduce the possible side effects of this endpoint being called maliciously (if someone gets onto the network) if we go with the startup ENV variable approach.

Is there a reason we'd want to update this key other than at service startup?

rillian commented 1 year ago

We do need key rotation in this application. The PPOPRF key we're using is good for a fixed number (256) of randomness epochs. When those are exhausted, the server can't continue without a new key.

For an isolated server instance we can generate new keys and proceed from there; that is what the current code does. When the keying is controlled by some outside process (reloading with persistent state, propagating shared state across a cluster), we need a way to handle replacement at expiry.

One approach is to have star-randsrv terminate on key exhaustion and let the outer framework restart it with new material. But a private endpoint to poke in new keys is another way to handle that.
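
The terminate-on-exhaustion option is small to sketch (method names and signatures are approximate, not the crate's exact API):

// At the end of each epoch, puncture it; once all 256 epochs under
// this key are spent, exit so the outer framework (e.g. kubernetes)
// can restart us with new key material.
fn end_epoch(server: &mut ppoprf::Server, epoch: u8, epochs_left: &mut u32) {
  // Invalidate the epoch that just ended.
  server.puncture(epoch).expect("puncture failed");
  *epochs_left -= 1;
  if *epochs_left == 0 {
    eprintln!("PPOPRF key exhausted; exiting so the supervisor can rekey");
    std::process::exit(0);
  }
}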

kdenhartog commented 1 year ago

I was thinking it would be more like each node deterministically generating its next key instead of syncing between the different nodes. Would that be another possible option here?

rillian commented 1 year ago

@claucece said it was only safe to do that a few times.

That would still give us over a year of service even with daily epochs, which in practice may be longer than our kubernetes deployment stays up, so stretching the key for 3-4 rotations and then terminating to force a restart might work.
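
A sketch of that ratchet with a hard cap (the hash-based derivation and the cap of 4 are illustrative only; a real implementation would map the digest onto a Ristretto scalar, and the whole scheme would need review given the safety caveat above):

use sha2::{Digest, Sha256};

const MAX_ROTATIONS: u32 = 4;

// Derive key N+1 from key N deterministically, refusing to stretch the
// key past MAX_ROTATIONS; `None` means "terminate and rekey for real".
fn next_key(current: &[u8; 32], rotation: u32) -> Option<[u8; 32]> {
  if rotation >= MAX_ROTATIONS {
    return None;
  }
  let mut hasher = Sha256::new();
  hasher.update(b"star-randsrv key ratchet v1"); // domain separation label
  hasher.update(current);
  // NOTE: the real key is a Ristretto scalar, so the digest would still
  // need to be reduced, e.g. with Scalar::from_bytes_mod_order.
  Some(hasher.finalize().into())
}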

FWIW, our current thinking on state transfer with the nitriding proxy/sync daemon is to have star-randsrv (or a wrapper) pull new keys, rather than the nitriding sync daemon pushing. In either case, we can limit updates to times when star-randsrv needs new key material.

kdenhartog commented 1 year ago

Hmm, thinking about this a bit further in relation to the other key syncing issue, I'm noticing there's a broader pattern of needing to sync arbitrary data (secrets, and I suspect non-secrets in the future) between different nitro enclave pods. I'm starting to backtrack on my original thinking and trying to figure out how we generically solve data syncing between enclaves. On first thought, using the shim seems like a useful way to do this and to handle the authorization between the various pods (I like our key sync idea of using the container image). The shim would then pass the data into the enclave, which can register an arbitrary handler with the shim to validate the incoming data. In that case, having these internal endpoints open is probably the way to go, but we need some way to maintain at least integrity and confidentiality guarantees between the pods.

WDYT?

rillian commented 1 year ago

Yes, there are definitely two levels here. For the nitriding framework supporting execution within the enclave, we want a general solution for synchronizing configuration from both external and internal sources.

For the purposes of this repo, star-randsrv just needs a way to generate and ingest ppoprf keys. That could be an optional component that talks to the nitriding proxy, but it could also be something more general like the ENV-based proposal above, driven either by the nitriding daemon directly or through some shim script.

I'd like to maintain some separation between the two applications, since the randomness server is still useful outside a secure enclave.

kdenhartog commented 1 year ago

I'd like to maintain some separation between the two applications, since the randomness server is still useful outside a secure enclave.

SGTM