near / mpc

🔷 [Epic] Multichain #326

Closed itegulov closed 1 week ago

itegulov commented 11 months ago

Description

Multichain is a total overhaul of the mpc-recovery service that will move us away from aggregated signatures towards a threshold signature scheme (TSS). The Multichain network will store users' private keys in a non-custodial way by splitting each key into key shares and distributing the shares among multiple independent parties.

Let P be the number of trusted parties holding the key shares.
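
To make "splitting keys into key shares" concrete, here is a minimal sketch of Shamir secret sharing over a toy prime field. It is an illustration only and not the scheme used by Multichain: the field, the function names and the threshold parameter are all assumptions made for the example.

```python
import secrets

# Toy prime field for illustration only (2**127 - 1 is a Mersenne prime).
PRIME = 2**127 - 1


def split_secret(secret: int, parties: int, threshold: int) -> list[tuple[int, int]]:
    """Split `secret` into `parties` shares; any `threshold` of them recover it."""
    if not 1 <= threshold <= parties:
        raise ValueError("threshold must be between 1 and the number of parties")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a field element")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coeff in reversed(coeffs):  # Horner's rule
            result = (result * x + coeff) % PRIME
        return result

    # Party i (1-based) receives the point (i, f(i)).
    return [(x, evaluate(x)) for x in range(1, parties + 1)]


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789
shares = split_secret(key, parties=5, threshold=3)  # P = 5 in the notation above
```

In a threshold signature scheme the full key is never reassembled in one place; `recover_secret` exists here only to show that any `threshold` shares determine the key.
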

Reasons for transition are:

Roughly, the new flow is going to look like this:

  1. Developer D writes and deploys a smart contract C @ my_cool_app.near
  2. User U calls function foo on C
  3. foo makes a cross-contract call to multichain.near (a contract deployed by us), requesting sign(payload), where payload is produced by C's internal logic
  4. The P participants index incoming calls to multichain.near/sign and see the transaction above
  5. They follow the MPC cryptographic protocol and generate a signature S for the submitted payload
  6. One of the participants submits S to multichain.near/response
  7. D has an indexer that watches for multichain.near/response and once it sees the transaction above it registers S as the response to U's interaction with C
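
The flow above can be sketched as a small in-memory simulation. Everything here is hypothetical: `MultichainContract` stands in for multichain.near, `placeholder_mpc_sign` replaces the actual MPC protocol with a plain hash (it is not a signature), and the indexers are reduced to direct reads of the contract's state.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MultichainContract:
    """Hypothetical in-memory stand-in for multichain.near."""
    sign_calls: list = field(default_factory=list)  # log of sign(payload) calls
    responses: dict = field(default_factory=dict)   # payload -> signature S

    def sign(self, caller: str, payload: bytes) -> None:
        # Step 3: contract C requests a signature for its payload.
        self.sign_calls.append((caller, payload))

    def respond(self, payload: bytes, signature: str) -> None:
        # Step 6: one participant submits S.
        self.responses[payload] = signature


def placeholder_mpc_sign(payload: bytes) -> str:
    # Stand-in for step 5. NOT a real signature, just a deterministic hash.
    return hashlib.sha256(b"mpc-placeholder:" + payload).hexdigest()


def run_participants(contract: MultichainContract, already_seen: int) -> int:
    """Steps 4-6: index new sign calls, produce S, submit the response."""
    for _caller, payload in contract.sign_calls[already_seen:]:
        contract.respond(payload, placeholder_mpc_sign(payload))
    return len(contract.sign_calls)  # new cursor into the call log


def developer_indexer(contract: MultichainContract, payload: bytes):
    """Step 7: D looks up the response for U's interaction, if any."""
    return contract.responses.get(payload)


contract = MultichainContract()
contract.sign("my_cool_app.near", b"hello")       # steps 2-3
pending = developer_indexer(contract, b"hello")   # nothing signed yet
cursor = run_participants(contract, already_seen=0)
signature = developer_indexer(contract, b"hello")
```

The point of the sketch is the asynchrony: the call to `sign` returns nothing useful, and the response only becomes visible after the participants have processed the request.
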

Resources

Overview of cryptography behind multichain: https://docs.google.com/document/d/1FKC9LvVyrEq6CiFYCnUFtQfvaDWddxzdtaEQb-_fq_s/edit#heading=h.we4ish11290u

This epic presumes that future work is happening on top of https://github.com/near/mpc-recovery/pull/313

### Cryptography
- [ ] https://github.com/near/mpc-recovery/issues/328
- [ ] https://github.com/near/mpc-recovery/issues/341
- [x] https://github.com/near/mpc-recovery/issues/385
- [ ] https://github.com/near/mpc-recovery/issues/384
- [ ] Ensure we can have multiple signing state machines progressing at the same time
- [ ] https://github.com/near/mpc-recovery/issues/456
- [ ] (Optional). Reshare and run at the same time to ensure the liveness of the protocol (i.e. there is no need to abort ongoing sign requests)
- [ ] https://github.com/near/mpc-recovery/issues/352
- [ ] Occasionally reshare the key even when the participant set does not change. According to Michel, this should be done once every 24 hours to 1 week (a security measure, not necessary for the March release IMO)
- [ ] https://github.com/near/mpc-recovery/issues/386
- [x] Persistent Beaver triples and presignatures. Right now a node loses all of its triples/presignatures on restart, rendering them useless for other nodes. (This can complicate the protocol and introduce many edge cases.)
- [ ] (Optional). Some mechanism for deciding who is tampering with the protocol messages. It is impossible to tell whose message broke a protocol step in cait-sith, but in theory some karma system might help here (-1 for being part of a set that failed to complete a protocol step). Note that there is no incentive to behave badly intentionally, so this is arguably very optional.
- [ ] https://github.com/near/mpc-recovery/issues/439
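
A minimal sketch of the karma idea from the list above, assuming node identifiers are plain strings. `KarmaTracker` and its methods are hypothetical names, not part of the codebase.

```python
from collections import defaultdict


class KarmaTracker:
    """Penalizes every member of a participant set that failed a protocol step."""

    def __init__(self) -> None:
        self.scores = defaultdict(int)

    def record_failed_step(self, participants) -> None:
        # We cannot tell whose message broke the step, so everyone gets -1.
        for node in set(participants):
            self.scores[node] -= 1

    def most_suspect(self, count: int = 1) -> list:
        # Lowest karma first; ties are broken by node id for determinism.
        return sorted(self.scores, key=lambda node: (self.scores[node], node))[:count]


karma = KarmaTracker()
karma.record_failed_step(["a", "b", "c"])
karma.record_failed_step(["b", "d"])
karma.record_failed_step(["b", "c"])
```

A node that appears in every failing set accumulates the lowest score over time, which is only a statistical hint and not proof of misbehavior.
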
### Network
- [ ] https://github.com/near/mpc-recovery/issues/329
- [ ] https://github.com/near/mpc-recovery/issues/381
- [ ] https://github.com/near/mpc-recovery/issues/382
- [ ] https://github.com/near/mpc-recovery/issues/445
- [ ] https://github.com/near/mpc-recovery/issues/405
- [ ] Proper message queue for messages from various epochs and states. Might even make it persistent? Important that it does not leak memory
- [ ] Versioned networking: use protobuf to keep track of compatibility, and make sure that every two adjacent versions are backwards-compatible
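
One possible shape for a message queue that does not leak memory, sketched under the assumption that epochs are increasing integers. The class name and the `max_per_epoch` and `max_lookahead` limits are hypothetical, and persistence is left out.

```python
from collections import deque


class EpochMessageQueue:
    """Buffers messages per epoch; memory is bounded by
    (max_lookahead + 1) * max_per_epoch messages."""

    def __init__(self, max_per_epoch: int = 1024, max_lookahead: int = 2) -> None:
        self.max_per_epoch = max_per_epoch
        self.max_lookahead = max_lookahead
        self.current_epoch = 0
        self.queues = {}

    def push(self, epoch: int, message) -> bool:
        # Reject stale messages and messages too far in the future.
        if epoch < self.current_epoch or epoch > self.current_epoch + self.max_lookahead:
            return False
        queue = self.queues.setdefault(epoch, deque(maxlen=self.max_per_epoch))
        queue.append(message)  # a full deque silently drops its oldest entry
        return True

    def advance_to(self, epoch: int) -> None:
        # Move forward and free everything that belongs to past epochs.
        self.current_epoch = max(self.current_epoch, epoch)
        for old in [e for e in self.queues if e < self.current_epoch]:
            del self.queues[old]

    def drain_current(self) -> list:
        return list(self.queues.pop(self.current_epoch, deque()))


queue = EpochMessageQueue(max_per_epoch=2, max_lookahead=1)
```

Dropping the oldest message when an epoch's buffer is full is a design choice made for the sketch; rejecting the newest message instead would be equally valid.
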
### Consensus
- [ ] https://github.com/near/mpc-recovery/issues/330
- [ ] https://github.com/near/mpc-recovery/issues/353
- [ ] Enforce a minimum and maximum threshold (and hence the size of the set, as it is presumably 150% of the threshold). The minimum is needed to prevent a small set from taking over the entire MPC. The maximum is needed to prevent performance issues.
- [ ] (Optional). Consider a backup network (suggested by Michel, no elaboration)
- [ ] (Optional). Decide if we need some sort of slashing mechanism for misbehaving nodes
- [ ] https://github.com/near/mpc-recovery/issues/389
- [ ] https://github.com/near/mpc-recovery/issues/424
- [ ] (Optional). Recognize that someone else has given up on the protocol and restarted it. This probably makes sense only for specific types of protocols: generating, resharing (maybe something else?). The major issue here is that there might not be an easy way to distinguish new protocol messages from old protocol messages. This means one of the nodes might wrongly conclude that the protocol is still alive due to race conditions. One way to combat this is to attach a random GenerationId and ReshareId to all protocols.
- [ ] (Optional). Allow rolling back from `Resharing` to `Running` if something fatal happened during the resharing phase (e.g. one of the nodes is not participating and is blocking us from making progress). Then the set of new joiners is wiped and every new joiner has to sign up from scratch.
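
The GenerationId/ReshareId idea from the list above can be sketched as follows. `ProtocolMessage` and `ProtocolRun` are hypothetical types, the id is a random 128-bit hex string, and how the nodes agree on a fresh id after a restart is outside the scope of the sketch.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolMessage:
    protocol_id: str  # hypothetical GenerationId or ReshareId
    sender: str
    body: bytes


class ProtocolRun:
    """One run of a generation or resharing protocol, identified by a random id."""

    def __init__(self) -> None:
        self.protocol_id = secrets.token_hex(16)
        self.inbox = []

    def restart(self) -> str:
        # A fresh id makes messages from the abandoned run distinguishable.
        self.protocol_id = secrets.token_hex(16)
        self.inbox.clear()
        return self.protocol_id

    def receive(self, message: ProtocolMessage) -> bool:
        if message.protocol_id != self.protocol_id:
            return False  # belongs to an old (or unknown) run, ignore it
        self.inbox.append(message)
        return True


run = ProtocolRun()
old_id = run.protocol_id
accepted_before = run.receive(ProtocolMessage(old_id, "node-1", b"round-1"))
new_id = run.restart()
accepted_stale = run.receive(ProtocolMessage(old_id, "node-2", b"round-1"))
accepted_fresh = run.receive(ProtocolMessage(new_id, "node-2", b"round-1"))
```

A late message from the abandoned run is rejected by the id check, so it cannot make a node believe the old protocol is still alive.
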
### API (no yield/resume option)
- [ ] https://github.com/near/mpc-recovery/issues/346
- [ ] (Optional). Migrate to standalone independent indexer for each multichain node. See #346 for
- [ ] Finish Bowen's self-call improvement (see https://github.com/near/mpc-recovery/pull/401). It is a direct improvement over what we have right now. You are not forced to pay gas for self calls; you can always just index respond txs as with the current approach. But with it you get the option to pay up to 300 TGas to potentially get a sequential response. This also simplifies integration tests, as you don't have to run an indexer there.
### API (yield/resume option)
- [ ] Wait until https://github.com/near/NEPs/pull/519 is done
- [ ] Rewrite MPC contract to use yield/resume
### Infra
- [ ] https://github.com/near/mpc-recovery/issues/383
- [ ] https://github.com/near/mpc-recovery/issues/426
- [ ] https://github.com/near/mpc-recovery/issues/427
- [ ] https://github.com/near/mpc-recovery/issues/435
- [ ] https://github.com/near/mpc-recovery/issues/434
- [x] https://github.com/near/mpc-recovery/issues/428
- [ ] https://github.com/near/mpc-recovery/issues/430
- [ ] Enforce a tracing style that we should use uniformly, we can base it off of [this](https://github.com/near/nearcore/blob/master/docs/practices/style.md#tracing)
- [ ] https://github.com/near/mpc-recovery/issues/327
- [ ] Add the ability to start a local test env (similar to the one we had in the old design)
- [ ] Make dev env index starting from a recent block height
- [ ] Keep track of the last block height processed by the indexer
- [ ] https://github.com/near/mpc-recovery/issues/425
- [ ] https://github.com/near/mpc-recovery/issues/419
- [ ] https://github.com/near/mpc-recovery/issues/420
- [ ] https://github.com/near/mpc-recovery/issues/431
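
For the two indexer items above (starting from a recent block height and tracking the last processed height), a minimal sketch of a file-based checkpoint could look like this. The `BlockCheckpoint` class, the JSON layout and the file location are assumptions for illustration, not the indexer's actual storage.

```python
import json
import os
import tempfile
from pathlib import Path


class BlockCheckpoint:
    """Persists the last block height processed by an indexer."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load_start_height(self, default_start: int) -> int:
        # Resume right after the last processed block, or fall back to a
        # caller-supplied recent height when no checkpoint exists yet.
        try:
            return json.loads(self.path.read_text())["last_processed"] + 1
        except FileNotFoundError:
            return default_start

    def save(self, height: int) -> None:
        # Write to a temporary file and rename it into place, so a crash
        # cannot leave a half-written checkpoint behind.
        fd, tmp_name = tempfile.mkstemp(dir=self.path.parent)
        with os.fdopen(fd, "w") as tmp_file:
            json.dump({"last_processed": height}, tmp_file)
        os.replace(tmp_name, self.path)


with tempfile.TemporaryDirectory() as directory:
    checkpoint = BlockCheckpoint(Path(directory) / "checkpoint.json")
    first_start = checkpoint.load_start_height(default_start=100)
    checkpoint.save(105)
    resumed_start = checkpoint.load_start_height(default_start=100)
```
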
### Low priority bugs/performance improvements
- [ ] https://github.com/near/mpc-recovery/issues/433

volovyks commented 1 week ago

stale