diem / diem

Diem’s mission is to build a trusted and innovative financial network that empowers people and businesses around the world.
https://diem.com
Apache License 2.0

A Minimal Trusted Computing Base (TCB) #7930

Closed by JoshLind 3 years ago

JoshLind commented 3 years ago

A Minimal Trusted Computing Base (TCB)

Authors: Joshua Lind (@JoshLind), David Wong (@mimoo)

Status: Draft

1. Goals of this Document:

2. Preliminary Reading:

TCB Overview

Securing TCBs

3. Assumptions and Validator Component Abstraction (VCA)

To reason about the Diem TCB, we first make several assumptions about validators and their components in a blockchain.

Assumptions about validators:

Note: We consider it future work to challenge these assumptions (see the bottom of this document).

Assumed components in a validator:

Next, we assume a simple validator component abstraction (VCA):

4. Security Formalization

In order to analyze the security benefits of a TCB, we propose the following (informal) security definitions:

Types of compromise:

Types of security impact:

The Adversary model: Consensus assumes that f validators are Byzantine and colluding (i.e., completely compromised). We therefore consider the TCB interesting if it still provides security properties when h additional compromises (shallow or deep) occur. We consider two adversary models:

Types of Attacks: We consider three high-level types of attacks:

5. The Incremental TCB Straw man:

To begin reasoning about the TCB in Diem, we take a step-by-step approach to building a TCB based on the VCA above. For each step, we reason about the security guarantees of the design.

Step 1: TCB = { Consensus key }

To begin, we move only the consensus key into the TCB and propose that consensus asks the TCB to sign data (e.g., votes). Reasoning about security, we see:

Step 2: TCB = { Consensus key + Safety Rules }

To improve on step 1, we focus on hardening the validator against safety attacks. To do this, we partition consensus and move a subset of the consensus module, which we label safety rules, into the TCB. Safety rules contains a set of verification constraints that, when enforced by enough validators (>= 2f+1), prevent forks in the consensus protocol (see the Voting Rules in the Consensus specification). Reasoning about security, we now see:
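As a rough illustration of this partition, the sketch below shows a TCB that holds the consensus key and only signs a vote after its own checks pass. Everything in it is an assumption made for illustration: the type names (`Proposal`, `SafetyRules`, `VoteError`) are hypothetical, the two checks are a simplified stand-in for the Voting Rules in the Consensus specification, and the "signature" is a placeholder with no real cryptography.

```rust
/// Hypothetical, simplified proposal: only the fields the checks below need.
#[derive(Debug, Clone, Copy)]
pub struct Proposal {
    /// Round of the proposed block.
    pub round: u64,
    /// Round of the block certified by the quorum certificate in the proposal.
    pub certified_round: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum VoteError {
    RoundNotIncreasing,
    BelowPreferredRound,
}

/// Placeholder signature; NOT real cryptography.
#[derive(Debug, PartialEq, Eq)]
pub struct Signature(pub u64);

/// State that lives inside the TCB. The key field is private, so untrusted
/// consensus code can only obtain signatures through `sign_vote`.
pub struct SafetyRules {
    consensus_key: u64, // stand-in for a private signing key
    last_voted_round: u64,
    preferred_round: u64,
}

impl SafetyRules {
    pub fn new(consensus_key: u64) -> Self {
        Self { consensus_key, last_voted_round: 0, preferred_round: 0 }
    }

    /// Sign a vote only if the (simplified, illustrative) rules hold.
    pub fn sign_vote(&mut self, p: Proposal) -> Result<Signature, VoteError> {
        // Rule 1 (assumed): never vote twice in a round or go backwards.
        if p.round <= self.last_voted_round {
            return Err(VoteError::RoundNotIncreasing);
        }
        // Rule 2 (assumed): the proposal must extend a sufficiently recent
        // certified block.
        if p.certified_round < self.preferred_round {
            return Err(VoteError::BelowPreferredRound);
        }
        // Record the vote before releasing a signature.
        self.last_voted_round = p.round;
        self.preferred_round = self.preferred_round.max(p.certified_round);
        Ok(Signature(self.consensus_key ^ p.round))
    }
}
```

With this shape, a shallow compromise of the untrusted consensus code can request signatures but cannot obtain one for a proposal that fails the checks, for example a second vote in a round it has already voted in.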

Step 3: TCB = { Consensus key + Safety Rules + Execution }

To prevent attacks on correctness (as seen in step 2 above), we need to ensure that shallow compromises cannot enable voting on proposals that arbitrarily extend state. To achieve this, we observe that one can move the execution logic (including the Move VM) into the TCB, which enforces correct execution of transactions. However, one still needs to ensure that execution extends the correct state. One option is to move storage into the TCB as well, but this is naive because it bloats the TCB. Instead, it is more beneficial to treat storage as untrusted and have execution keep track of valid state root hashes, updating them within the TCB. We call this approach execution correctness.
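The idea of keeping only a state root inside the TCB can be sketched as follows. This is a toy model under stated assumptions, not Diem's implementation: the names (`ExecutionCorrectness`, `Credit`, `state_root`) are hypothetical, the "state" is a small map that is rehashed in full, and `DefaultHasher` is a non-cryptographic stand-in for a real state root hash. A realistic design would more likely verify Merkle proofs for only the accessed state, using a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Toy state: account name -> balance.
pub type State = BTreeMap<String, u64>;

/// Stand-in for a state root hash. DefaultHasher is NOT cryptographic.
pub fn state_root(state: &State) -> u64 {
    let mut hasher = DefaultHasher::new();
    state.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical transaction: credit `amount` to `account`.
pub struct Credit {
    pub account: String,
    pub amount: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ExecError {
    StateRootMismatch,
    Overflow,
}

/// The only state kept inside the TCB: the root of the last state it produced.
pub struct ExecutionCorrectness {
    trusted_root: u64,
}

impl ExecutionCorrectness {
    pub fn new(genesis: &State) -> Self {
        Self { trusted_root: state_root(genesis) }
    }

    pub fn trusted_root(&self) -> u64 {
        self.trusted_root
    }

    /// `claimed` is read from untrusted storage. It is accepted only if it
    /// hashes to the root the TCB already trusts; the transaction is then
    /// applied and the trusted root is updated inside the TCB.
    pub fn execute(&mut self, claimed: &State, txn: &Credit) -> Result<State, ExecError> {
        if state_root(claimed) != self.trusted_root {
            return Err(ExecError::StateRootMismatch);
        }
        let mut next = claimed.clone();
        let balance = next.entry(txn.account.clone()).or_insert(0);
        *balance = balance.checked_add(txn.amount).ok_or(ExecError::Overflow)?;
        self.trusted_root = state_root(&next);
        // The new state is handed back for untrusted storage to persist.
        Ok(next)
    }
}
```

In this model, storage can lose or tamper with data, but any state it returns that does not match the trusted root is rejected, so a compromised storage component can stall execution without being able to make it extend an incorrect state.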

We now reason about the security of this approach:

6. The Existing TCB (v1)

Today, execution correctness is still a work in progress and not part of the TCB. As such, under shallow compromises the TCB defends against everything but correctness attacks (see step 2 of the TCB straw man). In this section, we look at various implementation details of the TCB as it stands today:

7. Proposal & Path Forward (TCB v2)

Based on the observations above, we outline the following design and implementation improvements for the TCB (v2):

Design Improvements:

Implementation Improvements:

8. Future Explorations for the TCB

The list below contains future explorations for the TCB. Each of these requires additional thought and analysis.

aching commented 3 years ago

Thanks for putting this together, it is really well thought out. A few comments:

What needs to be considered when securing a TCB

I would also add thoughts on applying more scrutiny to changes (e.g., more reviewers and/or a small set of highly knowledgeable reviewers).

The Adversary model:

Some insight for the reader on why h<=f and h>f matter would be helpful here - this has to do with the BFT properties of 2f+1. I find Quorum to be intuitive, but Majority is somewhat strange since it might not be an actual majority. Naming is hard, but the terms are just <Quorum and >=Quorum IIUC.

5. The Incremental TCB Straw man:

Strawperson?

Move execution correctness into the TCB (or otherwise verify execution):

Even if we move toward a world where most validators are not re-executing transactions, I expect there still will be a few that do - possibly run by the Diem Association or its members internally to verify execution. In any case, if it provides a meaningful tradeoff of performance vs security, it is worth considering to just run 1 or more privately to detect these execution issues.

Given that TEEs/secure hardware are a ways off, is there an intermediate recommendation? For instance, is there much value in the current implementation, or should we integrate directly and then rethink once a secure implementation is available?

Always export the consensus key to safety rules:

Is the performance gain significant? Leaking the consensus key seems worse than leaking control of the key.

Remove the execution correctness key and allow consensus & execution to communicate directly:

At first glance, this seems to conflict with "Move execution correctness into the TCB (or otherwise verify execution):". Is this an intermediate proposal as I mentioned earlier?

Analyze/Audit the interface between execution correctness & storage (ensure it is untrusted):

+1

Btw, any thoughts about making this a markdown file in the repo itself? We could perhaps add a rationale folder in diem/documentation, which would make it easier to review the content here. =)

JoshLind commented 3 years ago

I've been informed that this is the wrong location for this issue and that it should instead be added to the dip repository. Moving it there and closing this issue (https://github.com/diem/dip/issues/146).

@aching, I'll copy and paste your comments there when I can review and respond to them 😄