Table of Contents
Overview
1. Dafny
2. Semantics Example
Getting Started
Verifying Bytecode
Building the Code
1. Java target
  1. Test Generation
Contributing
Resources

Overview

In this project we develop the Dafny-EVM, a functional specification of the Ethereum Virtual Machine in Dafny.

This type of specification has several advantages:

it is programming-language agnostic and easily readable: it does not require any prior knowledge of a specific programming language, but rather defines the semantics of the EVM as functions and compositions thereof. Read more
it is executable: we can run EVM bytecode, and in effect we have an interpreter of EVM bytecode. Read more
it is verified. We guarantee that our EVM interpreter is free of runtime errors (e.g. division by zero, arithmetic under/overflow). Read more
it provides a usable API for formal verification of EVM bytecode. Read more

Developing this specification in Dafny allows us to apply formal reasoning to Smart Contracts at the EVM Bytecode level. For example, one can prove that certain key properties are maintained by the contract. We choose Dafny over other verification systems (e.g. Coq or Isabelle/HOL) because it is relatively accessible to someone without significant prior experience.

Our functional specification is executable, meaning that we can run bytecode using it and compare their output with existing clients (e.g. Geth). In particular, we are interested in comparing against the Ethereum Common Reference Tests and have made some progress towards this.

Dafny

Dafny supports automated software verification by leveraging the power of state-of-the-art automated theorem provers (e.g. with SMT solvers like Z3). This means Dafny can prove a program is correct with respect to its specification. To do this, Dafny requires the developer to provide annotations in the form of preconditions and postconditions where appropriate, and/or loop invariants as necessary.

Semantics Example

Our semantics is written as a state transformer of type State -> State.

As a simple example, consider the following specification given for the semantics of the ADD bytecode:

/**
 * Unsigned integer addition with modulo arithmetic.
 * @param   st  A state.
 * @returns     The state after executing an `ADD` or an `Error` state.
 */
function method Add(st: State): (st': State)
    requires st.IsExecuting() 
    ensures st'.OK? || st' == INVALID(STACK_UNDERFLOW)
    ensures st'.OK? <==> st.Operands() >= 2
    ensures st'.OK? ==> st'.Operands() == st.Operands() - 1
{
    if st.Operands() >= 2
    then
        var lhs := st.Peek(0) as int;
        var rhs := st.Peek(1) as int;
        var res := (lhs + rhs) % TWO_256;
        st.Pop().Pop().Push(res as u256).Next()
    else
        State.INVALID(STACK_UNDERFLOW)
}

This tells us that ADD requires two operands on the stack to be performed, otherwise, the exceptional state INVALID(STACK_UNDERFLOW) state is reached.
When more than two operands are on the stack, addition employs modulo arithmetic (hence, overflows wrap around) and the final result (of the addition modulo) is pushed onto the stack after the operands are popped, and then the program counter is advanced by 1.

The postcondition ensures st'.OK? <==> st.Operands() >= 2 specifies a strong guarantee on the code in the body of function: for any input state st, Add returns an OK state (non-failure) if and only if the stack in the input state st has at least two elements (Operands()). Note that this postcondition is checked by the Dafny verification engine at compile-time not at runtime.

Getting Started

To use our code base you may follow these steps:

Install a recent version of Dafny. We recommend installing the VSCode Dafny extension as it bundles the editor interface (syntax colouring, error reporting, etc) and the Dafny compiler code.
Clone this repository.
Build the code (see below) or start with this introductory material.

Verifying Bytecode

Our EVM is written in Dafny. As a result we can instrument bytecode with some reasoning features. Some examples are given in the verification examples document.

Building the Code

Dafny has to be translated to a target language to be run and to access functionality not included natively in Dafny. We have currently 2 target languages: Go and Java.

Java target

The Java target uses the gradle build system. To build the code, you need the following components:

Java 11 (or greater)
Dafny 3.10 (or greater).
Gradle 7 (or greater)

With these installed, you can build the code using the following command:

> gradle build

This will verify the codebase using Dafny along with some examples, generate a Java implementation of the EVM, and run two test suites against it in Java.

Test Generation

As the main purpose of our EVM is to reason about bytecode, we may want to have some guarantees that the proofs we develop are also valid on other EVM implementations: if the same code is run on another implementation then the guarantees (e.g. no stack under/overflow) that we obtain using our automated reasoning and our EVM are still valid. This requires to prove that the other implementation produces exactly the same computations as our EVM on all inputs and for all programs. It is not practical to formally prove this kind of equivalence.

However we can compare the results of the execution of some bytecode on different implementations. If for a large number of tests two implementations give the same results (sequences of states), we have some confidence that the two implementations are equivalent. If our EVM yields the same results as, say the Geth's evm tools, then we can be confident that our proofs on the bytecode should be valid on the Geth EVM too.

The test cases used for the Dafny EVM are stored in the tests/ directory. These are generated from the Ethereum Consensus Tests using Geth's evm tool. Each test is a json file similar in structure to that used by the Ethereum Consensus Tests, except that they include full trace data (i.e. the state of the EVM after every execution step).

To regenerate the trace tests, you need to ensure the fixtures submodule is updated appropriately. If you originally employed git clone --recursive when cloning the repository, then you don't need to do anything. Otherwise, you can do this:

git submodule update --init

Using gradle one can now regenerate all the trace tests as follows:

> gradle testgen

This can take several minutes to complete, and requires that Geth's evm tool is installed and visible on PATH (we currently recommend version 1.10.25 or later). Furthermore, the test generation process is governed by the files tests/includes.txt and tests/excludes.txt. The former determines which of the reference tests should be included, whilst the latter identifies specific cases to exclude. Finally, the trace generation process is managed by the EvmTools framework.

Contributing

See the CONTRIBUTORS file for more information on contributing to this repository. By default contributors accept the terms of the license. We also endeavour to follow the conventions of the Dafny style guide.

Resources

The architecture of our Dafny-EVM: ARCHITECTURE.md

Some useful links:

the Berlin version of the yellow paper (YP)
An complete Introduction to the EVM, Ethereum foundation
A Tutorial on the YP specification, (Ethereum foundation)
the K-framework EVM semantics KEVM, jellopaper
the Main EVM semantics in Jello paper (K framework)
A Quick reference to EVM opcodes
An Interactive reference to EVM opcodes
The Yul intermediate representation Yul documentation
Another proposal Yul+

Consensys / evm-dafny

readme

Table of Contents