I worked on translating Ergo smart contracts to Webassembly (Wasm) as part of Google Summer of Code (GSoC) 2020. You can find an overview of my project in PR #777 on the Ergo repo. TL;DR: Ergo adds a smart contract language frontend to the Q*cert compiler. I extend Q*cert with a proof-of-concept Webassembly backend.
This PR constitutes my GSoC project submission. It's meant to be frozen on the GSoC deadline on 8/31/2020. Future edits should go to Jerome's working PR #142.
In the following, I will try to wrap up some technical aspects of my implementation.
Overview
The Wasm specification contains a reference implementation in OCaml. I implement a translation from Imp(Ejson) to the Wasm reference Ast. We also hook up the reference interpreter as Q*Cert evaluation function for testing. All my code is OCaml and not verified. Jerome did some plumbing on the Coq side.
Preliminaries
We add the opam dependency wasm.1.0.1.
We switch to separate extraction to circumvent cyclic dependencies between Coq and OCaml.
Code generation and evaluation
Code generation and evaluation is implemented in /compiler/wasm/. The main component is the functor in wasm_backend.ml which is instantiated in wasm_ast.ml. The latter provides all functions necessary to extend Q*cert with a new Wasm_ast language target. The functorization is needed since Ergo's extracts its own Imp and OCaml cannot infer their equality.
The translation functions in wasm_backend.ml rely on an incomplete intermediate representation for Wasm (wasm_ir.mli). In Wasm, functions, types, and variables are addressed by their integer index within the module. The IR replaces these integers with OCaml variables for more convenient addressing. It also provides concise constructors for Wasm AST elements.
Another convenience module is wasm_util.ml. It contains a small lookup table implementation based on Hashtable which is used all over the place to get from OCaml values to integer indexes.
In contrast to Imp's block-scoped variables, Wasm has only function-scoped variables. The translation in wasm_backend.ml does not take this into account and relies on unique variable names. We pre-process the Imp using the translation in wasm_imp_scoping.ml in order to resolve this conflict (see #145).
I implement a binary encoding for EJson values in wasm_binary_ejson.ml. This encoding is used for storing constants in the linear memory of the generated Wasm module. It's also used at runtime for communication between the interpreter and the runtime module (see Ergo PR #777). I briefly describe the encoding in the header of source file.
Runtime
The generated wasm module relies on a Wasm runtime for execution (see Ergo PR #777). This runtime is implemented in Assemblyscript (NPM ecosystem) and lives in /runtimes/assemblyscript/. We build the runtime with npm run asbuild:untouched and run some tests with npm test. The Assemblyscript source resides in assembly/index.ts, the compiled Wasm in build/untouched.wasm.
We add a dune rule that encodes the binary runtime build/untouched.wasm into an wasm_runtime.ml file and bundles it into wasm_backend.ml.
Tests
We provide some rudimentary tests under /tests/imp_ejson/ which can be run all at once using the command make -C tests wasm-imp-spec. The tests evaluate pieces of Imp(Ejson) on a given input in order to produce the expected output. Then the Imp code gets translated to Wasm and executed with the same input on the Wasm reference interpreter. The resulting output is compared with the expected output (should be equal).
Issues
We do not support all Imp operators (#133).
It might be useful to verify some aspects of the new backend (#146).
We should better integrate the new tests with the existing tests.
After supporting all operators, we should run all existing tests on the Wasm backend.
We have to double check that the operators are implemented in a functional way (no mutation) (#148).
I worked on translating Ergo smart contracts to Webassembly (Wasm) as part of Google Summer of Code (GSoC) 2020. You can find an overview of my project in PR #777 on the Ergo repo. TL;DR: Ergo adds a smart contract language frontend to the Q*cert compiler. I extend Q*cert with a proof-of-concept Webassembly backend.
This PR constitutes my GSoC project submission. It's meant to be frozen on the GSoC deadline on 8/31/2020. Future edits should go to Jerome's working PR #142.
In the following, I will try to wrap up some technical aspects of my implementation.
Overview
The Wasm specification contains a reference implementation in OCaml. I implement a translation from Imp(Ejson) to the Wasm reference Ast. We also hook up the reference interpreter as Q*Cert evaluation function for testing. All my code is OCaml and not verified. Jerome did some plumbing on the Coq side.
Preliminaries
wasm.1.0.1
.Code generation and evaluation
Code generation and evaluation is implemented in
/compiler/wasm/
. The main component is the functor inwasm_backend.ml
which is instantiated inwasm_ast.ml
. The latter provides all functions necessary to extend Q*cert with a newWasm_ast
language target. The functorization is needed since Ergo's extracts its own Imp and OCaml cannot infer their equality.The translation functions in
wasm_backend.ml
rely on an incomplete intermediate representation for Wasm (wasm_ir.mli
). In Wasm, functions, types, and variables are addressed by their integer index within the module. The IR replaces these integers with OCaml variables for more convenient addressing. It also provides concise constructors for Wasm AST elements.Another convenience module is
wasm_util.ml
. It contains a small lookup table implementation based onHashtable
which is used all over the place to get from OCaml values to integer indexes.In contrast to Imp's block-scoped variables, Wasm has only function-scoped variables. The translation in
wasm_backend.ml
does not take this into account and relies on unique variable names. We pre-process the Imp using the translation inwasm_imp_scoping.ml
in order to resolve this conflict (see #145).I implement a binary encoding for EJson values in
wasm_binary_ejson.ml
. This encoding is used for storing constants in the linear memory of the generated Wasm module. It's also used at runtime for communication between the interpreter and the runtime module (see Ergo PR #777). I briefly describe the encoding in the header of source file.Runtime
The generated wasm module relies on a Wasm runtime for execution (see Ergo PR #777). This runtime is implemented in Assemblyscript (NPM ecosystem) and lives in
/runtimes/assemblyscript/
. We build the runtime withnpm run asbuild:untouched
and run some tests withnpm test
. The Assemblyscript source resides inassembly/index.ts
, the compiled Wasm inbuild/untouched.wasm
.We add a dune rule that encodes the binary runtime
build/untouched.wasm
into anwasm_runtime.ml
file and bundles it intowasm_backend.ml
.Tests
We provide some rudimentary tests under
/tests/imp_ejson/
which can be run all at once using the commandmake -C tests wasm-imp-spec
. The tests evaluate pieces of Imp(Ejson) on a given input in order to produce the expected output. Then the Imp code gets translated to Wasm and executed with the same input on the Wasm reference interpreter. The resulting output is compared with the expected output (should be equal).Issues