CosmWasm / wasmd

Basic cosmos-sdk app with web assembly smart contracts

How to work with source and builder? #77

Closed webmaster128 closed 4 years ago

webmaster128 commented 4 years ago

A code's source and builder are "a valid URI reference to the contract's source code" and "a docker tag", but what does that mean semantically? What's the purpose? How are they used?

Some issues I see here are:

  1. As long as there is no standardized way to flatten a contract into a single file, this cannot be a machine-processable URI. A link like https://github.com/confio/cosmwasm-examples/tree/f87f7a3d6d7ec1abd77a4b4bc2388799a4fe4c26/erc20 is nice for human consumption, but does not allow for reproducible builds (assuming we're not planning to restrict sources to a single code host). One could require an archive (.tar/.zip) URL instead. But is that the way forward?
  2. A docker tag identifies neither (a) the registry hosting the image nor (b) the repository user.
  3. A docker tag does not say how to invoke a reproducible build. The invocation will probably differ slightly when optimizers other than cosmwasm-opt are used, e.g. when compiling a different language.

So right now the fields don't seem to be sufficiently complete for machine consumption. Are they just hints to the user? If so, why is there a builder format check?


Side note: source is documented to be a URI but checked to be an absolute URL.

ethanfrey commented 4 years ago

These definitely need to evolve somehow, along with tools to consume them. Here are my current ideas, for Rust code. For other languages, we would need other conventions. And we should provide a verifier program that can process a number of conventions, leaving the format as an agreement between the code uploader and the verifying agent, where the blockchain only enforces some sanity checks.

For Rust, I would propose:

  1. source is always a valid crates.io identifier, including version, like https://crates.io/crates/cw-erc20/0.1.0. A published version cannot be modified, so this becomes an immutable identifier (barring a compromise of crates.io). There are tools that download the source code given a package name and version, like cargo-download.
  2. builder should always be a reference to a docker image. For the Rust verifier, we can assert that it is cosmwasm-opt:x.y.z, where x.y.z > 0.6.0 or something like that. Different versions of cosmwasm-opt may produce different wasm binaries (for example because they ship a newer rustc), so it is important that the version can be specified.

In the end, the idea was to record the uploader's claim about which source code is behind the wasm, plus enough information for someone to verify that claim. Of course, this only works if both sides follow the same conventions and know the docker image. My idea was to script a workflow that takes the CodeInfo, downloads the source via cargo-download or similar, runs docker run cosmwasm-opt [lot of flags], and checks the output in hash.txt. If that matches the hash in the CodeInfo, we have validated that this is the source code behind the wasm. If it differs, we still know nothing about the source code.

Either everyone does this themselves, or we provide a way for people to publish claims that they validated the source code. If enough trusted people say so, we can go straight to browsing the source, ideally through some web UI. For making such claims about the validity of the hash, my idea was to reuse the cargo-crev code review tooling that I promote in the docs.

These are my current thoughts. They should definitely be refined and documented; then we can work on some tooling for it.

ethanfrey commented 4 years ago

As to what is enforced on the server...

BuildTagRegex = "^cosmwasm-opt:"

I requested a check that the builder tag is a valid docker image reference, but got this (which was good enough for now, so I merged it). For a correct solution, we would need to investigate and test a better docker regexp. I am not sure what the proper regexp would be, but found this suggestion.
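To illustrate what a stricter check than the "^cosmwasm-opt:" prefix could look like, here is a minimal sketch in shell. The function name is_valid_builder and the pattern are assumptions made for this example: the pattern is a simplification that accepts lowercase path components followed by a mandatory tag, and it is neither Docker's full reference grammar (it rejects registry hosts with a port, for instance) nor what wasmd enforces.

```shell
# Hypothetical helper, not part of wasmd: simplified check for "[user/]image:tag".
is_valid_builder() {
  # Path component: lowercase alphanumerics, optionally joined by '.', '_' or '-'.
  local component='[a-z0-9]+([._-][a-z0-9]+)*'
  # Tag: up to 128 characters, must not start with '.' or '-'.
  local tag='[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}'
  printf '%s\n' "$1" | grep -Eq "^${component}(/${component})*:${tag}\$"
}
```

With this sketch, is_valid_builder "confio/cosmwasm-opt:0.6.2" succeeds, while a reference without a tag such as "cosmwasm-opt:" is rejected.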

Side note: source is documented to be an URI but checked to be an absolute URL.

Yes, it should be an absolute URL (docs need improvement), and we could enforce the protocol to be https
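As a rough sketch of what enforcing an absolute https URL means, the check below accepts only values that start with https:// and a host name. The function name is_https_url and the pattern are assumptions for illustration; the pattern is deliberately loose and is not a full URL parser, nor the validation wasmd performs.

```shell
# Hypothetical helper: accept only absolute https URLs (loose check, not RFC 3986).
is_https_url() {
  # Scheme, host, optional port, optional path without whitespace.
  local pattern='^https://[A-Za-z0-9.-]+(:[0-9]+)?(/[^[:space:]]*)?$'
  printf '%s\n' "$1" | grep -Eq "$pattern"
}
```

Under these assumptions, https://crates.io/crates/cw-erc20/0.1.0 passes, while an http:// URL or a relative reference like crates/cw-erc20 fails.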

webmaster128 commented 4 years ago

source is always a valid crates.io identifier, including version, like https://crates.io/crates/cw-erc20/0.1.0 You cannot modify these, so this becomes an immutable identifier (modulo hack at crates.io). There are scripts that will download the source code given a package name and version, like cargo-download

This can be generalized to optionally compressed tar archives, which you can download and extract from crates.io with

curl --location -sS https://crates.io/api/v1/crates/cw-erc20/0.1.0/download | tar -x

Or for GitHub (no subfolder links supported, only full repo)

curl --location -sS https://github.com/confio/cosmwasm-template/tarball/110818f | tar -x
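If source is stored as a crates.io page URL as proposed earlier, a verifier would first have to turn it into the download URL used above. A minimal sketch of that mapping, assuming the two URL shapes shown in this thread; the function name crate_download_url is made up for this example:

```shell
# Hypothetical helper: map https://crates.io/crates/<name>/<version>
# to the matching https://crates.io/api/v1/crates/<name>/<version>/download URL.
crate_download_url() {
  local download_url
  download_url=$(printf '%s\n' "$1" | sed -nE \
    's#^https://crates\.io/crates/([^/]+)/([^/]+)/?$#https://crates.io/api/v1/crates/\1/\2/download#p')
  if [ -z "$download_url" ]; then
    echo "not a versioned crates.io URL: $1" >&2
    return 1
  fi
  printf '%s\n' "$download_url"
}
```

For example, crate_download_url "https://crates.io/crates/cw-erc20/0.1.0" prints https://crates.io/api/v1/crates/cw-erc20/0.1.0/download, which can then be piped through curl and tar as above.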

ethanfrey commented 4 years ago

What is left is documenting this better somewhere. The code is cleaner now.

webmaster128 commented 4 years ago

I successfully verified a cw-nameservice .wasm file against its source in a semi-automated way with

#!/bin/bash
set -o errexit -o nounset -o pipefail
command -v shellcheck > /dev/null && shellcheck "$0"

SOURCE_URL="https://crates.io/api/v1/crates/cw-nameservice/0.1.0/download"
BUILDER_IMAGE="confio/cosmwasm-opt:0.6.2"

TMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/cosmwasm_verify.XXXXXXXXX")

(
  echo "Navigating into working directory $TMP_DIR ..."
  cd "$TMP_DIR"

  echo "Downloading and extracting $SOURCE_URL ..."
  wget -O - "$SOURCE_URL" | tar -x --strip-components 1

  echo "Files in working directory:"
  ls .

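  # Hex-encode the source URL so it is a valid docker volume name; each source gets its own build cache
  CACHE_KEY="cosmwasm_verify_cache_$(echo "$SOURCE_URL" | xxd -p -c 999999)"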
  docker run --rm \
    -v "$(pwd):/code" \
    --mount type=volume,source="$CACHE_KEY",target=/code/target \
    --mount type=volume,source=registry_cache,target=/usr/local/cargo/registry \
    "$BUILDER_IMAGE"

  cat hash.txt
)

With the server-side changes we have right now, this works for any

  1. URL pointing to an optionally compressed .tar
  2. Docker Hub image
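The script above only prints hash.txt, so the final comparison against the on-chain hash is still manual. That last step could be sketched as follows, assuming hash.txt contains sha256sum-style output ("<hex digest>  <file name>") on its first line; both that format and the function name hash_matches are assumptions of this example.

```shell
# Hypothetical helper: compare an expected checksum with the digest in hash.txt.
# Assumes the first whitespace-separated field of line 1 is the hex digest.
hash_matches() {
  local expected actual
  # Normalize both sides to lowercase so the comparison is case-insensitive.
  expected=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  actual=$(awk 'NR == 1 { print tolower($1) }' "$2")
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A call such as hash_matches "$EXPECTED_CHECKSUM" hash.txt (where EXPECTED_CHECKSUM would hold the hash taken from the CodeInfo) returns success only if the digests are equal, so it can gate the script's exit code.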

webmaster128 commented 4 years ago

This is done and tested. A builder-agnostic verifier is here: https://github.com/confio/cosmwasm-verify, including a bunch of documentation about usage and conventions.

ethanfrey commented 4 years ago

Very cool. That repo serves as living docs of how to use those fields