oasisprotocol / oasis-core

Performant and Confidentiality-Preserving Smart Contracts + Blockchains
https://oasisprotocol.org
Apache License 2.0

Add tooling/docs to easily reproduce Oasis Core builds #2596

Open · tjanez opened 4 years ago

tjanez commented 4 years ago
SUMMARY

We should add the necessary tooling and document how to reproduce Oasis Core builds (at the moment, the oasis-node binary).

This is also a prerequisite to be able to verify an Oasis Core build before signing it (#2461).

ISSUE TYPE
ADDITIONAL INFORMATION

Even though Go has great support for reproducible builds (version 1.13 added the -trimpath flag to the go build command), several conditions still need to be met before one can reproduce an Oasis Core build (a sketch of such a build invocation follows the list):

  1. The same Go compiler must be used.
  2. The external dependencies (currently libseccomp for oasis-node) must be the same.
  3. The git checkout from which the build is made must be the same (we use git to infer the version from previous tags and commits).
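
For illustration, a minimal sketch of the kind of build invocation this implies; the output name, package path and linker flag below are illustrative, not the project's actual Makefile target (Go 1.13+ assumed):

# Hypothetical sketch: -trimpath strips local filesystem paths and -buildid= clears
# the build ID, so the resulting binary does not embed machine-specific information.
go build -trimpath -ldflags "-buildid=" -o oasis-node ./oasis-node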

Probably the easiest way to achieve reproducible builds is to:

kostko commented 4 years ago

This should be expanded to include reproducible runtime builds (probably the same environment should be used for both) to ensure that you can reproduce the exact MRENCLAVE.

tjanez commented 4 years ago

Probably the easiest way to achieve reproducible builds is to:

  • use a well-known Linux container as the basis of a reproducible build environment,
  • install a particular official Go release inside the container,
  • install the distribution-provided libseccomp development headers inside the container,

After https://github.com/oasisprotocol/oasis-core/pull/2970, we will be rebuilding Docker images automatically, so it would make sense to create a minimal Docker image, e.g. oasisprotocol/oasis-core-build or oasisprotocol/oasis-core-reproducible, with just the tools necessary to reproducibly build all release artifacts (without the other development and testing tooling).

tjanez commented 4 years ago

Some more thoughts on the implementation:

When documenting what people need to do to reproduce our builds, the process should be as easy as obtaining the appropriate oasisprotocol/oasis-core-build:X.Y image, checking out the vX.Y git tag, and running a Make command.
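
A rough sketch of what that flow could look like for a verifier (the builder image name and tag follow the naming suggested above and do not exist yet; the mount point, make invocation and output path are illustrative):

# Hypothetical verification flow, assuming a published builder image and a tagged release.
git clone https://github.com/oasisprotocol/oasis-core.git
cd oasis-core
git checkout vX.Y
docker run --rm -v "$PWD":/src -w /src oasisprotocol/oasis-core-build:X.Y make
# Compare the rebuilt oasis-node binary against the official release artifact.
sha256sum go/oasis-node/oasis-node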

kostko commented 3 years ago

There is now a documented way to build ParaTimes in a reproducible manner. The same builder environment could be reused for Oasis Core as well.

sbellem commented 2 years ago

@tjanez @kostko I have been working on something that may be of interest to you with regards to reproducible builds, using nix. (Related to: https://github.com/oasisprotocol/cipher-paratime/issues/3, https://github.com/oasisprotocol/oasis-sdk/issues/136#issuecomment-841427061; /cc @nhynes.)

oasis-core-tools

There's a build file (called a nix flake) for oasis-core-tools under https://github.com/sbellem/oasis-core/blob/nix/flake.nix. You can see a sample CI reproducibility check at https://github.com/sbellem/oasis-core/runs/5417928561. Notice that the expected hash (48d12f80ff734c944d5c44b639069325e8e6b986d9c16c5b5cbae8a3e1eee319) is hardcoded, as the build is expected to be reproducible regardless of when it is built, as long as the source code does not change. Hence the scheduled cron job should always produce a binary that matches this hardcoded hash.

For instance, with nix installed and with flake support enabled, one could check the reproducibility locally by building oasis-core-tools and checking the hash of the binary against the expected value:

nix build github:sbellem/oasis-core/nix
# check hash of cargo-elf2sgxs
echo "48d12f80ff734c944d5c44b639069325e8e6b986d9c16c5b5cbae8a3e1eee319 *result/bin/cargo-elf2sgxs" | \
            shasum --algorithm 512256 --binary --check --strict

cipher-paratime

Likewise, there's a nix flake for cipher-paratime at https://github.com/sbellem/cipher-paratime/blob/nix/flake.nix. An example of a reproducibility check can be seen at https://github.com/sbellem/cipher-paratime/runs/5418635784. The provided flake (flake.nix) supports both SGX and non-SGX builds. The binaries can be built like so:

# with sgx support
nix build github:sbellem/cipher-paratime/nix#sgx
# check hash of cipher-paratime.sgxs
echo "547c506aee7c7ee53e85ce0842e840010979288c8350a0b9dc1d510d8b431abf *result/bin/cipher-paratime.sgxs" | \
            shasum -a 512256 -b -c --strict
# without sgx support
nix build github:sbellem/cipher-paratime/nix#nosgx
# check hash of cipher-paratime
echo "d4c77a86e0d1502313d70095f3badcd5a1499510f8131b0665dd2a93dbfd00b2 *result/bin/cipher-paratime" | \
            shasum --algorithm 512256 --binary --strict --check

There's a bit of documentation at https://github.com/sbellem/cipher-paratime/blob/nix/NIX_BUILD.md.

CURRENT PROBLEM: For some reason that currently escapes me, the resulting binaries, when built on GitHub CI infrastructure, yield different hashes than the ones I obtained on my laptop. I'm still investigating why that is so.

MRENCLAVE & Remote Attestation Verification

The direction I wish to go with this work is to use the nix-based reproducible builds to verify MRENCLAVEs in the context of remote attestation. I have yet to learn how the remote attestation flow works in the Oasis network. For experimental purposes, I'd like to see whether the nix-based builds can facilitate the work of an external user and/or auditor who wants to conveniently re-build enclaves and check their MRENCLAVEs against remote attestation reports.

kostko commented 2 years ago

CURRENT PROBLEM: For some reason that currently escapes me, the resulting binaries, when built on GitHub CI infrastructure, yield different hashes than the ones I obtained on my laptop. I'm still investigating why that is so.

We are using a specific Docker container to build paratimes:

This container is also used in CI when building images. You can take a look at the exact versions of all components that need to be used (unfortunately not all dependencies are currently pinned, but your approach may actually allow that):

We could consider updating the runtime-builder environment to use what you suggest. Thoughts?

Yawning commented 2 years ago

We could consider updating the runtime-builder environment to use what you suggest. Thoughts?

I'd rather not, unless our current method doesn't work (and as far as I know it does). Maybe if all this nix stuff was exiled to inside a docker container, it would be ok, though adding even more complexity and dependencies to the build environment doesn't feel right to me.

What are the benefits of this over what we have currently?

As a side note (and so I'm not just negative):

I have yet to learn how the remote attestation flow works in the Oasis network.

At a high level, it is not that complicated. The runtime descriptor contains a list of allowed MRSIGNER/MRENCLAVE pairs and attestation quote statuses, on a per-runtime-version basis, which is cross-checked with the AVR (Attestation Verification Report) from IAS (the Intel Attestation Service).

IMO there isn't really all that much to verify here. If the binary is identical, MRENCLAVE will be identical (by definition, since it is the SHA256 digest of the enclave binary). Since people doing this will need to get an official build anyway (because they need the signature), just doing a byte-for-byte comparison of the executable is sufficient. Maximum tinfoil hattery could retrieve the runtime descriptor from the on-chain registry and see if the MRENCLAVE matches what is published for that version.
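
A minimal sketch of such a check (the file names below are placeholders; fetching the expected MRENCLAVE from the on-chain registry is left out):

# Compare a locally reproduced enclave against the official release artifact.
cmp --silent official/runtime.sgxs rebuilt/runtime.sgxs \
    && echo "byte-for-byte identical" \
    || echo "binaries differ"
# For Fortanix-style .sgxs enclaves, the SHA-256 of the file is the MRENCLAVE,
# so the same check can also be expressed as a hash comparison.
sha256sum official/runtime.sgxs rebuilt/runtime.sgxs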

sbellem commented 2 years ago

Thanks @kostko and @Yawning for the feedback! Sorry for the delay in getting back. I'll give you detailed answers a bit later, but for now, the gist, based on what I know, is that reproducible (bit-for-bit) builds may be easier to achieve with nix than with docker (if they are achievable with docker at all). (As a side note: it's possible to run nix in a docker container.)

One of the challenges when using docker is that when re-building the image, you may pick up new versions of dependencies, or new versions of their transitive dependencies.
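
One concrete source of this drift is the base image itself. As an illustration (the image tag below is just an example), pulling by tag is mutable, while pinning by content digest is not:

# A tag such as ubuntu:20.04 is mutable: re-pulling it later may fetch a newer image.
docker pull ubuntu:20.04
# Record the immutable content digest of what was actually pulled ...
docker image inspect --format '{{index .RepoDigests 0}}' ubuntu:20.04
# ... and pull by that digest (ubuntu@sha256:...) in the future to always get the
# exact same base image.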

I'm not familiar with how Fortanix computes the MRENCLAVE. When using the linux-sgx C++ SDK, the MRENCLAVE does not correspond to the sha256sum of the binary, which does seem to be the case for enclaves built with Fortanix.

Are you interested in the case in which an auditing party would want to verify whether the official build corresponds to the claimed source code? Without being able to re-build from source, one is left with having to trust someone for the expected hash, unless I misunderstand something.

Thanks again for the feedback, and I can get back with more details.

kostko commented 2 years ago

Yes, that's what I meant when I said the current process that simply uses a Docker container is not ideal (e.g. some deps aren't pinned).

My suggestion was to have this nix-based build inside a Docker container so that people who want to build don't need to deal with nix on their system.

Yawning commented 2 years ago

Yes, that's what I meant when I said the current process that simply uses a Docker container is not ideal (e.g. some deps aren't pinned).

Why are this and our other build reproducibility issues closed then? Are there bugs for fixing this and actually making things reproducible? Deterministic builds should be a p:0 issue...

My suggestion was to have this nix-based build inside a Docker container so that people who want to build don't need to deal with nix on their system.

After what happened the last time I had to use this, I want nix nowhere near my system, even if they claim that things have gotten better. In nix's defense, I also want docker/podman/npm nowhere near my system either.

Are you interested in the case in which an auditing party would want to verify whether the official build corresponds to the claimed source code? Without being able to re-build from source, one is left with having to trust someone for the expected hash, unless I misunderstand something.

I mean...

The point of the reproducible builds (and why I pushed for them since the start of the project) is that anyone can verify official builds by producing identical ones, since proving binary equivalence is a substantially harder problem.

What I am saying is that, given that builds are reproducible and deterministic, there is no additional handling required for MRENCLAVE because regardless of what SDK is being used, if the executable that anyone can produce is byte-for-byte identical to the official one, MRENCLAVE will be the same.

Ultimately, as long as I never have to interact with it, and I don't need to install "Yet another package manager", I'm moderately indifferent about how deterministic builds are achieved. From skimming your branches, this appears to be the case.

kostko commented 2 years ago

Why are this and our other build reproducibility issues closed then? Are there bugs for fixing this and actually making things reproducible?

This issue is still open BTW :-)

Runtime builds are reproducible down to MRENCLAVE if you use our specified builder Docker image, and there are reproducibility CI tests that make sure this is the case.

But this doesn't mean things can't be improved. I would be interested to see how this nix-based approach would work. My understanding is that it would make the process more robust regardless of how the builder image itself is built: as long as the image can run nix, the nix-based build process will make sure that all dependencies (specifically, the various system libraries) are pinned.
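
For reference, a small sketch of what that pinning looks like in practice, using the flake from the branch shared earlier in this thread (nix 2.4+ with flake support assumed):

# All inputs (nixpkgs revision, toolchains, etc.) are recorded in the flake's lock file,
# so anyone building later gets exactly the same dependency closure.
nix flake metadata github:sbellem/oasis-core/nix   # show the locked input revisions
nix build github:sbellem/oasis-core/nix            # rebuild with those exact inputs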

Yawning commented 2 years ago

Use vagrant to build the docker image correctly? My general impression is that "getting consistent docker images" is a long-solved problem. Unless I'm mistaken, having to grab (or build) a docker image, then initialize nix, and pull down dependencies is more complexity and more points of failure than just building the docker image correctly in the first place.

And at the point where you're building a docker image with nix inside, already set up with all the dependencies, that's functionally equivalent to building a docker image with all the inner contents pinned, just with extra tooling...

But if this is the path of least resistance, then sure, as long as it remains something that only CI ever deals with, nix would be fine.

sbellem commented 2 years ago

Yes, that's what I meant when I said the current process that simply uses a Docker container is not ideal (e.g. some deps aren't pinned).

This tool: https://github.com/Jille/dockpin may help.

According to this blog post Reproducible Builds and Docker Images:

The other major issue that affects Docker images is the pinning of packages pulled in by package managers. An awful lot of images are based on Alpine or Debian derivatives, and use apk or apt to install dependencies. These need to have a version specified as otherwise the package manager will just pull in the latest at the time of the build. But this isn’t quite enough: you also need to pin the version of any packages that they depend on, recursively. This means flattening the entire package hierarchy and installing all the packages explicitly and with pinned versions.
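
For example (the package names and versions below are placeholders, not the project's actual pins), fully pinned apt installs look like this:

# Hypothetical example of pinned apt installs; the pinned versions must exist in the
# repository snapshot being used, and the dependency (libseccomp2) is pinned as well.
apt-get update
apt-get install -y --no-install-recommends \
    libseccomp-dev=2.5.1-1ubuntu1 \
    libseccomp2=2.5.1-1ubuntu1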

sbellem commented 2 years ago

My suggestion was to have this nix-based build inside a Docker container so that people who want to build don't need to deal with nix on their system.

Yes, that is possible, e.g.:

docker run --rm nixpkgs/nix-flakes /bin/bash -c \
    "nix build github:sbellem/oasis-sdk/nix#test-runtime-simple-consensus && sha256sum result/bin/*"

sbellem commented 2 years ago

To be clear, I don't have any preference towards any tool, and just thought of sharing this work with you in case it may be of some use to the project.

It is not clear to me though whether docker alone can provide the guarantees that may be needed in the context of remote attestation verification and/or audits, where the verifier re-builds the docker image at an arbitrary time. I don't know myself. This blog post On using Nix and Docker as deployment automation solutions: similarities and differences may be useful.

According to the article shared above (Reproducible Builds and Docker Images), a substantial amount of extra work needs to be done to ensure that a Dockerfile yields reproducible builds. My understanding was that nix is more reliable for bit-for-bit reproducibility, but I'm not 100% sure about this.