Difference in coverage and normal fuzz builds?

alexcrichton commented 3 years ago

For the Wasmtime project we recently added some build configuration to use opam to install an OCaml compiler inside of our fuzzer image. Our fuzzers seem to recognize this and are working with it, but over the weekend we got a notification that the coverage build was failing as well. The logs for our coverage build indicate that the build fails because it can't find ocaml, but it seems to successfully install ocaml earlier so I'm not sure why it failed out.

Are there known differences in the coverage/fuzz builds? (other than CFLAGS and such) For example are different users used or something like that? Or different directory configurations? I'm sure that this is an issue we should be fixing on our end in our configuration, but I'm not entirely sure what to do based on the logs so far.
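One way to check for this kind of difference would be to print the user and the directories opam depends on from inside both builds and compare the output. The following is a minimal sketch, not part of the OSS-Fuzz tooling. The function name `describe_build_environment` is hypothetical, and it assumes opam's usual behaviour of using `OPAMROOT` when set and `~/.opam` otherwise. Whether `HOME` actually differs between the coverage and fuzz builds is not established in this thread.

```python
import os
from pathlib import Path


def describe_build_environment(environ=None):
    """Collect the settings that decide where opam looks for its root.

    `environ` defaults to the real process environment; passing a dict
    makes it possible to try the function on hypothetical values.
    """
    env = os.environ if environ is None else environ
    home = env.get("HOME", "")
    explicit_root = env.get("OPAMROOT", "")
    # Assumption: opam uses OPAMROOT when set, otherwise $HOME/.opam.
    if explicit_root:
        candidate = explicit_root
    elif home:
        candidate = str(Path(home) / ".opam")
    else:
        candidate = ""
    return {
        "uid": os.getuid(),
        "cwd": os.getcwd(),
        "HOME": home,
        "OPAMROOT": explicit_root,
        "opam_root_candidate": candidate,
        "opam_root_exists": bool(candidate) and Path(candidate).is_dir(),
    }


if __name__ == "__main__":
    for key, value in describe_build_environment().items():
        print(f"{key}: {value}")
```

Running this once while the image is built and once from the build script would show whether the two steps agree on the user, `HOME`, and the opam root.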

cc @abrown

abrown commented 3 years ago

Yeah, I could use an explanation of how this build flow works, e.g., what are these "steps"?

When we build the Docker image in step 1 we install the right version of the OCaml tools:

Step #1:  ---> 25844024b4c6
Step #1: Step 3/10 : RUN curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh -o install.sh &&   echo | sh install.sh &&   opam init --disable-sandboxing --yes &&   opam install ocamlbuild --yes
Step #1:  ---> Running in 62066e5d7761
Step #1: ## Downloading opam 2.1.0 for linux on x86_64...
Step #1: ## Downloaded.
...

But only later do we set up the environment (e.g. path) using opam env:

Starting Step #3
Step #3: Already have image: gcr.io/oss-fuzz/wasmtime
...
Step #3: ++ opam env
Step #3: [WARNING] Running as root is not recommended
Step #3: [ERROR] Opam has not been initialised, please run `opam init'
Step #3: + eval

Do we need to figure out what opam env is emitting and replicate that with ENV directives in the Dockerfile? Or is it possible that "step 3" is looking at an old Wasmtime image?
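If replicating the output turns out to be the route taken, the translation could be scripted. Below is a rough sketch that turns text in the shape `NAME='value'; export NAME;` into Dockerfile `ENV` lines. That shape is an assumption about what `opam env` prints in sh syntax, the function name `opam_env_to_dockerfile` is hypothetical, and the sample paths are made up for illustration rather than taken from the Wasmtime image.

```python
import re
import shlex

# One assignment per line:  NAME='value'; export NAME;
# Assumption: this is the shape of `opam env` output in sh syntax.
_ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*?);\s*export\s+\1;\s*$")


def opam_env_to_dockerfile(opam_env_output):
    """Convert `NAME='value'; export NAME;` lines into Dockerfile ENV lines."""
    directives = []
    for line in opam_env_output.splitlines():
        match = _ASSIGNMENT.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything unrecognised
        name, raw_value = match.groups()
        # shlex.split strips the shell quoting around the value.
        parts = shlex.split(raw_value)
        value = parts[0] if parts else ""
        # Escape characters that are special inside a double-quoted ENV value.
        # A literal "$" would still be expanded by Docker; not handled here.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        directives.append(f'ENV {name}="{escaped}"')
    return directives


# Hypothetical sample, for illustration only.
SAMPLE = """\
OPAM_SWITCH_PREFIX='/root/.opam/default'; export OPAM_SWITCH_PREFIX;
PATH='/root/.opam/default/bin:/usr/local/bin:/usr/bin:/bin'; export PATH;
"""

if __name__ == "__main__":
    for directive in opam_env_to_dockerfile(SAMPLE):
        print(directive)
```

Note that this freezes `PATH` at whatever value was captured, so later changes to `PATH` in the base image would not be picked up.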

jonathanmetzman commented 3 years ago

Are there known differences in the coverage/fuzz builds? (other than CFLAGS and such) For example are different users used or something like that? Or different directory configurations? I'm sure that this is an issue we should be fixing on our end in our configuration, but I'm not entirely sure what to do based on the logs so far.

I can't think of anything off the top of my head. Sorry for the trouble. Did you try reproducing the build failure locally with python infra/helper.py build_fuzzers wasmtime --sanitizer coverage?
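To compare the coverage build with a normal fuzz build locally, the same helper invocation can be run once per sanitizer. This is a small sketch under a few assumptions: it is run from the root of an oss-fuzz checkout, `python` is on the PATH, and the function names `helper_command` and `build_all` are hypothetical wrappers, not part of `infra/helper.py`.

```python
import subprocess


def helper_command(project, sanitizer, helper="infra/helper.py"):
    """Build the argv for one local build_fuzzers run."""
    return ["python", helper, "build_fuzzers", project, "--sanitizer", sanitizer]


def build_all(project, sanitizers=("address", "coverage")):
    """Run one build per sanitizer and return {sanitizer: exit code}."""
    results = {}
    for sanitizer in sanitizers:
        # check=False: collect the exit code instead of raising on failure.
        completed = subprocess.run(helper_command(project, sanitizer), check=False)
        results[sanitizer] = completed.returncode
    return results
```

Calling `build_all("wasmtime")` returns the exit code of each build, which makes it easy to see whether only the coverage configuration fails.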

Yeah, I could use an explanation of how this build flow works, e.g., what are these "steps"?

These are just the "steps" our tooling passes to GCB to build your fuzzers. It would be something like this:

Build image.
Build fuzzers with ASan.
Run the bad build check on the fuzzers.
Upload fuzzers to GCS.

Usually you can reproduce the issue by running helper.py <failing build>.

You can probably figure out the exact steps done by GCB by running infra/build/functions/build_and_run_coverage.py.

Do we need to figure out what opam env is emitting and replicate that with ENV directives in the Dockerfile? Or is it possible that "step 3" is looking at an old Wasmtime image?

I doubt this is necessary. I think step #3 is where the fuzzers get built, as opposed to step #1, where the image gets built. I don't think it's possible that the image is different: no image gets pulled between step #1 (when the image gets built) and step #3 (when it is used).

alexcrichton commented 3 years ago

I tested locally with python infra/helper.py build_fuzzers wasmtime --sanitizer coverage, but the build passed for me. I'll see if I can dig in a bit more later today.

google / oss-fuzz

Difference in coverage and normal fuzz builds? #6226