google / fuzzbench

FuzzBench - Fuzzer benchmarking as a service.
https://google.github.io/fuzzbench/
Apache License 2.0

Archive coverage data alongside corpus archives (from AFL++ fork) #2028

Closed addisoncrump closed 1 month ago

addisoncrump commented 3 months ago

Supersedes #2020. Moving so we (AFL++ people) can collaborate on this PR.

From the original:

Currently, only corpora are saved in the archive and the summaries of coverage are provided at the end of the experiment. This change simply incorporates the saving of the coverage data snapshots next to the trial corpus snapshots.
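
A minimal sketch of the idea, not the code in this PR: each time a corpus snapshot is archived, the matching coverage snapshot is written beside it under the same cycle number. The function name, the file naming scheme, and the assumption that a coverage snapshot is a single file are all hypothetical.

```python
import shutil
import tarfile
from pathlib import Path
from typing import Tuple


def archive_trial_snapshot(corpus_dir: Path, coverage_file: Path,
                           archive_dir: Path, cycle: int) -> Tuple[Path, Path]:
    """Archive one corpus snapshot and store its coverage snapshot beside it.

    All names and the file layout here are hypothetical, not FuzzBench's.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Pack the corpus directory into a compressed tarball for this cycle.
    corpus_archive = archive_dir / f"corpus-archive-{cycle:04d}.tar.gz"
    with tarfile.open(corpus_archive, "w:gz") as tar:
        tar.add(corpus_dir, arcname="corpus")

    # Copy the coverage snapshot next to it, keyed by the same cycle number,
    # so later analysis can pair each corpus with the coverage it produced.
    coverage_archive = (
        archive_dir / f"coverage-archive-{cycle:04d}{coverage_file.suffix}")
    shutil.copyfile(coverage_file, coverage_archive)
    return corpus_archive, coverage_archive
```

Sharing the cycle number between the two file names is what makes the pairing recoverable after the experiment ends.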

addisoncrump commented 3 months ago

@DonggeLiu Can we try to do a baseline experiment with this PR again? :slightly_smiling_face: It is fully rebased to the latest changes.

I will integrate the analysis changes once there is a public baseline to point the analysis example at.

DonggeLiu commented 3 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-17-2028-bases-1 --fuzzers afl aflplusplus libafl libfuzzer

DonggeLiu commented 3 months ago

Hi @addisoncrump, I started a test experiment above. Experiment 2024-08-17-2028-bases-1 data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

If it works well and you'd like to run a full experiment (23 hours), could you please rebase to adopt this change? I forgot to revert temporary changes in a previous PR.

Thanks!

addisoncrump commented 3 months ago

Rebased. The experiment looks good, all the coverage samples were archived.

DonggeLiu commented 3 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-19-2028-bases-1 --fuzzers afl aflplusplus libafl libfuzzer

DonggeLiu commented 3 months ago

Experiment 2024-08-19-2028-bases-1 data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

addisoncrump commented 3 months ago

It seems to still not be hitting the measurer...

DonggeLiu commented 3 months ago

This is really strange, because 2024-08-19-2028-bases-1 has a list of errors about merging the coverage summary: [image]

But 2024-08-17-2028-bases-1 did not have any: [image]

DonggeLiu commented 3 months ago

QQ: Is this the only thing that changed between those 2 experiments? [image]

BTW, I noticed this runtime crash in libafl. I don't think it could cause the failure, but it might be interesting to you: https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-08-19-2028-bases-1/experiment-folders/libxml2_xml-libafl/trial-3070882/results/

It did not happen in 2024-08-17-2028-bases-1, maybe because that experiment was very short?

A possible theory: libafl saved some input into its corpus during this crash, which caused measurement failure?

addisoncrump commented 3 months ago

@tokatoka random libafl crash :upside_down_face:

addisoncrump commented 3 months ago

I confirmed that the only difference is that commit, yes.

Let's add some more debugging and run a very short run with all the benchmarks, I guess?

tokatoka commented 3 months ago

ohh i see. so this is why my experiment didn't complete either

tokatoka commented 3 months ago

> A possible theory: libafl saved some input into its corpus during this crash, which caused measurement failure?

but it should not affect other fuzzers such as aflplusplus runs right?

tokatoka commented 3 months ago

> @tokatoka random libafl crash 🙃

can you reproduce? i used the same setup on fuzzbench but cannot reproduce

tokatoka commented 3 months ago

I updated. @DonggeLiu Could you run the same command again to see if it fixes the problem or not?

DonggeLiu commented 2 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-22-2028-bases-1 --fuzzers libafl

DonggeLiu commented 2 months ago

> A possible theory: libafl saved some input into its corpus during this crash, which caused measurement failure?

Also running an experiment without libafl to help verify this theory.

DonggeLiu commented 2 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-22-2028-bases-2 --fuzzers afl aflplusplus libfuzzer

DonggeLiu commented 2 months ago

Experiment 2024-08-22-2028-bases-1 data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

Experiment 2024-08-22-2028-bases-2 data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

addisoncrump commented 2 months ago

bases-1 seems to be working fine, but bases-2 is still not hitting the measurer.

tokatoka commented 2 months ago

so it looks like the libafl crash is not the cause of this

tokatoka commented 2 months ago

btw for bases-1 it seems all the fuzzers are stuck after 10:45m so it was not a successful run either...

DonggeLiu commented 2 months ago

Oops, there was a DB issue yesterday which affected both experiments. Let me re-run them.

DonggeLiu commented 2 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-23-2028-libafl --fuzzers libafl

DonggeLiu commented 2 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-23-2028-bases --fuzzers afl aflplusplus libfuzzer

tokatoka commented 2 months ago

we have experiment-folder but not report. so the measurement is still broken

tokatoka commented 2 months ago

For the report: 2024-08-23-2028-libafl and 2024-08-23-2028-bases are missing, but 2024-08-23-2036-bases-1 (from the other PR) is there.

For the experiment-data: nothing is missing.

tokatoka commented 2 months ago

Btw, if the experiment on my branch and this experiment (https://www.fuzzbench.com/reports/experimental/2024-08-23-dgfuzz/index.html) are working, would it be possible that the changes in this PR caused the measurement failure?

DonggeLiu commented 2 months ago

Experiment 2024-08-23-2028-libafl data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

Experiment 2024-08-23-2028-bases data and results will be available later at: The experiment data. The experiment report. The experiment report (experimental).

DonggeLiu commented 2 months ago

@addisoncrump would this happen to be related to this PR?

[image]

It could be due to this error:

[image]

I think the gsutil rm error is at least benign, because @tokatoka showed that 2024-08-23-2036-bases-1 can generate a report:

> Btw, if the experiment on my branch and this experiment (https://www.fuzzbench.com/reports/experimental/2024-08-23-dgfuzz/index.html) are working, would it be possible that the changes in this PR caused the measurement failure?

and it also has the gsutil rm error, but not the llvm-profdata error:

[image] [image]

There was another build error (discussed in #2038, as shown above), but I am sure that one is benign and unrelated to the missing report in the experiment.
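
The elimination argument here can be sketched as a set difference: any error that also shows up in an experiment that produced a report cannot, on its own, explain the missing report. The labels below are shorthand taken from this thread, not real log lines, and the function name is hypothetical.

```python
def errors_unique_to_failing(failing, working):
    """Return the errors seen only in the failing experiment."""
    return set(failing) - set(working)


# Shorthand for what the comment above reports, not actual log output.
failing_errors = {"gsutil rm", "llvm-profdata"}  # 2024-08-23-2028-bases (no report)
working_errors = {"gsutil rm"}                   # 2024-08-23-2036-bases-1 (report generated)

suspects = errors_unique_to_failing(failing_errors, working_errors)
print(sorted(suspects))  # ['llvm-profdata']
```

This only narrows down the suspects; it does not show that the remaining error is the cause.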

addisoncrump commented 2 months ago

I'm really not sure how it could be. I think it will require manual inspection to understand the root cause here. I don't really understand why this would work locally but not in the cloud environment for these reasons, since we should expect the same errors.

Looking at the changeset: I don't see why anything I did would have affected this, especially since we see inconsistent generation of reports. The only thing I can think of that might cause this would be rate limiting with the bucket or similar.

DonggeLiu commented 2 months ago

> Looking at the changeset: I don't see why anything I did would have affected this, especially since we see inconsistent generation of reports. The only thing I can think of that might cause this would be rate limiting with the bucket or similar.

Yep, I could not think of any reason from this PR either. Yet this seems to be the only place where we can reproduce the no-report error: I was trying to reproduce the "Fuzz target binary not found." error in this PR, but it did not work either.

~~Could you please cherry-pick commits from #2038, or rebase your PR on it? Hopefully those commits can help us understand the cause.~~

Never mind, I created #2039 for this to keep your PR clean.

This is weird: With the same commits, that experiment works.

Let's wait a bit longer, and if that experiment proves the error is flaky, we should be able to merge this. Not sure we can consistently reproduce it here though, maybe because we run two experiments together?

addisoncrump commented 1 month ago

Hey, what remains for this PR? We settled that the flakiness was not associated to this PR, unless I'm mistaken.

DonggeLiu commented 1 month ago

> Hey, what remains for this PR? We settled that the flakiness was not associated to this PR, unless I'm mistaken.

You are right, I will merge this.

DonggeLiu commented 1 month ago

/gcbexp skip

addisoncrump commented 4 weeks ago

Any chance we could run another baselines experiment with this? Would be good to have this data in the pocket.

DonggeLiu commented 4 weeks ago

> Any chance we could run another baselines experiment with this? Would be good to have this data in the pocket.

Yep, sure! Running it in a new PR, as this PR has been merged.