Closed · addisoncrump closed this 1 month ago
@DonggeLiu Can we try to do a baseline experiment with this PR again? :slightly_smiling_face: It is fully rebased onto the latest changes.
I will integrate the analysis changes once there is a public baseline to point the analysis example at.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-17-2028-bases-1 --fuzzers afl aflplusplus libafl libfuzzer
Hi @addisoncrump, I started a test experiment above.
Experiment 2024-08-17-2028-bases-1 data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
If it works well and you'd like to run a full experiment (23 hours), could you please rebase to adopt this change? I forgot to revert some temporary changes in a previous PR.
Thanks!
Rebased. The experiment looks good, all the coverage samples were archived.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-19-2028-bases-1 --fuzzers afl aflplusplus libafl libfuzzer
Experiment 2024-08-19-2028-bases-1 data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
It seems to still not be hitting the measurer...
This is really strange, because 2024-08-19-2028-bases-1 has a list of errors about merging coverage summaries, but 2024-08-17-2028-bases-1 did not have any.
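For context, the merge step those errors come from is roughly of this shape; this is a sketch of the standard LLVM coverage workflow with placeholder file names, not FuzzBench's exact code:

```sh
# Merge the raw profiles produced by replaying a corpus snapshot
# (trial paths and file names are placeholders).
llvm-profdata merge -sparse trial-*/data.profraw -o merged.profdata

# Export a JSON coverage summary for the fuzz-target binary; per-trial
# summaries like this are what later gets merged across trials.
llvm-cov export -summary-only ./fuzz-target \
    -instr-profile=merged.profdata > summary.json
```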
QQ: Is that change the only difference between those 2 experiments?
BTW, I noticed this runtime crash in libafl. I don't think it could cause the failure, but it might be interesting to you: https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-08-19-2028-bases-1/experiment-folders/libxml2_xml-libafl/trial-3070882/results/
It did not happen in 2024-08-17-2028-bases-1, maybe because that experiment was very short?
A possible theory: libafl saved some input into its corpus during this crash, which caused the measurement failure?
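One way to test that theory would be to replay the archived trial data against the coverage build; a sketch, assuming the usual libFuzzer-style binary that executes inputs passed as arguments (local paths and the input name are placeholders):

```sh
# Pull down the archived trial results (bucket path copied from the
# link above) and replay the suspect corpus entry against the
# coverage build.
gsutil cp -r \
  gs://fuzzbench-data/2024-08-19-2028-bases-1/experiment-folders/libxml2_xml-libafl/trial-3070882/ \
  ./trial/
./coverage-build/xml ./trial/corpus/<suspect-input>
```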
@tokatoka random libafl crash :upside_down_face:
I confirmed that the only difference is that commit, yes.
Let's add some more debugging and run a very short run with all the benchmarks, I guess?
Ohh I see, so this is why my experiment didn't complete either.
> A possible theory: libafl saved some input into its corpus during this crash, which caused the measurement failure?
But it should not affect other fuzzers, such as the aflplusplus runs, right?
> @tokatoka random libafl crash 🙃
Can you reproduce it? I used the same setup as on FuzzBench but cannot reproduce it.
I updated. @DonggeLiu Could you run the same command again to see whether it fixes the problem?
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-22-2028-bases-1 --fuzzers libafl
> A possible theory: libafl saved some input into its corpus during this crash, which caused the measurement failure?
Also running an experiment without libafl to help verify this theory.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-22-2028-bases-2 --fuzzers afl aflplusplus libfuzzer
Experiment 2024-08-22-2028-bases-1 data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
Experiment 2024-08-22-2028-bases-2 data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
bases-1 seems to be working fine, but bases-2 is still not hitting the measurer.
So it looks like the libafl crash is not the cause of this.
BTW, for bases-1 it seems all the fuzzers got stuck after 10:45m, so it was not a successful run either...
Oops, there was a DB issue yesterday which affected both experiments. Let me re-run them.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-23-2028-libafl --fuzzers libafl
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-23-2028-bases --fuzzers afl aflplusplus libfuzzer
We have the experiment-folders but not the reports, so the measurement is still broken.
For the reports: 2024-08-23-2028-libafl and 2024-08-23-2028-bases are missing, but 2024-08-23-2036-bases-1 (from the other PR) is there.
For the experiment data: nothing is missing.
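That can be confirmed straight from the bucket listings; a sketch (the data bucket comes from links earlier in the thread, the reports location is a placeholder):

```sh
# The experiment data folders exist for both experiments
# (data bucket taken from links earlier in this thread)...
gsutil ls gs://fuzzbench-data/2024-08-23-2028-libafl/experiment-folders/
gsutil ls gs://fuzzbench-data/2024-08-23-2028-bases/experiment-folders/

# ...but no matching reports show up; <reports-bucket> is a placeholder
# for wherever the service publishes reports.
gsutil ls gs://<reports-bucket>/ | grep 2024-08-23
```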
BTW, if the experiment on my branch and this experiment (https://www.fuzzbench.com/reports/experimental/2024-08-23-dgfuzz/index.html) are working, would it be possible that the changes in this PR caused the measurement failure?
Experiment 2024-08-23-2028-libafl data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
Experiment 2024-08-23-2028-bases data and results will be available later at:
- The experiment data
- The experiment report
- The experiment report (experimental)
@addisoncrump would this happen to be related to this PR?
It could be due to this error:
I think the `gsutil rm` error is at least benign, because @tokatoka showed that 2024-08-23-2036-bases-1 can generate a report:

> BTW, if the experiment on my branch and this experiment (https://www.fuzzbench.com/reports/experimental/2024-08-23-dgfuzz/index.html) are working, would it be possible that the changes in this PR caused the measurement failure?

and it also has the `gsutil rm` error, but not the `llvm-profdata` error:
There was another build error (discussed in #2038, as shown above), but I am sure that one is benign and unrelated to the missing report in the experiment.
I'm really not sure how it could be. I think it will require manual inspection to understand the root cause here. I don't really understand why this would work locally but not in the cloud environment for these reasons, since we should expect the same errors.
Looking at the changeset: I don't see why anything I did would have affected this, especially since we see inconsistent generation of reports. The only thing I can think of that might cause this would be rate limiting with the bucket or similar.
> Looking at the changeset: I don't see why anything I did would have affected this, especially since we see inconsistent generation of reports. The only thing I can think of that might cause this would be rate limiting with the bucket or similar.
Yep, I could not think of any reason from this PR either. Yet this seems to be the only place where we can reproduce the no-report error: I was trying to reproduce the `Fuzz target binary not found.` error in this PR, but that did not work either.
~~Could you please cherry-pick commits from #2038, or rebase your PR on it? Hopefully those commits can help us understand the cause.~~
Never mind, I created #2039 for this to keep your PR clean.
This is weird: with the same commits, that experiment works.
Let's wait a bit longer, and if that experiment proves the error is flaky, we should be able to merge this. Not sure we can consistently reproduce it here though; maybe it's because we run two experiments together?
Hey, what remains for this PR? We settled that the flakiness was not associated with this PR, unless I'm mistaken.
> Hey, what remains for this PR? We settled that the flakiness was not associated with this PR, unless I'm mistaken.
You are right, I will merge this.
/gcbexp skip
Any chance we could run another baseline experiment with this? Would be good to have this data in the pocket.
Supersedes #2020. Moving so we (AFL++ people) can collaborate on this PR.
From the original: