google / fuzzbench

FuzzBench - Fuzzer benchmarking as a service.
https://google.github.io/fuzzbench/
Apache License 2.0

Fix recent FuzzBench cloud experiment failures #2023

Closed DonggeLiu closed 3 months ago

DonggeLiu commented 3 months ago
  1. Fix TypeError: expected str, bytes or os.PathLike object, not NoneType in 2024-08-10-test.

    Traceback (most recent call last):
    File "/src/experiment/runner.py", line 468, in experiment_main
    runner.conduct_trial()
    File "/src/experiment/runner.py", line 290, in conduct_trial
    self.set_up_corpus_directories()
    File "/src/experiment/runner.py", line 275, in set_up_corpus_directories
    _unpack_clusterfuzz_seed_corpus(target_binary, input_corpus)
    File "/src/experiment/runner.py", line 144, in _unpack_clusterfuzz_seed_corpus
    seed_corpus_archive_path = get_clusterfuzz_seed_corpus_path(
    File "/src/experiment/runner.py", line 98, in get_clusterfuzz_seed_corpus_path
    fuzz_target_without_extension = os.path.splitext(fuzz_target_path)[0]
    File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
    TypeError: expected str, bytes or os.PathLike object, not NoneType

    This happens on many benchmarks+fuzzers. To be investigated later:

  2. Why fuzz_target_path is None.

  3. Why this did not happen in other recent experiments.

  4. I thought I had seen this long ago. Déjà vu?

  5. Fix No such file or directory: '/work/measurement-folders/<benchmark>-<fuzzer>/merged.json':

    Traceback (most recent call last):
    File "/work/src/experiment/measurer/coverage_utils.py", line 74, in generate_coverage_report
    coverage_reporter.generate_coverage_summary_json()
    File "/work/src/experiment/measurer/coverage_utils.py", line 141, in generate_coverage_summary_json
    result = generate_json_summary(coverage_binary,
    File "/work/src/experiment/measurer/coverage_utils.py", line 269, in generate_json_summary
    with open(output_file, 'w', encoding='utf-8') as dst_file:
    FileNotFoundError: [Errno 2] No such file or directory: '/work/measurement-folders/lcms_cms_transform_fuzzer-centipede/merged.json'
  6. Remove incompatible benchmarks: openh264_decoder_fuzzer, stb_stbi_read_fuzzer
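
Both tracebacks above come down to an unchecked precondition: `os.path.splitext` receives `None`, and `open(..., 'w')` targets a file whose parent directory does not exist. A minimal sketch of defensive guards follows. The helper names (`seed_corpus_path_or_none`, `write_json_summary`) and the `_seed_corpus.zip` suffix are assumptions made for illustration, and this is not necessarily the change made in this PR.

```python
import json
import os
from typing import Optional

# Assumption: archive suffix used for ClusterFuzz-style seed corpora.
SEED_CORPUS_ARCHIVE_SUFFIX = '_seed_corpus.zip'


def seed_corpus_path_or_none(
        fuzz_target_path: Optional[str]) -> Optional[str]:
    """Returns the seed corpus archive path, or None if there is none."""
    if not fuzz_target_path:
        # Guard: os.path.splitext(None) raises the TypeError shown above.
        return None
    base = os.path.splitext(fuzz_target_path)[0]
    archive = base + SEED_CORPUS_ARCHIVE_SUFFIX
    return archive if os.path.exists(archive) else None


def write_json_summary(output_file: str, summary: dict) -> None:
    """Writes summary as JSON, creating the parent directory if needed."""
    parent = os.path.dirname(output_file)
    if parent:
        # Guard: open(..., 'w') raises FileNotFoundError when the parent
        # directory is missing.
        os.makedirs(parent, exist_ok=True)
    with open(output_file, 'w', encoding='utf-8') as dst_file:
        json.dump(summary, dst_file)
```

Returning `None` only avoids the crash. It does not explain why the fuzz target path is `None` in the first place, which the list above leaves open for later investigation.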

DonggeLiu commented 3 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-12-dg --fuzzers aflplusplus centipede honggfuzz libfuzzer --benchmarks stb_stbi_read_fuzzer openh264_decoder_fuzzer

DonggeLiu commented 3 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-12-2023 --fuzzers aflplusplus centipede honggfuzz libfuzzer --benchmarks stb_stbi_read_fuzzer openh264_decoder_fuzzer

DonggeLiu commented 3 months ago

Experiment 2024-08-12-2023 data and results will be available later at: The experiment data. The experiment report. The experiment report(experimental).

DonggeLiu commented 3 months ago

This likely failed because both fuzz targets failed to generate coverage reports, e.g.: [image]

Not sure if this is related: OSS-Fuzz's build status page shows openh264_decoder_fuzzer failed.

DonggeLiu commented 3 months ago

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-13-2023-libfuzzer-1 --fuzzers libfuzzer

DonggeLiu commented 3 months ago

Experiment 2024-08-13-2023-libfuzzer-1 data and results will be available later at: The experiment data. The experiment report. The experiment report(experimental).

DonggeLiu commented 3 months ago

Report is back : ) @addisoncrump I will wait a bit longer before merging this to ensure the report stays alive. Once I merge this to master, could you please update your PR and bring back the changes you added? Thanks!

addisoncrump commented 3 months ago

Sure, I'll rebase.

addisoncrump commented 3 months ago

@DonggeLiu I am able to build both openh264 and stb_stbi fuzzers as in master locally with no issue. Like #2021, I think this is a cache issue.

DonggeLiu commented 3 months ago

> @DonggeLiu I am able to build both openh264 and stb_stbi fuzzers as in master locally with no issue. Like #2021, I think this is a cache issue.

I see, thanks for the info! Given that you are investigating this, is there any help I can provide? For example, if you think some more cloud build logs can save you time debugging, please feel free to add them and request experiments. I can run them for you and send you the related logs.

DonggeLiu commented 3 months ago

The report on this PR is still not ready, likely because some VMs were preempted. I will give it one more day just to be 100% safe.

addisoncrump commented 3 months ago

> Given that you are investigating this, is there any help I can provide?

Ah, I was investigating the specific issue with the bug benchmark. I don't think I can offer much help with the CI or the fuzzbench infra directly. I can say, however, that the coverage benchmarks you removed do work as expected locally with test-run. I need to check if the coverage measurer works as anticipated; maybe this needs to be updated instead.

addisoncrump commented 3 months ago

Ah, @DonggeLiu, try running make test-run-coverage-all. It complains that it can't find bloaty_fuzz_target on master :eyes:

tokatoka commented 3 months ago

> @DonggeLiu I am able to build both openh264 and stb_stbi fuzzers as in master locally with no issue. Like https://github.com/google/fuzzbench/pull/2021, I think this is a cache issue.

Same for me; they are working. I don't think they should be removed.

DonggeLiu commented 3 months ago

I see, thanks @addisoncrump and @tokatoka . I've brought them back.

The experiment is about to finish, I will merge this tmr morning.

addisoncrump commented 3 months ago

I confirmed the coverage measurers build locally as well. Will test when everything has finished building.

addisoncrump commented 3 months ago

Yup, I tested openh264 and stb benchmarks locally and they do perform measurements as anticipated. The issue is with the GCP runs, I would presume a build cache issue.

DonggeLiu commented 3 months ago

> Yup, I tested openh264 and stb benchmarks locally and they do perform measurements as anticipated. The issue is with the GCP runs, I would presume a build cache issue.

I see, I reckon this could be due to an incompatibility between the GCP VM environment and LLVM? I will look into this once I finish the other tasks at hand.

Just to double-check @addisoncrump : When you test them locally, did you remove their old local images beforehand?

DonggeLiu commented 3 months ago

Thanks for the information again, @addisoncrump!

DonggeLiu commented 3 months ago

TBR by @jonathanmetzman.

The experiment proving that this works: https://github.com/google/fuzzbench/pull/2023#issuecomment-2285147301

addisoncrump commented 3 months ago

> When you test them locally, did you remove their old local images beforehand?

Yes, I do a docker system prune --all before every experiment.

DonggeLiu commented 3 months ago

> > When you test them locally, did you remove their old local images beforehand?
>
> Yes, I do a docker system prune --all before every experiment.

I see, thanks for confirming. I will merge this then.

tokatoka commented 2 months ago

> I thought I had seen this long ago. Déjà vu?

The same bug happened 1 year ago https://github.com/google/fuzzbench/pull/1886

DonggeLiu commented 2 months ago

> The same bug happened 1 year ago #1886

Thanks for noticing this, let me see if @jonathanmetzman has more insight once he is back.

addisoncrump commented 2 months ago

Just to reiterate, this is a major threat to validity -- especially when cached data is used. The cache completely overwrites the report, so the final report shows only the last successful experiment. This effectively invalidates all future FuzzBench reports until this issue is resolved.

I think the report generation issue indicates that safeguards should be put in place that simply terminate the experiment in such degenerate cases, since the results are effectively guaranteed to be invalid.
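
One way to picture such a safeguard is a check that runs before report generation and aborts when too few trials produced coverage data. The sketch below is purely hypothetical: `TrialResult`, `DegenerateExperimentError`, `assert_experiment_is_valid`, and the 0.5 threshold are illustrative assumptions, not part of FuzzBench.

```python
from dataclasses import dataclass
from typing import Iterable


class DegenerateExperimentError(RuntimeError):
    """Raised when an experiment's results cannot be trusted."""


@dataclass(frozen=True)
class TrialResult:
    """Hypothetical per-trial record used only for this sketch."""
    benchmark: str
    fuzzer: str
    has_coverage: bool  # True if at least one measurement was recorded.


def assert_experiment_is_valid(trials: Iterable[TrialResult],
                               min_measured_fraction: float = 0.5) -> None:
    """Raises if too few trials have coverage measurements.

    The threshold is an assumed, tunable value.
    """
    trials = list(trials)
    if not trials:
        raise DegenerateExperimentError('no trials were recorded')
    measured = sum(1 for trial in trials if trial.has_coverage)
    fraction = measured / len(trials)
    if fraction < min_measured_fraction:
        raise DegenerateExperimentError(
            f'only {measured} of {len(trials)} trials were measured; '
            'refusing to generate a report')
```

Failing loudly here would stop a stale cached report from being published as if it described the current experiment.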