google / fuzzbench

FuzzBench - Fuzzer benchmarking as a service.
https://google.github.io/fuzzbench/
Apache License 2.0

Error generating HTML report. #1896

Open krz-max opened 1 year ago

krz-max commented 1 year ago

I was running a local experiment with the fuzzer aflplusplus and the benchmarks curl_curl_fuzzer_http and bloaty_fuzz_target. make presubmit passes after installing qtbase-dev5, as mentioned in issue #1867.

My experiment ran successfully about two weeks ago, but after reinstalling and running it again, it keeps reporting the following. (I think the WARNING lines are fine, because the same warnings appeared when my experiment succeeded.)

INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 1. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '1'}
WARNING:root:Corpus not found for cycle: 1. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '1'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 1. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '1'}
INFO:root:Measured cycle: 1 in 9.500000 seconds. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '1'}
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:Measuring all trials.
INFO:root:Measuring cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
WARNING:root:Corpus not found for cycle: 2. Extras: {'fuzzer': 'aflplusplus', 'benchmark': 'curl_curl_fuzzer_http', 'trial_id': '3', 'cycle': '2'}
INFO:root:Done measuring all trials.
INFO:root:In progress: True.
INFO:root:Is merging with nonprivate: False.
INFO:root:Reading experiment data from db.
INFO:root:Done reading experiment data from db.
WARNING:root:Filtered out invalid benchmarks: set().
ERROR:root:Error generating HTML report. Extras: {'traceback': 'Traceback (most recent call last):\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11003, in _reindex_for_setitem\n    reindexed_value = value.reindex(index)._values\n  File "/usr/local/lib/python3.10/site-packages/pandas/util/_decorators.py", line 324, in wrapper\n    return func(*args, **kwargs)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4807, in reindex\n    return super().reindex(**kwargs)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/generic.py", line 4966, in reindex\n    return self._reindex_axes(\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4626, in _reindex_axes\n    frame = frame._reindex_index(\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4642, in _reindex_index\n    new_index, indexer = self.index.reindex(\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 4237, in reindex\n    target = self._wrap_reindex_result(target, indexer, preserve_names)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 2520, in _wrap_reindex_result\n    target = MultiIndex.from_tuples(target)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 204, in new_meth\n    return meth(self_or_cls, *args, **kwargs)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 559, in from_tuples\n    arrays = list(lib.tuples_to_object_array(tuples).T)\n  File "pandas/_libs/lib.pyx", line 2930, in pandas._libs.lib.tuples_to_object_array\nValueError: Buffer dtype mismatch, expected \'Python object\' but got \'long\'\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File "/work/src/experiment/reporter.py", line 76, in output_report\n    generate_report.generate_report(\n  File 
"/work/src/analysis/generate_report.py", line 235, in generate_report\n    experiment_df = data_utils.add_bugs_covered_column(experiment_df)\n  File "/work/src/analysis/data_utils.py", line 162, in add_bugs_covered_column\n    df[\'firsts\'] = (\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3645, in __setitem__\n    self._set_item_frame_value(key, value)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3787, in _set_item_frame_value\n    arraylike = _reindex_for_setitem(value, self.index)\n  File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11010, in _reindex_for_setitem\n    raise TypeError(\nTypeError: incompatible index of inserted column with frame index\n'}

I'm not sure why the log prints the \n sequences literally, so I reformatted the error message below for easier reading:

'traceback': 'Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11003, in _reindex_for_setitem
    reindexed_value = value.reindex(index)._values
File "/usr/local/lib/python3.10/site-packages/pandas/util/_decorators.py", line 324, in wrapper
    return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4807, in reindex
    return super().reindex(**kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/generic.py", line 4966, in reindex
    return self._reindex_axes(
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4626, in _reindex_axes
    frame = frame._reindex_index(
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4642, in _reindex_index
    new_index, indexer = self.index.reindex(
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 4237, in reindex
    target = self._wrap_reindex_result(target, indexer, preserve_names)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 2520, in _wrap_reindex_result
    target = MultiIndex.from_tuples(target)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 204, in new_meth
    return meth(self_or_cls, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 559, in from_tuples
    arrays = list(lib.tuples_to_object_array(tuples).T)
File "pandas/_libs/lib.pyx", line 2930, in pandas._libs.lib.tuples_to_object_array
ValueError: Buffer dtype mismatch, expected 'Python object' but got  'long'
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/work/src/experiment/reporter.py", line 76, in output_report
    generate_report.generate_report(
File "/work/src/analysis/generate_report.py", line 235, in generate_report
    experiment_df = data_utils.add_bugs_covered_column(experiment_df)
File "/work/src/analysis/data_utils.py", line 162, in add_bugs_covered_column
    df['firsts'] = (
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3645, in __setitem__
    self._set_item_frame_value(key, value)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3787, in _set_item_frame_value
    arraylike = _reindex_for_setitem(value, self.index)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11010, in _reindex_for_setitem
    raise TypeError(
TypeError: incompatible index of inserted column with frame index'
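The literal \n sequences show up because the traceback is stored as a string inside the Extras dictionary, and printing a dictionary shows each string in its repr() form, which escapes newlines. A minimal sketch for recovering the readable traceback from such a line follows; the helper name extract_traceback and the sample line are made up for illustration and are not part of FuzzBench.

```python
import ast


def extract_traceback(log_line: str) -> str:
    """Return the readable traceback embedded in an 'Extras: {...}' log line."""
    # Everything after the first 'Extras: ' marker is the printed dictionary.
    _, _, extras_repr = log_line.partition("Extras: ")
    # literal_eval safely parses the dict literal and un-escapes the \n sequences.
    extras = ast.literal_eval(extras_repr.strip())
    return extras["traceback"]


# Hypothetical, shortened log line in the same shape as the one above.
sample_traceback = (
    'Traceback (most recent call last):\n'
    '  File "x.py", line 1, in f\n'
    'TypeError: boom\n'
)
sample_line = ("ERROR:root:Error generating HTML report. Extras: " +
               repr({"traceback": sample_traceback}))

print(extract_traceback(sample_line))
```

This only works when the complete log line, including the closing brace, is available as one string.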

I've reinstalled Ubuntu and set up the environment several times, but it still does not work. My environment:

Ubuntu 22.04.2 LTS
python version : 3.10.12

I downloaded Docker from here. I appreciate the help!

jonathanmetzman commented 1 year ago

Sorry for the late response, I don't have a great guess. Do you want to email me [my last name]@chromium.org the data.csv.gz so I can try to reproduce?

krz-max commented 1 year ago

Thanks a lot for the reply, but due to the error, report data isn't generated automatically after the experiment ends.

So I modified the function add_bugs_covered_column(experiment_df) in /fuzzbench/analysis/data_utils.py, around line 153:

def add_bugs_covered_column(experiment_df):
    """Return a modified experiment df in which adds a |bugs_covered| column,
    a cumulative count of bugs covered over time."""
    # Workaround: return early so the failing code below is never reached.
    experiment_df['bugs_covered'] = 0
    return experiment_df
    # Original guard, now redundant because of the early return above:
    # if 'crash_key' not in experiment_df:
    #     experiment_df['bugs_covered'] = 0
    #     return experiment_df
    grouping2 = ['fuzzer', 'benchmark', 'trial_id']
    grouping3 = ['fuzzer', 'benchmark', 'trial_id', 'time']
    df = experiment_df.sort_values(grouping3)
    # Bug here: this assignment raises the TypeError from the traceback.
    df['firsts'] = (
        df.groupby(grouping2, group_keys=False).apply(is_unique_crash) &
        ~df.crash_key.isna())
    df['bugs_cumsum'] = df.groupby(grouping2)['firsts'].transform('cumsum')
    df['bugs_covered'] = (
        df.groupby(grouping3)['bugs_cumsum'].transform('max').astype(int))
    new_df = df.drop(columns=['bugs_cumsum', 'firsts'])
    return new_df

I will rerun my experiment with the modified function and send you the data later. If you want to run it yourself, my command is something like:

PYTHONPATH=. python3 experiment/run_experiment.py -b freetype2_ftfuzzer -c experiment-config.yaml -e <experiment name> -f aflplusplus

And my experiment-config.yaml:

# The number of trials of a fuzzer-benchmark pair.
trials: 1

# The amount of time in seconds that each trial is run for.
# 1 day = 24 * 60 * 60 = 86400
max_total_time: 7200

# The location of the docker registry.
# FIXME: Support custom docker registry.
# See https://github.com/google/fuzzbench/issues/777
docker_registry: gcr.io/fuzzbench

# The local experiment folder that will store most of the experiment data.
# Please use an absolute path.
experiment_filestore: /tmp/experiment-data

# The local report folder where HTML reports and summary data will be stored.
# Please use an absolute path.
report_filestore: /tmp/report-data

# Flag that indicates this is a local experiment.
local_experiment: true

Additional Information:

I am not sure if these would help.

FatPigeorz commented 1 year ago

same issue, have you fixed it?

steven-hh-ding commented 3 months ago

Ran into the same problem. If you run two fuzzers on a benchmark, it magically goes away, though.

steven-hh-ding commented 3 months ago

    df['firsts'] = (
        df.groupby(grouping2, group_keys=False).apply(is_unique_crash) &
        ~df.crash_key.isna())

The problem is that group_keys=False has no effect here when there is only one entry. In that case, df.groupby(grouping2, group_keys=False).apply(is_unique_crash) outputs:

Name: firsts, dtype: bool
c1 firsts                                0
fuzzer      benchmark   trial_id
aflplusplus libxml2_xml 58            True

The result includes the group keys, rather than the output the code expects:

0    True
Name: firsts, dtype: bool

Because the group keys are included, the result's index does not match the original index of df, which raises "incompatible index of inserted column with frame index".
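A small script along these lines might help check whether a given pandas install shows the same behaviour. It is only a sketch: the body of is_unique_crash is not shown in this thread, so the version here is a hypothetical stand-in, and 'libfuzzer' is just an example name for a second fuzzer.

```python
import warnings

import pandas as pd

KEYS = ['fuzzer', 'benchmark', 'trial_id']


def is_unique_crash(group):
    # Hypothetical stand-in: True for the first row of each crash_key in a group.
    return (~group['crash_key'].duplicated()).rename('firsts')


def apply_per_trial(df):
    with warnings.catch_warnings():
        # Some pandas releases warn that apply() operates on the grouping columns.
        warnings.simplefilter('ignore')
        return df.groupby(KEYS, group_keys=False).apply(is_unique_crash)


def make_df(rows):
    return pd.DataFrame(rows, columns=KEYS + ['time', 'crash_key'])


one_trial = make_df([('aflplusplus', 'libxml2_xml', 58, 900, None)])
two_trials = make_df([
    ('aflplusplus', 'libxml2_xml', 58, 900, None),
    ('libfuzzer', 'libxml2_xml', 59, 900, None),
])

for name, frame in [('one trial', one_trial), ('two trials', two_trials)]:
    result = apply_per_trial(frame)
    # The assignment to df['firsts'] needs a Series on the frame's own index.
    aligned = (isinstance(result, pd.Series) and
               result.index.equals(frame.index))
    print(name, type(result).__name__, 'aligned with df.index:', aligned)
```

If the single-trial case comes back as a DataFrame keyed by (fuzzer, benchmark, trial_id) while the two-trial case comes back as a Series on the original index, that would match what is described above and would explain why adding a second fuzzer hides the error.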

My temporary fix:


    df = experiment_df.sort_values(grouping3)
    c1 = df.groupby(grouping2, group_keys=False).apply(is_unique_crash)
    c2 = ~df.crash_key.isna()
    c1.index = c2.index
    df['firsts'] = c1 & c2
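
Another option might be to avoid groupby().apply() for this mask altogether. The sketch below is not the FuzzBench implementation: it assumes that is_unique_crash is meant to flag the first row on which a crash_key appears within a trial, and it swaps in DataFrame.duplicated for that purpose. The function name add_bugs_covered_column_sketch is made up for illustration.

```python
import pandas as pd

GROUPING2 = ['fuzzer', 'benchmark', 'trial_id']
GROUPING3 = GROUPING2 + ['time']


def add_bugs_covered_column_sketch(experiment_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant that builds the 'firsts' mask without apply()."""
    if 'crash_key' not in experiment_df:
        experiment_df['bugs_covered'] = 0
        return experiment_df
    df = experiment_df.sort_values(GROUPING3)
    # duplicated() returns a boolean Series on df's own index, so the
    # assignment aligns regardless of how many trials are present.
    # Assumption: a crash counts as "first" when its (trial, crash_key)
    # pair has not appeared on an earlier row.
    first_seen = ~df.duplicated(subset=GROUPING2 + ['crash_key'])
    # Cast to int so the cumulative sum below works on plain integers.
    df['firsts'] = (first_seen & df['crash_key'].notna()).astype(int)
    df['bugs_cumsum'] = df.groupby(GROUPING2)['firsts'].transform('cumsum')
    df['bugs_covered'] = (
        df.groupby(GROUPING3)['bugs_cumsum'].transform('max').astype(int))
    return df.drop(columns=['bugs_cumsum', 'firsts'])


# Made-up data: a single trial that hits crash 'A' twice.
sample = pd.DataFrame({
    'fuzzer': ['aflplusplus'] * 3,
    'benchmark': ['libxml2_xml'] * 3,
    'trial_id': [58] * 3,
    'time': [900, 1800, 2700],
    'crash_key': [None, 'A', 'A'],
})
print(add_bugs_covered_column_sketch(sample))
```

Unlike the index reassignment in the temporary fix, this does not depend on the shape that apply() returns, but it is only equivalent if the assumption about is_unique_crash holds.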