phi-go opened 1 year ago
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up to date status, view the checks section at the bottom of the pull request.
Hey @Alan32Liu, the current commit should hopefully pass the CI checks, though we are still encountering some errors in the integration tests with our setup; I believe this is just caused by differences in the environment. We have merged the fixes for https://github.com/google/fuzzbench/pull/1937, thank you for your help there. The current commit is still a work in progress regarding report generation, and currently only 10 mutant binaries are built to keep testing times low.
We added the more granular timestamps as well, see changes to the extract_corpus() and archive_corpus() functions.
If the CI checks pass, could you start a gcbrun? I think the following command should be correct:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus aflplusplus_407 --mutation-analysis
We also encountered two errors that we are quite confused by; maybe you have an idea?
One is part of the integration tests:
Here: experiment/test_runner.py::TestIntegrationRunner::test_integration_runner fails for the following assert:
assert len(os.listdir(output_corpus_dir)) > 5
The output_corpus_dir instead contains a corpus and crashes dir, where the corpus dir does contain more than 5 files. The fuzzer-log.txt contains the following warnings, could these explain the issue?:
[run_fuzzer] Running command: /home/pgoerz/fuzzbench/experiment/test_data/test_runner/MultipleConstraintsOnSmallInputTest -print_final_stats=1 -close_fd_mask=3 -fork=1 -ignore_ooms=1 -ignore_timeouts=1 -ignore_crashes=1 -entropic=1 -keep_seed=1 -cross_over_uniform_dist=1 -entropi>
WARNING: unrecognized flag '-fork=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-ignore_ooms=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-ignore_timeouts=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-ignore_crashes=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-entropic=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-keep_seed=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-cross_over_uniform_dist=1'; use -help=1 to list all flags
WARNING: unrecognized flag '-entropic_scale_per_exec_time=1'; use -help=1 to list all flags
INFO: Seed: 3421389961 INFO: Loaded 1 modules (1645 inline 8-bit counters): 1645 [0x48ef70, 0x48f5dd),
INFO: Loaded 1 PC tables (1645 PCs): 1645 [0x472f58,0x479628),
INFO: 0 files found in /tmp/pytest-of-pgoerz/pytest-51/test_integration_runner0/corpus/corpus
INFO: 0 files found in /tmp/pytest-of-pgoerz/pytest-51/test_integration_runner0/seeds
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus #2
INITED cov: 4 ft: 5 corp: 1/1b lim: 4 exec/s: 0 rss: 23Mb
NEW_FUNC[1/1]: ==458257==WARNING: invalid path to external symbolizer! ==458257==
WARNING: Failed to use and restart external symbolizer! 0x450a10 (/home/pgoerz/fuzzbench/experiment/test_data/test_runner/MultipleConstraintsOnSmallInputTest+0x450a10) #2062
NEW cov: 7 ft: 8 corp: 2/21b lim: 21 exec/s: 0 rss: 26Mb L: 20/20 MS: 5 InsertRepeatedBytes-CMP-InsertByte-InsertRepeatedBytes-InsertByte- DE: "\x01\x00\x00\x00\x00\x00\x00\x14"-
NEW_FUNC[1/4]: 0x450aa0 (/home/pgoerz/fuzzbench/experiment/test_data/test_runner/MultipleConstraintsOnSmallInputTest+0x450aa0)
NEW_FUNC[2/4]: 0x450b30 (/home/pgoerz/fuzzbench/experiment/test_data/test_runner/MultipleConstraintsOnSmallInputTest+0x450b30) #2240
NEW cov: 17 ft: 18 corp: 3/41b lim: 21 exec/s: 0 rss: 26Mb L: 20/20 MS: 3 CopyPart-CMP-CopyPart- DE: "\x01\x00"-
The other happens during report generation, which we tried to debug, but we are not even sure which of our changes could have caused it:
INFO:root:experiment_df:
git_hash experiment_filestore ... fuzzer_stats crash_key
0 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
1 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
2 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
[3 rows x 12 columns]
WARNING:root:Filtered out invalid benchmarks: set().
git_hash experiment_filestore ... fuzzer_stats crash_key
0 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
1 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
2 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None None
[3 rows x 12 columns]
Int64Index([0, 1, 2], dtype='int64')
git_hash experiment_filestore ... crash_key firsts
0 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None True
1 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None True
2 243d1022080f665c6755bc0aff12f8b4d43d2098 /tmp/experiment-data ... None True
[3 rows x 13 columns]
Int64Index([0, 1, 2], dtype='int64')
ERROR:root:Error generating HTML report. Extras:
traceback: Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11003, in _reindex_for_setitem
reindexed_value = value.reindex(index)._values
File "/usr/local/lib/python3.10/site-packages/pandas/util/_decorators.py", line 324, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4807, in reindex
return super().reindex(**kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/generic.py", line 4966, in reindex
return self._reindex_axes(
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4626, in _reindex_axes
frame = frame._reindex_index(
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 4642, in _reindex_index
new_index, indexer = self.index.reindex(
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 4237, in reindex
target = self._wrap_reindex_result(target, indexer, preserve_names)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 2520, in _wrap_reindex_result
target = MultiIndex.from_tuples(target)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 204, in new_meth
return meth(self_or_cls, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/pandas/core/indexes/multi.py", line 559, in from_tuples
arrays = list(lib.tuples_to_object_array(tuples).T)
File "pandas/_libs/lib.pyx", line 2930, in pandas._libs.lib.tuples_to_object_array
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/work/src/experiment/reporter.py", line 82, in output_report
generate_report.generate_report(
File "/work/src/analysis/generate_report.py", line 269, in generate_report
experiment_df = data_utils.add_bugs_covered_column(experiment_df)
File "/work/src/analysis/data_utils.py", line 170, in add_bugs_covered_column
df['firsts'] = (
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3645, in __setitem__
self._set_item_frame_value(key, value)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3787, in _set_item_frame_value
arraylike = _reindex_for_setitem(value, self.index)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11010, in _reindex_for_setitem
raise TypeError(
TypeError: incompatible index of inserted column with frame index
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus aflplusplus_407 --mutation-analysis
We also encountered two errors that we are quite confused by; maybe you have an idea?
I will list my initial thoughts below, but I did not have a chance to read your code thoroughly and may make mistakes. @jonathanmetzman please correct me if I am wrong.
One is part of the integration tests: Here: experiment/test_runner.py::TestIntegrationRunner::test_integration_runner fails for the following assert: assert len(os.listdir(output_corpus_dir)) > 5
The output_corpus_dir instead contains a corpus and crashes dir, where the corpus dir does contain more than 5 files. The fuzzer-log.txt contains the following warnings, could these explain the issue?:
It seems the corpus/ was not correctly defined by your fuzzer (BTW, would it be better to change the copyright info?).
Here is an example from libfuzzer: https://github.com/google/fuzzbench/blob/2bc06d4aca2c8dae20eeca47b95433ef543f99a8/fuzzers/libfuzzer/fuzzer.py#L49-L52
The other happens during report generation, which we tried to debug, but we are not even sure which of our changes could have caused it:
Traceback (most recent call last):
File "/work/src/experiment/reporter.py", line 82, in output_report
generate_report.generate_report(
File "/work/src/analysis/generate_report.py", line 269, in generate_report
experiment_df = data_utils.add_bugs_covered_column(experiment_df)
File "/work/src/analysis/data_utils.py", line 170, in add_bugs_covered_column
df['firsts'] = (
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3645, in __setitem__
self._set_item_frame_value(key, value)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 3787, in _set_item_frame_value
arraylike = _reindex_for_setitem(value, self.index)
File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 11010, in _reindex_for_setitem
raise TypeError(
TypeError: incompatible index of inserted column with frame index
The corresponding lines find the first bug covered by each fuzzer on each benchmark and add that as a new column to the original data frame: https://github.com/google/fuzzbench/blob/2bc06d4aca2c8dae20eeca47b95433ef543f99a8/analysis/data_utils.py#L162-L164
The error complains that the index of the new column is incompatible with the dataframe's index. I don't remember the exact details of these lines, but to further debug this, a simple way is to split this complex statement into smaller steps and print out the intermediate result of each step.
E.g., is this a column mismatch in the result you pasted above?
INFO:root:experiment_df: ... [3 rows x 12 columns] WARNING:root:Filtered out invalid benchmarks: set(). ... [3 rows x 12 columns] Int64Index([0, 1, 2], dtype='int64') ... [3 rows x 13 columns]
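To illustrate the kind of step-by-step check I mean (a toy sketch; the column names are made up and this is not the exact data_utils code), assigning a grouped result whose index no longer matches the frame's index produces exactly this TypeError, at least with the pandas 1.5.x shown in the traceback:
import pandas as pd

df = pd.DataFrame({
    'fuzzer': ['afl', 'afl', 'libfuzzer'],
    'crash_key': ['bug1', 'bug2', 'bug1'],
    'time': [100, 200, 150],
})

# Split the chained statement and inspect each intermediate result and index.
firsts = df.groupby(['fuzzer', 'crash_key'])['time'].min()
print(firsts.index)  # MultiIndex([('afl', 'bug1'), ...]) - not the frame's index
print(df.index)      # RangeIndex(start=0, stop=3, step=1)

# The indexes cannot be aligned, so the assignment fails with
# "TypeError: incompatible index of inserted column with frame index".
df['firsts'] = firsts
If the intermediate indexes diverge like this in your run, aligning them first (for example with .transform() or .reset_index()) would be the thing to look at.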
Also, I would fix the 1st error and re-run the exp to check if the 2nd one still exists, just in case it is caused by the 1st.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus aflplusplus_407 --mutation-analysis
This failed because there is no fuzzer named aflplusplus_407. Maybe it was not included in this PR?
So the presubmit error seems to be this:
"gsutil cp /tmp/tmplzfk1uuo gs://experiment-data/test-experiment/build-logs/benchmark-freetype2_ftfuzzer-mutation_analysis.txt" returned: 1. Extras:
output: ServiceException: 401 Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
gs://experiment-data should be the same bucket as in main, so I'm not sure what caused this ...
Regarding openh264, I have merged the main branch, but I see now that the fix is not in there; let me add the fix from the pull request.
This failed because there is no fuzzer named aflplusplus_407. Maybe it was not included in this PR?
I just copied the command assuming those fuzzers existed in this branch; it doesn't really matter which fuzzer the experiment is run with, we can also just use one.
Thank you for the comments on the errors, we won't have time this weekend but will look into this starting Monday.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus --mutation-analysis
So the presubmit error seems to be this:
"gsutil cp /tmp/tmplzfk1uuo gs://experiment-data/test-experiment/build-logs/benchmark-freetype2_ftfuzzer-mutation_analysis.txt" returned: 1. Extras: output: ServiceException: 401 Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
gs://experiment-data should be the same bucket as in main, so I'm not sure what caused this ...
This is because your statement requires gsutil actions, but CI does not authenticate and hence cannot perform that action.
Could that statement be replaced by mocks?
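For what it's worth, a minimal sketch of that idea (within the fuzzbench repo; the patch target common.filestore_utils.cp, and the assumption that the mua logging goes through it, are guesses rather than the actual code):
from unittest import mock

from experiment.build import build_utils  # module that adds the gsutil upload


# Patch the filestore helper so the test never shells out to gsutil and
# therefore needs no GCS credentials in CI.
@mock.patch('common.filestore_utils.cp')
def test_stores_mua_build_log_without_gsutil(mocked_cp):
    # Arguments follow the (output, benchmark, fuzzer, cycle) shape used
    # elsewhere in this PR.
    build_utils.store_mua_build_log('build output', 'freetype2_ftfuzzer',
                                    'aflplusplus', 0)
    # Only assert that an upload was requested, not that gsutil actually ran.
    mocked_cp.assert_called_once()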
Experiment 2023-12-23-mua-measurer data and results will be available later at:
The experiment data.
The experiment report.
This is because your statement requires gsutil actions, but CI does not authenticate and hence cannot perform that action.
Could that statement be replaced by mocks?
How does this test work for the coverage measurer? As far as I can see there is no fixture setting up the data for the coverage measurer. We just added the mua commands (which require gsutil) there so that this test prepares the environment for the mua measurer correctly, but it seems there is no similar setup needed for the coverage measurer. We can of course also try to use mocking instead.
How does this test work for the coverage measurer?
Don't know, I would have to read the code to learn : )
Oh, I tried but I'm missing some context that is not apparent to me, so I thought I would ask.
I have two more questions:
- (BTW, would it be better to change the copyright info?) Do you mean that we should put ourselves as the copyright holders?
- Is there a way to see the logs for the gcbrun?
Could you also please start another run:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-25-mua-measurer --fuzzers aflplusplus --mutation-analysis
And an update on the changes:
It seems the corpus/ was not correctly defined by your fuzzer (BTW, would it be better to change the copyright info?). Here is an example from libfuzzer:
This test seems to pass the CI, so it must just be a local environment problem. (We copied the coverage "fuzzer"; as we are adding a measurer, I would expect there to be no need to set a corpus dir, since it would not be used anyway.)
This is because your statement requires gsutil actions, but CI does not authenticate and hence cannot perform that action. Could that statement be replaced by mocks?
I have now changed the test to only test the coverage part. Mocking seems a bit pointless; if we want to do more than just check whether our two functions are called, then I expect the mocks to be quite brittle.
The fix for openh264 has now been cherry-picked into this branch.
The results of the mua evaluation are now also written to the experiment filestore.
do you mean that we should put ourselves as the copyright holders?
Yep, 1) because you wrote them, 2) It's 2023, not 2020 : )
- Is there a way to see the logs for the gcbrun?
There are two kinds of logs:
1. The logs written by your 'fuzzer'.
2. The logs written by gcloud.
For 1, they are normally in the experiment data link (e.g., an example from a different exp). If you could write to this log, that would be the most convenient solution. For this experiment, they are missing because of fuzzer build failures.
For 2, we don't have a good way to publish them yet, because the gcloud logs contain information from private experiments. If you are looking for something, I can search, copy, and paste it for you (not very scalable, unfortunately).
Could you also please start another run:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-25-mua-measurer --fuzzers aflplusplus --mutation-analysis
Sure, will do below.
The results of the mua evaluation are now also written to the experiment filestore.
Ah, nice. Thanks!
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-25-mua-measurer --fuzzers aflplusplus --mutation-analysis
presubmit failed (run make format).
--- /home/runner/work/fuzzbench/fuzzbench/experiment/measurer/run_mua.py (original)
+++ /home/runner/work/fuzzbench/fuzzbench/experiment/measurer/run_mua.py (reformatted)
@@ -20,7 +20,6 @@
from common import experiment_utils
from common import new_process
from experiment.build import build_utils
-
logger = logs.Logger()
@@ -86,5 +85,5 @@
logger.info(f'mua_build_ids command: {docker_exec_command}')
mua_build_res = new_process.execute(docker_exec_command)
logger.info(f'mua_build_ids result: {mua_build_res}')
- build_utils.store_mua_build_log(mua_build_res.output, benchmark,
- fuzzer, cycle)
+ build_utils.store_mua_build_log(mua_build_res.output, benchmark, fuzzer,
+ cycle)
Experiment 2023-12-25-mua-measurer data and results will be available later at:
The experiment data.
The experiment report.
do you mean that we should put ourselves as the copyright holders?
Yep, 1) because you wrote them, 2) It's 2023, not 2020 : )
Will update!
- Is there a way to see the logs for the gcbrun?
There are two kinds of logs:
1. The logs written by your 'fuzzer'.
2. The logs written by gcloud.
For the dataframe/report issue the log should be written by gcloud; I added handling to collect the error and write it to experiment-data as well, to make it easy to find.
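Roughly the idea, as a sketch (not the exact commit; using filestore_utils.cp as the upload helper is an assumption about the fuzzbench API):
import os
import tempfile
import traceback

from common import filestore_utils  # assumed helper wrapping gsutil cp


def generate_report_and_store_errors(generate_fn, experiment_filestore_path):
    """Run |generate_fn|; on failure, copy the traceback next to the data."""
    try:
        generate_fn()
    except Exception:  # pylint: disable=broad-except
        # Write the traceback to a temp file and upload it so it can be read
        # from the experiment filestore without gcloud log access.
        with tempfile.NamedTemporaryFile('w', suffix='.txt',
                                         delete=False) as tmp_file:
            tmp_file.write(traceback.format_exc())
        filestore_utils.cp(
            tmp_file.name,
            os.path.join(experiment_filestore_path, 'report-error.txt'))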
Could you also please start another run:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-25-mua-measurer --fuzzers aflplusplus --mutation-analysis
Sure, will do below.
Seems there was an issue:
Step #1: Already have image: gcr.io/fuzzbench/experiment-runner
Step #1: INFO:root:Command: ['-a', '--experiment-config', '/opt/fuzzbench/service/experiment-config.yaml', '--experiment-name', '2023-12-25-mua-measurer', '--fuzzers', 'aflplusplus', '--mutation-analysis'].
Step #1: fatal: not a git repository (or any of the parent directories): .git
Step #1: INFO:root:Failed to create instance. Command: ['gcloud', 'compute', 'instances', 'create', 'd-2023-12-25-mua-measurer', '--image-family=cos-stable', '--image-project=cos-cloud', '--zone=us-central1-c', '--scopes=cloud-platform', '--machine-type=n1-highmem-96', '--boot-disk-size=4TB', '--boot-disk-type=pd-ssd', '--metadata-from-file', 'startup-script=/workspace/tmpsis6no23'] failed. Output: ERROR: (gcloud.compute.instances.create) Could not fetch resource:
Step #1: - The resource 'projects/fuzzbench/zones/us-central1-c/instances/d-2023-12-25-mua-measurer' already exists
I updated the experiment name, so, I'm sorry to say, I'm again not quite sure what this error message means.
Also, sorry about the formatting issue, I thought I had checked it. Should be fine now.
Seems there was an issue: I updated the experiment name, so, I'm sorry to say, I'm again not quite sure what this error message means.
Short summary: the runner image failed to build with aflplusplus, so I will double-check with libFuzzer.
Long explanation:
Step #7 - "aflplusplus-bloaty_fuzz_target-intermediate-runner": executor failed running [/bin/sh -c apt install -y unzip git gdb joe]: exit code: 100
Finished Step #7 - "aflplusplus-bloaty_fuzz_target-intermediate-runner"
ERROR
ERROR: build step 7 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1
However, this seems to be a problem with afl++ instead of your code, because benchmark-curl_curl_fuzzer_http-mutation_analysis builds fine.
I will use libfuzzer to double-check.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-27-mua-measurer --fuzzers libfuzzer --mutation-analysis
Experiment 2023-12-27-mua-measurer data and results will be available later at:
The experiment data.
The experiment report.
Thank you for looking into this. Some benchmarks still fail to compile for mutation analysis; we'll work on fixing those issues. We know that, for example, libxml2_xml works; is it possible to have a test run with only one benchmark?
I guess it would be this command:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-mua-measurer-only-libxml2_xml --fuzzers libfuzzer --benchmarks libxml2_xml --mutation-analysis
Also would it be possible to have a run without mutation analysis? If that one succeeds we can have the participants rebase their PRs.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-no-mua-measurer --fuzzers libfuzzer
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-mua-libxml2-xml --fuzzers libfuzzer --benchmarks libxml2_xml --mutation-analysis
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-no-mua-measurer --fuzzers libfuzzer
Experiment 2023-12-28-mua-libxml2-xml data and results will be available later at:
The experiment data.
The experiment report.
Experiment 2023-12-28-no-mua-measurer data and results will be available later at:
The experiment data.
The experiment report.
I modified the experiment name a bit because experiment names must be within 30 chars and can only match `^[a-z0-9-]`.
Thank you! Ah, yes, I forgot there was the 30-char limit; thanks for fixing it.
Could it be that the 2023-12-28-mua-libxml2_xml experiment was not correctly started? It seems there is no data at all.
For the no-mua version, we still need to fix things so that we do not build the mua containers if the flag is not given.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-mua-libxml2-xml --fuzzers libfuzzer --benchmarks libxml2_xml --mutation-analysis
Hmm, strange. I will launch another one now and check for reasons later.
Could it be that the 2023-12-28-mua-libxml2_xml experiment was not correctly started? It seems there is no data at all.
Ah, it's actually the naming problem. Fixed above.
I noticed many errors printed by this line.
And the docker container was not found:
{
"insertId": "19sqza2g152wo3y",
"jsonPayload": {
"experiment": "2023-12-28-mua-libxml2-xml",
"instance_name": "d-2023-12-28-mua-libxml2-xml",
"subcomponent": "measurer",
"component": "dispatcher",
"message": "Executed command: \"docker start mutation_analysis_libxml2_xml_container\" returned: 1.",
"output": "Error response from daemon: No such container: mutation_analysis_libxml2_xml_container\nError: failed to start containers: mutation_analysis_libxml2_xml_container\n"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "8347721258806393535",
"project_id": "fuzzbench",
"zone": "projects/1097086166031/zones/us-central1-c"
}
},
"timestamp": "2023-12-27T23:24:25.149369689Z",
"severity": "ERROR",
"logName": "projects/fuzzbench/logs/fuzzbench",
"receiveTimestamp": "2023-12-27T23:24:25.149369689Z"
}
For the no-mua version, we still need to fix things so that we do not build the mua containers if the flag is not given.
Ah I see.
Alright, we finally got things working using our cloud account. I think we got through most problems; currently we are running into some memory limitations with the VMs we chose. So I think we can try both experiments again.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-mua-only-libxml2-2 --fuzzers libfuzzer --benchmarks libxml2_xml --mutation-analysis
Also would it be possible to have a run without mutation analysis? If that one succeeds we can have the participants rebase their PRs.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-28-no-mua-measurer-2 --fuzzers libfuzzer
Also, the dispatcher startup template does not escape the postgres password, which is quite confusing when the password has special characters. I fixed this in a commit; it may be interesting for you as well.
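The fix boils down to quoting the value before it is substituted into the shell startup script; a minimal sketch of the idea (variable and function names are made up, not the actual template code):
import shlex


def postgres_password_export_line(password: str) -> str:
    """Return a shell-safe line for the dispatcher startup-script template."""
    # shlex.quote single-quotes the value and escapes embedded quotes, so
    # characters like $, !, ; or spaces reach postgres unchanged.
    return f'export POSTGRES_PASSWORD={shlex.quote(password)}'


print(postgres_password_export_line('p@ss!w$rd with spaces'))
# export POSTGRES_PASSWORD='p@ss!w$rd with spaces'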
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-29-mua-only-libxml2 --fuzzers libfuzzer --benchmarks libxml2_xml --mutation-analysis
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-29-no-mua-measurer --fuzzers libfuzzer
Experiment 2023-12-29-mua-only-libxml2 data and results will be available later at:
The experiment data.
The experiment report.
Experiment 2023-12-29-no-mua-measurer data and results will be available later at:
The experiment data.
The experiment report.
Thank you for starting the runs again. The libxml2 run did not work for mua because of permission errors in the mapped directory; this is now fixed. I can't tell from the experiment data what went wrong with the no-mua-measurer run, so I'll start another run on my side.
One big remaining question: I tried to test whether the dispatcher vm is preemptible by stopping and restarting the vm. From what I read, this should simulate the preemption process. However, the dispatcher does not correctly resume; it fails with the error that the experiment already exists in the database. Did I test it incorrectly, or does the dispatcher vm not need to be preemptible?
Also, I think that the reporting does not produce a report for single-fuzzer runs, as the pandas dataframe bug (the second bug mentioned here) seems to be caused by evaluating only one fuzzer.
The current status is that things should now generally work with mutation testing, at least they do with our cloud config. However, only 8/23 default benchmarks work with mutation testing; this will require some more work. Also, the mua results are not yet integrated into the report. If we can confirm that things now work, we can look towards doing some real evaluations.
Could you start another run? I'm fairly confident that this one should finally work.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-29-mua-libxml2-two-f --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-30-mua-libxml2-two-f --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
Experiment 2023-12-30-mua-libxml2-two-f data and results will be available later at:
The experiment data.
The experiment report.
One big remaining question: I tried to test whether the dispatcher vm is preemptible by stopping and restarting the vm. From what I read, this should simulate the preemption process. However, the dispatcher does not correctly resume; it fails with the error that the experiment already exists in the database. Did I test it incorrectly, or does the dispatcher vm not need to be preemptible?
The dispatcher is non-preemptible by default:
It is created with create_instance(), which sets preemptible: bool = False:
https://github.com/google/fuzzbench/blob/2bc06d4aca2c8dae20eeca47b95433ef543f99a8/common/gcloud.py#L47-L52
The variable preemptible was not specified when creating the dispatcher. (If you'd like to change that, here is the spot):
https://github.com/google/fuzzbench/blob/2bc06d4aca2c8dae20eeca47b95433ef543f99a8/experiment/run_experiment.py#L558-L561
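In other words, a toy illustration of the default-argument behaviour (not the real create_instance, which lives in common/gcloud.py):
def create_instance(instance_name: str, preemptible: bool = False) -> bool:
    """Toy stand-in: callers that omit |preemptible| get False."""
    print(f'{instance_name}: preemptible={preemptible}')
    return True


create_instance('d-2023-12-30-mua-libxml2-two-f')      # dispatcher: False
create_instance('r-2023-12-30-mua-libxml2-two-f-2708836',
                preemptible=True)                      # e.g. a preemptible runner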
I see, so as preemptible is not provided in the call, I guess you are running the dispatcher as non-preemptible? For now I would be fine with supporting whatever you are currently doing, if that is good for you.
Also, thank you for starting 2023-12-30-mua-libxml2-two-f; for some reason the mua part does not work. The build logs do not indicate an error. Could you do a search for error logs in that run?
I see, so as preemptible is not provided in the call, I guess you are running the dispatcher as non-preemptible?
Yep, that's a feature of Python. In the function definition, preemptible: bool = False means the default is False when not specified.
Also, thank you for starting 2023-12-30-mua-libxml2-two-f; for some reason the mua part does not work. The build logs do not indicate an error. Could you do a search for error logs in that run?
I saw many warning messages like the following. It's my first time seeing this message; did something consume all the space?
{
"insertId": "1de7f84fio04oh",
"jsonPayload": {
"message": "Failed to run \"google_optimize_local_ssd\" script: exec: \"google_optimize_local_ssd\": executable file not found in $PATH",
"localTimestamp": "2023-12-29T21:59:18.6696Z",
"omitempty": null
},
"resource": {
"type": "gce_instance",
"labels": {
"project_id": "fuzzbench",
"zone": "us-central1-c",
"instance_id": "1597173073505878006"
}
},
"timestamp": "2023-12-29T21:59:18.669691061Z",
"severity": "WARNING",
"labels": {
"instance_name": "d-2023-12-30-mua-libxml2-two-f"
},
"logName": "projects/fuzzbench/logs/GCEGuestAgent",
"sourceLocation": {
"file": "instance_setup.go",
"line": "117",
"function": "main.agentInit"
},
"receiveTimestamp": "2023-12-29T21:59:20.180046940Z"
}
No error was found:
BTW, there are a lot of these strange warnings, not sure if they are related/helpful to your case:
It's strange because all messages are clearly from a different exp (e.g., r-2023-12-05-aflpp-2684587 must be an AFL++ experiment from 2023-12-05).
I see, so as preemptible is not provided in the call, I guess you are running the dispatcher as non-preemptible?
Yep, that's a feature of Python. In the function definition, preemptible: bool = False means the default is False when not specified.
Thank you for the answers and your patience. Sorry for being unclear; I just wanted to make sure that we do not need to handle preemption, as our code is also part of the dispatcher vm.
Also, thank you for starting 2023-12-30-mua-libxml2-two-f; for some reason the mua part does not work. The build logs do not indicate an error. Could you do a search for error logs in that run?
I saw many warning messages like the following. It's my first time seeing this message; did something consume all the space?
"message": "Failed to run \"google_optimize_local_ssd\" script: exec: \"google_optimize_local_ssd\": executable file not found in $PATH",
I see that one as well, but it happens as part of the startup script, before anything "real" happens on the vm. Does it happen more often for you? Memory could be an issue, but I would have hoped for an error log at least.
No error was found:
Well, that makes things complicated ... Can you search for "mua container", "mua_build_ids", and "mua_run_mutants" as well? Those are the interactions I would most likely expect an error with.
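In case it is useful, a sketch of how that search could be done programmatically (the filter fields mirror the log entries pasted above; access to the fuzzbench gcloud project and the exact field names are assumptions):
from google.cloud import logging as gcloud_logging

client = gcloud_logging.Client(project='fuzzbench')
# Match the experiment's log entries whose message mentions the mua steps.
log_filter = (
    'logName="projects/fuzzbench/logs/fuzzbench" '
    'AND jsonPayload.experiment="2023-12-30-mua-libxml2-two-f" '
    'AND (jsonPayload.message:"mua container" '
    'OR jsonPayload.message:"mua_build_ids" '
    'OR jsonPayload.message:"mua_run_mutants")'
)
for entry in client.list_entries(filter_=log_filter, max_results=50):
    print(entry.severity, entry.payload)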
It's strange because all messages are clearly from a different exp (e.g., r-2023-12-05-aflpp-2684587 must be an AFL++ experiment from 2023-12-05).
Yeah, definitely strange. I have no idea how we could have influenced this, and these messages did not happen during our test runs. Does this only happen for experiments from this PR branch, or for main as well?
I have pushed another fix that should not be related, but maybe we will get a more insightful log if we run another experiment:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-30-mua-xml2-2f-bug --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
I have an idea what the problem could be. I noticed that when something in runner.py crashes (not the fuzzer), that run behaves similarly to what we can see in our problem run. Can you take a look at the runner logs?
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-01-02-mua-xml2-2f-bug --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
Experiment 2023-01-02-mua-xml2-2f-bug data and results will be available later at:
The experiment data.
The experiment report.
I have an idea what the problem could be. I noticed that when something in runner.py crashes (not the fuzzer), that run behaves similarly to what we can see in our problem run. Can you take a look at the runner logs?
Sure, I saw many repeated errors like the following.
1. Preempted but is not running
Different from the strange ones above, this is r-2023-12-30-mua-libxml2-two-f-2708836, a runner from the experiment we were running (2023-12-30). There are similar messages for other runner instances, too.
However, somehow the instance name is d-2023-12-29-no-mua-measurer. How can this report anything about the experiment on 2023-12-30?
{
"insertId": "xaw0alfsugckp",
"jsonPayload": {
"instance_name": "d-2023-12-29-no-mua-measurer",
"message": "Instance: r-2023-12-30-mua-libxml2-two-f-2708839 is preempted but is not running.",
"component": "dispatcher",
"experiment": "2023-12-29-no-mua-measurer"
},
"resource": {
"type": "gce_instance",
"labels": {
"project_id": "fuzzbench",
"zone": "projects/1097086166031/zones/us-central1-c",
"instance_id": "8644718517907825376"
}
},
"timestamp": "2023-12-30T13:23:05.886594464Z",
"severity": "WARNING",
"logName": "projects/fuzzbench/logs/fuzzbench",
"receiveTimestamp": "2023-12-30T13:23:05.886594464Z"
}
2. Failed to run google_optimize_local_ssd, google_set_multiqueue
Like the ones in earlier comments. I put their log entries together because they are similar.
{
"insertId": "14simwaf6cn7mw",
"jsonPayload": {
"localTimestamp": "2023-12-30T10:04:24.6398Z",
"omitempty": null,
"message": "Failed to run \"google_optimize_local_ssd\" script: exec: \"google_optimize_local_ssd\": executable file not found in $PATH"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "7700821883207439863",
"zone": "us-central1-c",
"project_id": "fuzzbench"
}
},
"timestamp": "2023-12-30T10:04:24.639981903Z",
"severity": "WARNING",
"labels": {
"instance_name": "r-2023-12-30-mua-libxml2-two-f-2709035"
},
"logName": "projects/fuzzbench/logs/GCEGuestAgent",
"sourceLocation": {
"file": "instance_setup.go",
"line": "117",
"function": "main.agentInit"
},
"receiveTimestamp": "2023-12-30T10:04:28.090087564Z"
}
{
"insertId": "14simwaf6cn7mx",
"jsonPayload": {
"localTimestamp": "2023-12-30T10:04:24.6400Z",
"omitempty": null,
"message": "Failed to run \"google_set_multiqueue\" script: exec: \"google_set_multiqueue\": executable file not found in $PATH"
},
"resource": {
"type": "gce_instance",
"labels": {
"zone": "us-central1-c",
"project_id": "fuzzbench",
"instance_id": "7700821883207439863"
}
},
"timestamp": "2023-12-30T10:04:24.640440231Z",
"severity": "WARNING",
"labels": {
"instance_name": "r-2023-12-30-mua-libxml2-two-f-2709035"
},
"logName": "projects/fuzzbench/logs/GCEGuestAgent",
"sourceLocation": {
"file": "instance_setup.go",
"line": "117",
"function": "main.agentInit"
},
"receiveTimestamp": "2023-12-30T10:04:28.090087564Z"
}
3. No snapshot data
A lot of these, not sure if they are expected though.
{
"insertId": "289kvffbspttu",
"jsonPayload": {
"experiment": "2023-12-30-mua-libxml2-two-f",
"component": "dispatcher",
"message": "No snapshot data.",
"instance_name": "d-2023-12-30-mua-libxml2-two-f"
},
"resource": {
"type": "gce_instance",
"labels": {
"zone": "projects/1097086166031/zones/us-central1-c",
"project_id": "fuzzbench",
"instance_id": "1597173073505878006"
}
},
"timestamp": "2023-12-30T09:32:24.138849121Z",
"severity": "WARNING",
"logName": "projects/fuzzbench/logs/fuzzbench",
"receiveTimestamp": "2023-12-30T09:32:24.138849121Z"
}
There are also many logging errors. I don't think I have seen them before either, but I feel they are probably from gcloud, not your code. Pasting a sample here for completeness.
{
"insertId": "qx14yheiqysz",
"jsonPayload": {
"component": "dispatcher",
"message": "Retrying on common.logs.log failed with 400 Log entry with size 2.28M exceeds maximum size of 256.0K. Raise.",
"subcomponent": "measurer",
"traceback": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/site-packages/google/api_core/grpc_helpers.py\", line 65, in error_remapped_callable\n return callable_(*args, **kwargs)\n File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 1030, in __call__\n return _end_unary_response_blocking(state, call, False, None)\n File \"/usr/local/lib/python3.10/site-packages/grpc/_channel.py\", line 910, in _end_unary_response_blocking\n raise _InactiveRpcError(state) # pytype: disable=not-instantiable\ngrpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INVALID_ARGUMENT\n\tdetails = \"Log entry with size 2.28M exceeds maximum size of 256.0K\"\n\tdebug_error_string = \"UNKNOWN:Error received from peer ipv4:216.239.38.174:443 {created_time:\"2024-01-01T19:15:59.905002192+00:00\", grpc_status:3, grpc_message:\"Log entry with size 2.28M exceeds maximum size of 256.0K\"}\"\n>\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/work/src/common/retry.py\", line 81, in _wrapper\n result = func(*args, **kwargs)\n File \"/work/src/common/logs.py\", line 187, in log\n logger.log_struct(struct_message, severity=severity)\n File \"/usr/local/lib/python3.10/site-packages/google/cloud/logging_v2/logger.py\", line 209, in log_struct\n self._do_log(client, StructEntry, info, **kw)\n File \"/usr/local/lib/python3.10/site-packages/google/cloud/logging_v2/logger.py\", line 158, in _do_log\n client.logging_api.write_entries([api_repr])\n File \"/usr/local/lib/python3.10/site-packages/google/cloud/logging_v2/_gapic.py\", line 160, in write_entries\n self._gapic_api.write_log_entries(request=request)\n File \"/usr/local/lib/python3.10/site-packages/google/cloud/logging_v2/services/logging_service_v2/client.py\", line 706, in write_log_entries\n response = rpc(\n File \"/usr/local/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py\", line 113, in __call__\n return wrapped_func(*args, **kwargs)\n File \"/usr/local/lib/python3.10/site-packages/google/api_core/retry.py\", line 349, in retry_wrapped_func\n return retry_target(\n File \"/usr/local/lib/python3.10/site-packages/google/api_core/retry.py\", line 191, in retry_target\n return target()\n File \"/usr/local/lib/python3.10/site-packages/google/api_core/grpc_helpers.py\", line 67, in error_remapped_callable\n raise exceptions.from_grpc_error(exc) from exc\ngoogle.api_core.exceptions.InvalidArgument: 400 Log entry with size 2.28M exceeds maximum size of 256.0K\n",
"experiment": "2023-12-30-mua-libxml2-two-f",
"instance_name": "d-2023-12-30-mua-libxml2-two-f"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "1597173073505878006",
"zone": "projects/1097086166031/zones/us-central1-c",
"project_id": "fuzzbench"
}
},
"timestamp": "2024-01-01T19:16:00.047192290Z",
"severity": "ERROR",
"logName": "projects/fuzzbench/logs/fuzzbench",
"receiveTimestamp": "2024-01-01T19:16:00.047192290Z"
}
We can use the result/log of 2023-01-02-mua-xml2-2f-bug to check if some of the errors above are flaky / from gcloud.
BTW, if you like, we can schedule a time to look at the logs together.
We recently published a paper describing an approach to use mutation analysis for fuzzer benchmarking. This pull request aims to support this approach (Phase I only) for FuzzBench.
There are a few mismatches between FuzzBench and our framework that will need some solutions; while I have some ideas, I would be interested in your input as well!
My understanding of measurers for FuzzBench is as follows: At the start of a run all docker images are built. Specifically, for the coverage measurer the generated.mk contains targets of the form build-coverage-{benchmark} which build the images containing the executables used for coverage collection. These make commands are started in the builder.py, and after the image is built the coverage executables are extracted and stored on the host. The coverage executables are then used by the code in the measure_manager.py to measure unmeasured_snapshots, which I would expect to be newly created fuzzer queue entries. Coverage measurement happens on the host in a temporary directory as far as I can tell, with a timeout of 900 seconds for a snapshot.
Open points: