run_experiment takes longer in GCP when generating report

sykweon commented 10 months ago

Generating a report with run_experiment seems to take much longer on GCP. More than 5 hours have passed since I started the following command:

PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config experiment-config.yaml \
--benchmarks lcms_cms_transform_fuzzer \
--experiment-name $EXPERIMENT_NAME \
--fuzzers aflplusplus

The experiment has 2 trials of 30 minutes each, but it still hasn’t finished generating report data. The same command and configuration, run as a local experiment, has already terminated.

run_experiment is stuck in this loop: https://github.com/google/fuzzbench/blob/ba22647276a63912312fb6636287cc266da19683/experiment/dispatcher.py#L166.

The processes appear to be sleeping most of the time. htop shows that the measuring processes are being scheduled, but each one uses less than 1% of the CPU.
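One generic way to find out where a mostly idle Python process is blocked is the standard-library faulthandler module. The sketch below is not FuzzBench code: `enable_stack_dumps` is a hypothetical helper, and it assumes you can call it early in the process you want to inspect (Unix only, since it relies on SIGUSR1).

```python
import faulthandler
import signal
import sys


def enable_stack_dumps(log_file=sys.stderr):
    """Hypothetical helper: dump all thread tracebacks on SIGUSR1.

    After this call, `kill -USR1 <pid>` makes the process write the
    current stack of every thread to log_file without stopping it.
    log_file must stay open and must have a real file descriptor.
    """
    faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)
```

With this in place, sending SIGUSR1 to a process that shows almost no CPU usage in htop prints the line each thread is currently waiting on, which helps tell a sleep or poll loop apart from a blocked call.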

Is there a way to make report generation faster?

The run_experiment script also keeps printing these warnings, in case they are relevant:

/usr/local/lib/python3.10/site-packages/jinja2/runtime.py:298: FutureWarning: this method is deprecated in favour of `Styler.to_html()`
  return __obj(*args, **kwargs)
/work/src/analysis/plotting.py:159: FutureWarning: 

The `ci` parameter is deprecated. Use `errorbar=('ci', 95)` for the same effect.

  axes = sns.lineplot(
/work/src/analysis/plotting.py:159: FutureWarning: 

The `ci` parameter is deprecated. Use `errorbar=('ci', 95)` for the same effect.

  axes = sns.lineplot(
/usr/local/lib/python3.10/site-packages/seaborn/matrix.py:202: RuntimeWarning: All-NaN slice encountered
  vmin = np.nanmin(calc_data)
/usr/local/lib/python3.10/site-packages/seaborn/matrix.py:207: RuntimeWarning: All-NaN slice encountered
  vmax = np.nanmax(calc_data)
in the loop DEBUG
/usr/local/lib/python3.10/site-packages/jinja2/runtime.py:298: FutureWarning: this method is deprecated in favour of `Styler.to_html()`
  return __obj(*args, **kwargs)
/work/src/analysis/plotting.py:159: FutureWarning: 

The `ci` parameter is deprecated. Use `errorbar=('ci', 95)` for the same effect.

  axes = sns.lineplot(
/work/src/analysis/plotting.py:159: FutureWarning: 

The `ci` parameter is deprecated. Use `errorbar=('ci', 95)` for the same effect.

  axes = sns.lineplot(
/usr/local/lib/python3.10/site-packages/seaborn/matrix.py:202: RuntimeWarning: All-NaN slice encountered
  vmin = np.nanmin(calc_data)
/usr/local/lib/python3.10/site-packages/seaborn/matrix.py:207: RuntimeWarning: All-NaN slice encountered
  vmax = np.nanmax(calc_data)
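The two FutureWarning messages above are deprecation notices, so filtering them can make the remaining log output easier to read. This is a minimal sketch using only the standard-library warnings module; `quiet_known_deprecations` is a hypothetical helper, not part of FuzzBench, and the message patterns are copied from the log above. It does not touch the RuntimeWarning lines and does not change how long report generation takes.

```python
import warnings


def quiet_known_deprecations():
    """Hypothetical helper: hide the two FutureWarnings seen in the log.

    The message argument is a regular expression matched against the
    start of the warning text. The seaborn message starts with a
    newline, hence the leading \\s*.
    """
    warnings.filterwarnings(
        "ignore",
        category=FutureWarning,
        message=r"\s*The `ci` parameter is deprecated",
    )
    warnings.filterwarnings(
        "ignore",
        category=FutureWarning,
        message=r"this method is deprecated in favour of",
    )
```

Calling the helper once, before the plotting code runs, is enough. Any other FutureWarning is still shown.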

adamstorek commented 10 months ago

Facing the same issue here!

chinggg commented 10 months ago

I also found that FuzzBench does more work when running on GCP, which makes it less robust. Logs are not visible on screen, so I have to check the log reporting in the GCP console, and I still cannot find the reason why FuzzBench stops working on GCP.

(I could not run FuzzBench with a GCP project because of the error in #1911, so I am actually running a local experiment on GCP VM instances.)


run_experiment takes longer in GCP when generating report #1915