fuzzing: __AFL_LOOP constant is not mysterious

cjb commented 4 months ago

In this part of the handbook: https://github.com/trailofbits/testing-handbook/blob/4945bfa0273a3cce16a25434105cc3495542d64f/content/docs/fuzzing/c-cpp/11-aflpp/index.md?plain=1#L462

Wrap the target code being executed inside a while loop. The used number of iterations should be reasonably large. The values 1k > and 10k are most often used. There is no official guidance on how this constant affects fuzzing performance.
while (__AFL_LOOP(1000)) {
   size_t len = strlen(input_buf);
   check_buf(input_buf, len);
}

The official guidance is here, https://github.com/AFLplusplus/AFLplusplus/blob/775861ea94d00672c9e868db329073afd699b994/instrumentation/README.persistent_mode.md?plain=1#L151

The numerical value specified within the loop controls the maximum number of iterations before AFL++ will restart the process from scratch. This minimizes the impact of memory leaks and similar glitches; 1000 is a good starting point, and going much higher increases the likelihood of hiccups without giving you any real performance benefits.

So, the effect on fuzzing performance is that you incur one process restart (which previously happened after every iteration, before you turned on persistent mode) every N execs.

maxammann commented 4 months ago

Yeah I agree we should be more precise on this one.

The reason why I wrote it kind of vague is that I don't have numbers on it, and I feel it does not matter much for most people getting started. Stability likely won't be problematic between 1k and even 100k, but that depends ofc a lot on the target. I'm kind of surprised that they did not see performance benefit above 1k iterations.

Maybe a section about "fuzzing stability" would make sense that discusses the factors behind this "iteration count". In general stability of fuzzing (no memory leak, no global state) is also important for libFuzzer, LibAFL and other fuzzers, but it is potentially poorly addressed there because it is mostly in-memory fuzzing.

maxammann commented 4 months ago

Apparently larger is better :D https://github.com/rust-fuzz/afl.rs/issues/433

That was also kind of my feeling. Only choose smaller values if your target is unstable.

cjb commented 4 months ago

Thanks.

Apparently larger is better :D https://github.com/rust-fuzz/afl.rs/issues/433

I think this is for fuzzing Rust code though -- afl.rs, not libafl. The "correct" value is a heuristic around the likelihood of memory leaks inside the target's loop iteration that the average fuzzing user will not want to fix themselves. The memory leaks are less likely in Rust than in C, which is why afl.rs's constant can be higher than aflplusplus's.

I'm kind of surprised that they did not see performance benefit above 1k iterations.

The performance benefit is entirely predictable for a given target, right? If you went from 100 execs/sec to 1000 execs/sec through turning on persistent mode with no constant, then you were previously spending 90% of your time (9ms per restart 100 execs) in process restart. If you use _AFL_LOOP(1000), you will now have to spend 0.9% of your time (9ms 1 exec) in process restart. If you use _AFL_LOOP(10000), it'll be 0.09%. I suppose that when they say they didn't see performance benefit, they just mean that you're already only sacrificing around 1% perf (depending on how fast each iteration is) to process restarts at _AFL_LOOP(1000), so the perf gain you'll see from increasing the number higher than 1000 could never be more than 1%.

maxammann commented 2 months ago

Addressed in this commit: https://github.com/trailofbits/testing-handbook/pull/38/commits/0e3836c337d8e835a41df7dc11945909b2a615f5

trailofbits / testing-handbook

fuzzing: __AFL_LOOP constant is not mysterious #18