marcinguy closed this issue 2 years ago
The same harness works very well with 8 fuzzer clients on an 8-core server with 32 GB of RAM (with both the default LLMP maps and small maps).
Memory usage (in MiB):
              total        used        free      shared  buff/cache   available
Mem:          31799       10787        2520        1462       18491       19084
Swap:          1023         838         185
I too have encountered memory consumption issues when running a large number of fuzz nodes. I'd be grateful for (and happy to review and test) any PRs.
FYI: after 24 hours, 50 fuzzers were still running on the 80 GB server. After the next 24 hours, only 25 were left.
Will monitor it. But I think/hope it will remain like this.
Crashed fuzzers got this message:
thread '<unnamed>' panicked at 'Fuzzer-respawner: Storing state in crashed fuzzer instance did not work, no point to spawn the next client! (Child exited with: 9)', /home/user/LibAFL-latest/libafl/src/events/llmp.rs:864:21
Yes, likely OOM-related. This happens, for example, if the child gets killed by the OS with a signal that cannot be caught.
Yes, this is all OOM related.
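To make the "killed without a catchable signal" point concrete, here is a minimal sketch that decodes the `9` from the panic message. It assumes (this is not confirmed in the thread) that the value is a raw POSIX wait status; `describe_wait_status` is a hypothetical helper name, and the snippet is meant for Linux.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Describe a raw POSIX wait status (assumption: that is what the panic prints)."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal; look up its symbolic name.
        sig = signal.Signals(os.WTERMSIG(status))
        return f"killed by {sig.name}"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return "unknown status"


# 9 is the value from "Child exited with: 9" in the panic message.
print(describe_wait_status(9))
```

Under that assumption the child was terminated by SIGKILL, which a process cannot catch or handle, and which is the signal the Linux OOM killer sends. That would explain why the crashed instance had no chance to store its state.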
Some notes.
You can use cgroups to limit memory usage
recidivm could be useful to estimate a program's virtual memory usage
32-bit binaries hit bugs when used with the fuzzer (with a big map you can only attach about a dozen clients before it errors out; adjusting the map size helps to attach more)
32-bit binaries are much bigger than 64-bit ones
On both VMs and bare metal, 32-bit binaries are way slower (why?)
ps_mem is a nice tool to examine memory usage
After a 12-hour run:
398.8 MiB + 125.5 KiB = 398.9 MiB fuzzer_libxml2_noasan_broker [217779]
404.9 MiB + 295.5 KiB = 405.2 MiB fuzzer_libxml2_noasan_broker [215550]
401.4 MiB + 4.0 MiB = 405.4 MiB fuzzer_libxml2_noasan_broker [205428]
405.1 MiB + 453.5 KiB = 405.5 MiB fuzzer_libxml2_noasan_broker [216800]
407.5 MiB + 307.5 KiB = 407.8 MiB fuzzer_libxml2_noasan_broker [216175]
404.4 MiB + 3.9 MiB = 408.3 MiB fuzzer_libxml2_noasan_broker [204423]
407.3 MiB + 1.5 MiB = 408.9 MiB fuzzer_libxml2_noasan_broker [210015]
404.6 MiB + 4.3 MiB = 408.9 MiB fuzzer_libxml2_noasan_broker [203458]
408.8 MiB + 274.5 KiB = 409.1 MiB fuzzer_libxml2_noasan_broker [217292]
408.7 MiB + 738.5 KiB = 409.5 MiB fuzzer_libxml2_noasan_broker [213211]
408.9 MiB + 2.4 MiB = 411.3 MiB fuzzer_libxml2_noasan_broker [208499]
409.8 MiB + 1.5 MiB = 411.4 MiB fuzzer_libxml2_noasan_broker [209846]
411.4 MiB + 393.5 KiB = 411.7 MiB fuzzer_libxml2_noasan_broker [215359]
409.6 MiB + 3.0 MiB = 412.5 MiB fuzzer_libxml2_noasan_broker [211446]
412.1 MiB + 866.5 KiB = 412.9 MiB fuzzer_libxml2_noasan_broker [212659]
414.2 MiB + 213.5 KiB = 414.5 MiB fuzzer_libxml2_noasan_broker [217487]
415.5 MiB + 146.5 KiB = 415.7 MiB fuzzer_libxml2_noasan_broker [216870]
421.9 MiB + 128.5 KiB = 422.1 MiB fuzzer_libxml2_noasan_broker [217636]
422.0 MiB + 557.5 KiB = 422.6 MiB fuzzer_libxml2_noasan_broker [213713]
421.4 MiB + 2.3 MiB = 423.7 MiB fuzzer_libxml2_noasan_broker [207253]
415.5 MiB + 8.5 MiB = 424.0 MiB fuzzer_libxml2_noasan_broker [193133]
420.8 MiB + 5.1 MiB = 425.9 MiB fuzzer_libxml2_noasan_broker [199426]
424.9 MiB + 2.1 MiB = 427.0 MiB fuzzer_libxml2_noasan_broker [209765]
416.6 MiB + 10.6 MiB = 427.2 MiB fuzzer_libxml2_noasan_broker [191686]
423.6 MiB + 11.3 MiB = 434.9 MiB fuzzer_libxml2_noasan_broker [187469]
429.2 MiB + 9.1 MiB = 438.3 MiB fuzzer_libxml2_noasan_broker [191656]
427.9 MiB + 12.4 MiB = 440.3 MiB fuzzer_libxml2_noasan_broker [187356]
428.7 MiB + 13.1 MiB = 441.8 MiB fuzzer_libxml2_noasan_broker [186939]
444.2 MiB + 334.5 KiB = 444.6 MiB fuzzer_libxml2_noasan_broker [214894]
445.3 MiB + 809.5 KiB = 446.1 MiB fuzzer_libxml2_noasan_broker [213000]
453.0 MiB + 7.5 MiB = 460.5 MiB fuzzer_libxml2_noasan_broker [195286]
460.8 MiB + 249.5 KiB = 461.0 MiB fuzzer_libxml2_noasan_broker [217528]
462.5 MiB + 3.4 MiB = 466.0 MiB fuzzer_libxml2_noasan_broker [203741]
463.3 MiB + 4.0 MiB = 467.4 MiB fuzzer_libxml2_noasan_broker [203214]
428.5 MiB + 39.6 MiB = 468.1 MiB fuzzer_libxml2_noasan_broker [172821]
469.3 MiB + 8.2 MiB = 477.5 MiB fuzzer_libxml2_noasan_broker [192937]
491.8 MiB + 3.7 MiB = 495.4 MiB fuzzer_libxml2_noasan_broker [202383]
501.2 MiB + 192.5 KiB = 501.4 MiB fuzzer_libxml2_noasan_broker [217653]
436.1 MiB + 80.0 MiB = 516.1 MiB fuzzer_libxml2_noasan_broker [158913]
549.0 MiB + 176.5 KiB = 549.1 MiB fuzzer_libxml2_noasan_broker [217244]
592.1 MiB + 2.1 MiB = 594.2 MiB fuzzer_libxml2_noasan_broker [204922]
592.8 MiB + 2.8 MiB = 595.6 MiB fuzzer_libxml2_noasan_broker [201315]
599.1 MiB + 7.7 MiB = 606.8 MiB fuzzer_libxml2_noasan_broker [187959]
608.0 MiB + 131.5 KiB = 608.1 MiB fuzzer_libxml2_noasan_broker [216712]
613.5 MiB + 2.0 MiB = 615.4 MiB fuzzer_libxml2_noasan_broker [203154]
622.5 MiB + 7.5 MiB = 630.0 MiB fuzzer_libxml2_noasan_broker [186527]
673.3 MiB + 1.2 MiB = 674.5 MiB fuzzer_libxml2_noasan_broker [204792]
674.0 MiB + 1.2 MiB = 675.1 MiB fuzzer_libxml2_noasan_broker [205772]
676.1 MiB + 1.9 MiB = 678.0 MiB fuzzer_libxml2_noasan_broker [199879]
1.1 GiB + 24.5 MiB = 1.1 GiB fuzzer_libxml2_asan_nobroker_big [158235]
594.1 MiB + 649.5 MiB = 1.2 GiB fuzzer_libxml2_noasan_broker [153242]
13.0 GiB + 817.5 MiB = 13.8 GiB fuzzer_libxml2_noasan_broker [153055]
---------------------------------
38.9 GiB
Right after start, each non-ASan fuzzer was at ca. 1 GB. Now (after 12 hours) you can see that one is at 13 GB and the others are below 1 GB. Any explanation for this, @domenukk @andreafioraldi?
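For anyone who wants to spot such an outlier automatically, here is a minimal sketch that parses ps_mem-style lines like the ones above and flags processes far above the median. The line format is taken from the listing in this thread; `parse_ps_mem`, `outliers`, and the factor of 5 are hypothetical choices, not part of ps_mem.

```python
import re

# Conversion factors to MiB for the units that appear in the listing.
UNITS = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# private + shared = total  program [pid]
LINE = re.compile(
    r"^\s*([\d.]+)\s+(KiB|MiB|GiB)\s+\+\s+([\d.]+)\s+(KiB|MiB|GiB)"
    r"\s+=\s+([\d.]+)\s+(KiB|MiB|GiB)\s+(\S+)\s+\[(\d+)\]"
)


def to_mib(value: str, unit: str) -> float:
    return float(value) * UNITS[unit]


def parse_ps_mem(text: str) -> list:
    """Parse per-process lines; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue
        rows.append({
            "private_mib": to_mib(m[1], m[2]),
            "shared_mib": to_mib(m[3], m[4]),
            "total_mib": to_mib(m[5], m[6]),
            "name": m[7],
            "pid": int(m[8]),
        })
    return rows


def outliers(rows: list, factor: float = 5.0) -> list:
    """Return rows whose total exceeds `factor` times the median total."""
    totals = sorted(r["total_mib"] for r in rows)
    median = totals[len(totals) // 2]
    return [r for r in rows if r["total_mib"] > factor * median]


# A few lines copied from the listing above.
SAMPLE = """\
398.8 MiB + 125.5 KiB = 398.9 MiB fuzzer_libxml2_noasan_broker [217779]
676.1 MiB + 1.9 MiB = 678.0 MiB fuzzer_libxml2_noasan_broker [199879]
1.1 GiB + 24.5 MiB = 1.1 GiB fuzzer_libxml2_asan_nobroker_big [158235]
594.1 MiB + 649.5 MiB = 1.2 GiB fuzzer_libxml2_noasan_broker [153242]
13.0 GiB + 817.5 MiB = 13.8 GiB fuzzer_libxml2_noasan_broker [153055]
"""

for row in outliers(parse_ps_mem(SAMPLE)):
    print(row["pid"], round(row["total_mib"] / 1024, 1), "GiB")
```

On this sample only PID 153055 is flagged. Running something like this periodically would show whether the growth stays confined to a single process.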
The same harness after more than 48 hours on the 8-core box with 32 GB of RAM:
616.4 MiB + 619.4 MiB = 1.2 GiB fuzzer_libxml2_asan [1141122]
1.4 GiB + 1.4 GiB = 2.8 GiB fuzzer_libxml2_asan [1697422]
1.4 GiB + 1.4 GiB = 2.8 GiB fuzzer_libxml2_asan [1601844]
1.6 GiB + 1.6 GiB = 3.2 GiB fuzzer_libxml2_asan [1694428]
1.7 GiB + 1.7 GiB = 3.4 GiB fuzzer_libxml2_asan [1690697]
1.7 GiB + 1.7 GiB = 3.4 GiB fuzzer_libxml2_asan [1690863]
1.7 GiB + 1.7 GiB = 3.5 GiB fuzzer_libxml2_asan [1686704]
1.8 GiB + 1.8 GiB = 3.6 GiB fuzzer_libxml2_asan [1691961]
1.9 GiB + 1.9 GiB = 3.8 GiB fuzzer_libxml2_asan [1683781]
BTW, ASan can slow down the binary by ca. 70% and increase memory usage by 2x to 3x. In my case the speed is similar (weird?) and the memory usage increased.
Maybe the above helps somebody.
I'm also seeing increasing memory usage on both asan and noasan over time. I think it has something to do with corpus size but I've not yet managed to put my finger on the issue.
One more observation that could be relevant, on the 80 GB box:
Running 50 fuzzers with a 64 MB map runs smoothly (all cores green, no syscalls or kernel threads, almost from the start).
Running 50 fuzzers with a 256 MB map does not (70% red and 30% green in the fuzzers, right after start and later on), and there are a lot of kernel threads.
This is a big box, so maybe it is related to the CPU/core count.
@marcinguy, can you share your libxml2 fuzzer? Attach a zip here or drop a mail to andreafioraldi@gmail.com. I'm going to debug this issue.
Whatever this is, it is increasing. @s1341
After 64 hours, the shared memory total is at ca. 38 GB:
ipcs -m|awk '{ print $5}'|awk '{a+=$0}END{print a}'
38654705664
The fuzzer (I assume it is the broker, since it has the lowest PID) is at 42 GB of RAM usage:
21.0 GiB + 21.0 GiB = 42.0 GiB fuzzer_libxml2_noasan_broker
FYI: after 10 hours with 50 non-ASan fuzzers and 1 ASan fuzzer, it grew by ca. 1 GB (shared maps/pages).
Hmm, I wonder why this happens only on the one fuzzer. I thought it was the broker (lowest PID), but then I also thought the broker just brokers and does not map shared maps/pages. Or is this not the broker?
The other processes stay at ca. 1 GB, and this one (the broker?) is at 43 GB:
21.5 GiB + 21.6 GiB = 43.1 GiB fuzzer_libxml2_noasan_broker
ipcs -m|awk '{ print $5}'|awk '{a+=$0}END{print a}'
39661338624
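The two byte counts above are easier to compare after a unit conversion. The sketch below redoes the sum from the `ipcs -m | awk` pipeline in Python and converts the two totals reported in this thread. `sum_shm_bytes` is a hypothetical helper, and the small `ipcs`-style sample is made up purely to exercise it.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def sum_shm_bytes(ipcs_output: str) -> int:
    """Sum the 5th column ("bytes") of `ipcs -m` output, skipping header lines."""
    total = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[4].isdigit():
            total += int(fields[4])
    return total


# Made-up sample in the usual Linux `ipcs -m` layout (not from the thread).
SAMPLE = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 32768      user       600        268435456  2          dest
0x00000000 32769      user       600        268435456  2          dest
"""
print(sum_shm_bytes(SAMPLE))  # 536870912

# The two totals reported in this thread.
first_total = 38654705664
second_total = 39661338624
print(first_total / GIB)                   # 36.0
print(second_total / GIB)                  # 36.9375
print((second_total - first_total) / MIB)  # 960.0
```

So the first total is exactly 36 GiB (about 38.65 GB in decimal units), and the growth between the two measurements is 960 MiB, which matches the "ca. 1 GB" estimate. Note that ps_mem prints binary units (GiB), so its figures are best compared against the GiB values.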
FYI: in all the tests before, I was using clang-9 (my bad).
With a new harness/different target and clang-11, it looks way better (I haven't observed the issue yet).
no-ASan vs ASan (1:3 memory usage ratio)
So I suggest using the LibAFL Docker image (it comes with clang-11).
Hey @marcinguy, can you try with your old setup again? First replace InMemoryCorpus with CachedOnDiskCorpus::new(..., 64) in your fuzzer.
@andreafioraldi your fixes resolved my issue. I think it would be nice, however, to have an InMemoryCorpus with a maximum size, so that it becomes a FIFO when maximum is reached (or something similar).
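To illustrate the bounded-corpus idea, here is a minimal, language-neutral sketch in Python. It is not LibAFL code and does not mirror any LibAFL API: `BoundedFifoCorpus` and its methods are hypothetical names, and a real implementation would live in Rust behind LibAFL's corpus abstraction.

```python
from collections import OrderedDict
from typing import Optional


class BoundedFifoCorpus:
    """Hypothetical in-memory corpus that evicts the oldest entry once full."""

    def __init__(self, max_entries: int) -> None:
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self.max_entries = max_entries
        self._entries = OrderedDict()  # id -> testcase bytes, oldest first
        self._next_id = 0

    def add(self, testcase: bytes) -> int:
        """Store a testcase and return its id; drop the oldest one if over the limit."""
        idx = self._next_id
        self._next_id += 1
        self._entries[idx] = testcase
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # FIFO eviction
        return idx

    def get(self, idx: int) -> Optional[bytes]:
        """Return the testcase, or None if it was evicted or never existed."""
        return self._entries.get(idx)

    def __len__(self) -> int:
        return len(self._entries)


corpus = BoundedFifoCorpus(max_entries=2)
for data in (b"a", b"b", b"c"):
    corpus.add(data)
print(len(corpus), corpus.get(0), corpus.get(2))
```

The trade-off is that evicted testcases are gone for good, so memory stays bounded at the cost of losing older inputs. A disk-backed cache avoids that loss by keeping only a limited number of entries in memory.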
@s1341 @andreafioraldi Great that it worked. I will check it at the next opportunity. Also, try to use a recent clang/LLVM (currently 11); I noticed some issues with clang-9/llvm-9. With clang-11/llvm-11 it works well.
Feel free to close this issue.
Is your feature request related to a problem? Please describe.
On a server with several dozen cores, I have noticed that running more than 50 fuzzers in my case (server with 80 GB of RAM, 64 cores, 256 threads) causes either an OOM (the broker gets killed along with some fuzzer clients) or a broker crash (too many messages to process?).
The broker crashes here when I reduce the map size to small maps, seemingly due to too many messages:
https://github.com/AFLplusplus/LibAFL/blob/939784d5121abc57650ce8eb094c399dc551912e/libafl/src/bolts/llmp.rs#L2263
With more than 50 fuzzers, the broker gets killed after a while, along with other clients (OOM).
With 50 fuzzers, there seem to be a dozen gigabytes of free RAM.
That setup seems to run stably.
With 100, 150, 200, or 255 fuzzers, it OOMs or the broker crashes sooner or later (minutes to hours).
Describe the solution you'd like
Make clients send fewer messages, so that the broker does not crash when using small maps instead of the default 256 MB maps
Somehow make the fuzzer client use less memory (I guess this is already achievable via small maps, i.e. by configuring the map size)
Describe alternatives you've considered
I tried small maps, which increased the number of messages.
Additional context
I can try to optimize the map size/message count by adjusting the map size from 256 MB to 128 MB, 64 MB, etc. Maybe this would allow doubling, quadrupling, etc. the number of fuzzer clients.
Also, I tried setting this to 1; it didn't help:
https://github.com/AFLplusplus/LibAFL/blob/5a722994acb69d0f03bce101b29ea14681d1d8b3/libafl/src/bolts/llmp.rs#L102
I also set this to 1 ms; it didn't help:
https://github.com/AFLplusplus/LibAFL/blob/d8ef1dd90abcbadd3d63c17be9ae669eead9241f/libafl/src/events/llmp.rs#L149
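The idea above of halving the map size to fit more clients can be written down as back-of-envelope arithmetic. This sketch assumes, purely for illustration, that each client costs exactly one fully resident map and nothing else. `max_clients` is a hypothetical helper, and the 12800 MiB budget is simply 50 clients times 256 MiB, not a measured value.

```python
def max_clients(budget_mib: int, map_mib: int) -> int:
    """How many clients fit if each one needs exactly one map (simplifying assumption)."""
    if map_mib <= 0:
        raise ValueError("map_mib must be positive")
    return budget_mib // map_mib


budget_mib = 50 * 256  # memory taken by 50 clients at the default 256 MiB map size
for map_mib in (256, 128, 64):
    print(f"{map_mib} MiB maps -> {max_clients(budget_mib, map_mib)} clients")
```

This covers the map part of the memory budget only. It ignores the higher message volume reported above for small maps, as well as any per-client memory that grows over time.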