Angora is a mutation-based fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution.
tremendous queue files when fuzzing pdftotext

zjuchenyuan commented 5 years ago

In short, when fuzzing pdftotext, Angora generates a tremendous number of queue files, which makes analyzing queue coverage infeasible.

Here is the detailed experiment setup:

pdftotext is compiled from xpdf-4.00. I run 30 independent instances in parallel, each bound to a specific CPU using cpuset and limited to 2GB of memory, and intend to fuzz for 24 hours. The Angora image is based on your Dockerfile, but I changed the config file Angora/common/src/ to set MAX_INPUT_LEN to 1MB (so I call it angora_no15k).

ST=0; for i in `seq 1 1 30`; do name=no15kangora5_$i; docker run -d -v /d:/d --cpus 1 -m 2g --memory-swap 3g --privileged --name $name --cpuset-cpus `echo $ST + $i|bc` --env ANGORA_DISABLE_CPU_BINDING=1 angora_no15k timeout -k 5 86400 /angora/angora_fuzzer --input /d/seed/pdf --output /d/output/$name -M 2048 -t /d/p/angora/new/pdftotext/taint/pdftotext -- /d/p/angora/new/pdftotext/fast/pdftotext @@ ; done

However, most of the instances exited before 24 hours because they hit the memory limit (they were killed). The running times are given below; only 4 of 30 finished successfully.

            echo $(echo $(date --date=`d inspect $1 -f '{{.State.FinishedAt}}'` +%s) - $(date --date=`d inspect $1 -f '{{.State.StartedAt}}'` +%s)|bc)
for i in {1..30}; do echo \|$i\|`dt no15kangora5_$i`\|`ls no15kangora5_$i/queue | wc -l`\|; done
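| instance | running time (seconds) | queue files |
|---|---|---|
| 1 | 31101 | 21499 |
| 2 | 31978 | 16306 |
| 3 | 29649 | 19431 |
| 4 | 38743 | 24900 |
| 5 | 26747 | 17349 |
| 6 | 5768 | 3510 |
| 7 | 14953 | 12600 |
| 8 | 1649 | 260 |
| 9 | 28694 | 18133 |
| 10 | 26808 | 18135 |
| 11 | 1954 | 551 |
| 12 | 1869 | 383 |
| 13 | 45833 | 20944 |
| 14 | 86400 | 27122 :point_left: |
| 15 | 2910 | 993 |
| 16 | 929 | 116 |
| 17 | 69041 | 40026 |
| 18 | 86400 | 36907 :point_left: |
| 19 | 46064 | 24716 |
| 20 | 41368 | 18632 |
| 21 | 28431 | 17855 |
| 22 | 39230 | 22507 |
| 23 | 86401 | 34340 :point_left: |
| 24 | 86400 | 33947 :point_left: |
| 25 | 32395 | 19019 |
| 26 | 34445 | 22120 |
| 27 | 2619 | 607 |
| 28 | 1652 | 305 |
| 29 | 9987 | 9931 |
| 30 | 787 | 162 |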
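
For a quick summary of these numbers, a small helper along the following lines could count how many instances reached the 24-hour limit and how many queue files exist in total. This is only a sketch: `summarize_runs` is a hypothetical name, it assumes rows in the `|instance|seconds|queue files|` form printed by the loop above, and it treats any run of at least 86400 seconds (the `timeout` value used above) as finished.

```shell
# Hypothetical helper: summarize rows of the form |instance|seconds|queue_files|
# read from stdin. Rows whose second column is not a number (headers,
# separators) are skipped.
summarize_runs() {
    awk -F'|' -v limit=86400 '
        NF >= 4 && $3 ~ /^ *[0-9]+ *$/ {
            total++
            if ($3 + 0 >= limit) finished++
            queue += $4 + 0   # "+ 0" keeps only the leading number
        }
        END {
            printf "finished=%d total=%d queue_files=%d\n", finished, total, queue
        }'
}

# Usage: pipe the output of the loop above into the helper, e.g.
#   for i in {1..30}; do ...; done | summarize_runs
```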

As you can see, there are too many queue files, which makes it hard to analyze coverage using afl-cov.

Besides, there is a "density is too large" warning:

 WARN  angora::stats::chart       > Density is too large (> 10%). Please increase `MAP_SIZE_POW2` in `llvm_mode/config.h` and `MAP_LENGTH` in `common/src/`. Or disable function-call context by compiling with `ANGORA_DISABLE_CONTEXT=1` or `ANGORA_DIRECT_FN_CONTEXT=1` environment variable.

Do I need to follow this warning to change the code? If I do, does this change impact fuzzing performance when fuzzing other programs?

Any suggestions? How do I correctly use Angora to fuzz pdftotext? (I know I need to rerun this fuzzing experiment without the memory limit.)

spinpx commented 5 years ago

> only 4 of 30 finished successfully.

Why did they fail?
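
One way to check, assuming the stopped containers have not been removed yet, is to ask Docker whether each one was killed by the OOM killer. The sketch below uses the `.State.OOMKilled` and `.State.ExitCode` fields of `docker inspect`; `report_exit` is a hypothetical helper name, not something shipped with Angora.

```shell
# Hypothetical helper: for each container name given as an argument, print
# "<name> oom_killed=<true|false> exit_code=<n>".
report_exit() {
    for name in "$@"; do
        printf '%s ' "$name"
        docker inspect -f 'oom_killed={{.State.OOMKilled}} exit_code={{.State.ExitCode}}' "$name"
    done
}

# Usage (container names taken from the experiment above):
#   report_exit no15kangora5_1 no15kangora5_2
```

An exit code of 137 usually means the process received SIGKILL (128 + 9), which is consistent with hitting the memory limit, while GNU `timeout` itself returns 124 when the time limit expires.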

> Do I need to follow this warning to change the code? If I do, does this change impact fuzzing performance when fuzzing other programs?

My suggestion: if the density is > 50%, try disabling context. If the density is between 10% and 50%, try increasing MAP_SIZE.
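
As a rough illustration, the rule of thumb can be written as a small shell function. This is a sketch, not part of Angora: `density_advice` is a hypothetical name, the density is assumed to be an integer percentage read off the status screen, and the handling of exactly 10% and 50% is an assumption, since only the ranges are given.

```shell
# Hypothetical helper: map an integer density percentage to a suggestion.
#   > 50%     -> disable function-call context
#   10%..50%  -> increase the map size
#   < 10%     -> nothing to do (the warning only fires above 10%)
density_advice() {
    density=$1
    if [ "$density" -gt 50 ]; then
        echo "disable-context"
    elif [ "$density" -ge 10 ]; then
        echo "increase-map-size"
    else
        echo "no-change"
    fi
}

# Usage:
#   density_advice 62
```

According to the warning text, "disable-context" corresponds to compiling with `ANGORA_DISABLE_CONTEXT=1`, and "increase-map-size" to raising `MAP_SIZE_POW2` and `MAP_LENGTH`.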

spinpx commented 5 years ago

Considering the tremendous number of queue files, you had better disable context.