AFLplusplus / Grammar-Mutator

A grammar-based custom mutator for AFL++
Apache License 2.0

Progress Report #1

Status: Closed (vanhauser-thc closed this issue 4 years ago)

vanhauser-thc commented 4 years ago

Hi Shengtuo,

Please put your weekly progress reports in this issue. Also, I have not seen any progress in the last 5 days. Do you need assistance?

h1994st commented 4 years ago

Hi @andreafioraldi ,

Definition: the size of a tree is defined as the total number of non-terminal nodes in a tree
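As a rough illustration of this definition, the following C sketch counts non-terminal nodes recursively. The `node_t` layout and the `tree_size` name are assumptions made for this example; they are not the project's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the project's real struct. */
typedef struct node {
  int           is_terminal;  /* 1 for a terminal (leaf token), 0 otherwise */
  size_t        num_children;
  struct node **children;
} node_t;

/* Size of a tree = total number of non-terminal nodes it contains. */
size_t tree_size(const node_t *n) {
  if (n == NULL || n->is_terminal) return 0;
  assert(n->num_children == 0 || n->children != NULL);

  size_t size = 1; /* count this non-terminal node */
  for (size_t i = 0; i < n->num_children; ++i)
    size += tree_size(n->children[i]);
  return size;
}
```

Under this definition, a root non-terminal with one non-terminal child and any number of terminal leaves has size 2, since terminals never contribute.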

Why does the generator prefer to generate <esc> tokens? Is it a bug in f1?

In 100 seeds, grepping for \t matches 50 lines, while grepping for W matches just 2. It seems reasonable for nonterminals to have higher priority (or maybe not), but not to this degree: terminals that are not in the last levels of the grammar are almost never generated.

Actually, this may not be a bug.

The primary goal of the generation algorithm is to maximize the total number of nodes in the generated tree (i.e., to generate a more complex tree). Therefore, as long as the budget (i.e., max_len) allows, non-terminal nodes are always preferred so that the generated tree can be much larger.

The terminal characters in the <character> token are much cheaper than the <esc> token, so picking <esc> during generation helps expand the size of the generated tree.
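The preference described above can be sketched roughly as follows. This is only an illustration of "pick a non-terminal expansion whenever the budget allows", not the generator's actual code: `rule_t`, `choose_rule`, and the `pick` tie-breaker are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one expansion rule (illustration only). */
typedef struct {
  const char *name;
  size_t      min_len;         /* fewest bytes the rule can expand to      */
  int         has_nonterminal; /* 1 if expanding it adds non-terminal nodes */
} rule_t;

/* Return the index of the rule to expand under the remaining budget.
 * Any affordable rule containing a non-terminal wins over terminal-only
 * rules. `pick` selects among equally preferred candidates (a caller would
 * pass a random number). Returns -1 if no rule fits the budget. */
int choose_rule(const rule_t *rules, size_t n, size_t budget, size_t pick) {
  assert(rules != NULL || n == 0);

  size_t preferred = 0, affordable = 0;
  for (size_t i = 0; i < n; ++i) {
    if (rules[i].min_len > budget) continue;
    ++affordable;
    if (rules[i].has_nonterminal) ++preferred;
  }
  if (affordable == 0) return -1;

  int    want_nonterminal = preferred > 0;
  size_t pool = want_nonterminal ? preferred : affordable;
  size_t target = pick % pool;

  for (size_t i = 0; i < n; ++i) {
    if (rules[i].min_len > budget) continue;
    if (want_nonterminal && !rules[i].has_nonterminal) continue;
    if (target-- == 0) return (int)i;
  }
  return -1; /* not reached: the pool is non-empty */
}
```

With a plain-character rule (1 byte, terminal only) and an escape rule (2 bytes, contains a non-terminal), this policy picks the escape rule whenever at least 2 bytes of budget remain, which would match the skew toward escape sequences seen in the generated seeds.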

I looked at some JSON files created with ./grammar_generator and noticed that the generator prefers to go deep into the grammar. Most of the generated files contain a lot of whitespace and escape characters such as \t, \f, etc.

You raise a good point.

I guess the large amount of whitespace you observed comes from the input grammar: the <ws> token consumes most of the generation budget (max_len). Since whitespace does not matter in JSON, it would be more reasonable to remove <ws> from the JSON grammar file or give <ws> a smaller weight.

The grammar file has a large effect on the output. I just committed a new grammar file (JSON w/o whitespace).

I haven't tested coverage with this grammar file yet, but the generated seeds look more reasonable and complex:

// big number
-16611663214665861469943525663718284866987616455132979276615466158.4556936679312359171979365294822269164346235545398829929145267919326342152294182569791319511366976838312297194825366971894734956576484375759474326193824282497245683278644285427662442484411595437344872428167131873396736265245245467265274575332298557739376138998211676419324146154224417661733432619388123539639230416E-4627676246482697232919216986327172259868961448913446846654212599214739261974265967561666115331851314264222417076957526116559697311349655332391345518675297343714151836725511696958183116944472864776225452716159481379616469975248169262722884969784792894158396436627337535963189422094587460

// nested object
[[{"4":true,"":{},"":false},-772186789964342995409687.562986534682831146212478511377453853351582677677331765317128386725494690E56373771940000,"\"\r\b1\t\b","s",{},null,false,[null],[[{},"",[],false,true],true,true,true],null,[[false],{}],true,{"":{}},{"\t\\\n\"\\\f\f\f\\\t\r\b\\\f\\\r\t\rO":{"\t\n\f\r.#":"_","":{}},"":null},null,30.2300,305625200E+70,null,{},{},{"N":{"\nx":{},"":false,"":[],"":true},"":""},{"":true},{},true,[null,true,true],{},"f",false,[-700.4e+09,"]"],{},{},false,null,[],[],null,{},{},true,{},"\t\n","\t\r\"\n\f\"\\ ",{},false,[[],[]],[],0,null,{"\\\"\fkk":{"$":false,"":{},"":true},"":null,"":{},"":true,"":{},"k":false,"Cs":"|",":":{},"":[]},[],"",[],[],-00,"",false,true,"",false,-650.6047E6,[],[],{},"5",{},{},[],[false],null,true],"\"\f\n\n\\\\\f\b\t\r\b\t\t\b\r\r\b\r\t\f\t\f\"\b\f\r\t\t\\\r\"\b\t\f\n\t\\\b\n\b\b\r\f\n\nI\t\t\n\t\\\r\b\n\"\r\b\t\"\n\"\f\r\"\f\b\b\"\t\f\"\f\rO\\","\b4",[],-850.2708216220100E-0,false,{},"\f\\\b\t\t\r\\\f\rms",false,false,null]

// strange string
"\"\f\n\f\f\t\r\b\r\\\\\f\\\n\"\n\\\f\r\f\b\"\r\\\\\f\\\n\n\r\n\"\t\b\"\"\r\r\"\r\n\\\\\\\t\b\\\f\r\t\b\n\"\t\r\f\f\f\b\t\f\"\"\n\f\\\b\"\\\n\"\\\r\"\b\\\\\f\r\r\t\"\n\f\"\r\n\r\\\f\r\n\t\n\\\t\r\n\r\n\\\f\b\\\n\n\"\"\\\\\f\t\f\f\n\n\"\"\t\t\n\n\r\f\b\n\n\f\r\t\b\n\f\n\b\b\f\n\f\b\n\t\b\n\b\"\b\r\f\r\t\b\t\"\\\b\t\t\n\"\f\b\t\\\\\t\n\b\b\t\n\r\b\"\r\r\r\t\f\b\f\f\t\"\t\"\b\f\n\\\t\\\f\r\"\"\b\b\\\n\n\"\r\t\f\b\b\t\f\b\t\r\f\f\f\"\f\n\n\t\b\f\f\f\n\\\"\r\f\"\\\t\r\r\"\t\r\f\\\"\"\"\f\f\n\\\n\f\\\"\b\\\r\n\\\t\n\r\b\\\\\r\r\r\"\b\r\n\t\b\f\r\f\t\"\\\\\f\t\r\n\n\f\b\n\b\f\n\b\f\\\f\"\b\r\"\t\n\t\\\n\f\"\r\b\\\n\r\\\t\n\f\r\b\\\f\f\r\f\"\r\r\\\f\b\n\b\t\f\n\r\\\t\r\b\\\b\"\\\b\\\\\\\f\r\n\r\b\\\b\t\r\n\"\b\\\f\\\r\b\b\"\n\n\t\t\t\f\n\t\r\b\t\n\\\f\r\t\\\f\"\f\t\r\"\r\"\b\b\r\"\n\n\n\r\"\r\f\n\\\r\t\t\n\b\\\\\"\"\n\"\f\b\r\b\\\f\"\b\"\f\n\b\b\f\b\"\b\r\t\b\"\b\t\\\\\f\t\"\\\r\t\\\"\r\r\r\t\b\n\b\b\r\"\b\n\n\n\t\b\\\"\f\b\"\\\b\n\r\r\n\f\f\\\"\r\n\f\b\b\f\f\b\"\b\\\f\r\n\t\b\t\b\"I\f5\b\t"

h1994st commented 4 years ago

Just a quick update on the coverage.

The results of the grammar mutator still do not seem good enough.

I still need to dig deeper into the mutation strategies. Tomorrow, I will work on implementing rules_mutation (in tree_mutation.c) and integrating afl_custom_fuzz_count to improve the mutation part.
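For context, afl_custom_fuzz_count is the AFL++ custom mutator hook that tells the fuzzer how many times to call afl_custom_fuzz for the current queue entry. A minimal sketch is shown here, with the signature written as I understand it from the AFL++ custom mutator documentation. The `my_mutator_t` state, the constants, and the "scale with tree size" policy are assumptions for illustration, not what the project implements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mutator state; the real project keeps its own struct. */
typedef struct {
  size_t cur_tree_size; /* non-terminal nodes in the current parsed tree */
} my_mutator_t;

/* Illustrative tuning constants (assumptions, not project values). */
enum { MUTATIONS_PER_NODE = 4, MAX_FUZZ_COUNT = 1024 };

/* Called by AFL++ once per queue entry; the return value is the number of
 * afl_custom_fuzz calls to perform for that entry. */
unsigned int afl_custom_fuzz_count(void *data, const unsigned char *buf,
                                   size_t buf_size) {
  (void)buf; /* this sketch only looks at the cached tree size */
  my_mutator_t *m = (my_mutator_t *)data;
  assert(m != NULL);

  if (buf_size == 0) return 0; /* nothing to mutate */

  /* Assumed policy: larger trees get more attempts, up to a fixed cap. */
  size_t count = m->cur_tree_size * MUTATIONS_PER_NODE;
  if (count > MAX_FUZZ_COUNT) count = MAX_FUZZ_COUNT;
  return (unsigned int)count;
}
```

Tying the count to the tree size is one possible design choice: it spends more executions on inputs that offer more mutation sites, while the cap keeps a single large seed from monopolizing a cycle.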

w/ grammar mutator

start_time        : 1598421935
last_update       : 1598422024
run_time          : 88
fuzzer_pid        : 17172
cycles_done       : 3
cycles_wo_finds   : 0
execs_done        : 300019
execs_per_sec     : 3406.29
execs_ps_last_min : 3222.35
paths_total       : 273
paths_favored     : 23
paths_found       : 173
paths_imported    : 0
max_depth         : 5
cur_path          : 201
pending_favs      : 0
pending_total     : 187
variable_paths    : 0
stability         : 100.00%
bitmap_cvg        : 50.66%
unique_crashes    : 0
unique_hangs      : 0
last_path         : 1598422016
last_crash        : 0
last_hang         : 0
execs_since_crash : 300019
exec_timeout      : 20
slowest_exec_ms   : 0
peak_rss_mb       : 0
cpu_affinity      : 0
edges_found       : 154
var_byte_count    : 0
havoc_expansion   : 0
afl_banner        : test_json
afl_version       : ++2.67d
target_mode       : shmem_testcase default
command_line      : afl-fuzz -E 300000 -s 0 -i in2 -o out2 -- ../../../targets/json-parser/test_json @@

w/o grammar mutator

start_time        : 1598422034
last_update       : 1598422075
run_time          : 40
fuzzer_pid        : 6168
cycles_done       : 0
cycles_wo_finds   : 0
execs_done        : 300005
execs_per_sec     : 7318.44
execs_ps_last_min : 0.00
paths_total       : 275
paths_favored     : 64
paths_found       : 175
paths_imported    : 0
max_depth         : 4
cur_path          : 252
pending_favs      : 11
pending_total     : 215
variable_paths    : 0
stability         : 100.00%
bitmap_cvg        : 46.38%
unique_crashes    : 0
unique_hangs      : 0
last_path         : 1598422075
last_crash        : 0
last_hang         : 0
execs_since_crash : 300005
exec_timeout      : 20
slowest_exec_ms   : 0
peak_rss_mb       : 0
cpu_affinity      : 0
edges_found       : 204
var_byte_count    : 0
havoc_expansion   : 0
afl_banner        : test_json
afl_version       : ++2.67d
target_mode       : shmem_testcase default
command_line      : afl-fuzz -E 300000 -s 0 -i in2 -o afl-out -- ../../../targets/json-parser/test_json @@

andreafioraldi commented 4 years ago

50.66% with the grammar mutator vs. 46.38% without seems good to me. You are starting normal AFL with a good corpus, and JSON is not super complex.

andreafioraldi commented 4 years ago

Run benchmarks on mruby, which is considerably more complex.

h1994st commented 4 years ago

Hi @vanhauser-thc and @andreafioraldi ,

I just added documentation for the project and integrated afl_custom_fuzz_count. The project is ready for review now. Please let me know if you have any suggestions!

I will try to address any comments you have during the rest of this week.

Besides, is it possible for us to make the repository public? So that I can submit the link to this repo for the GSoC final evaluation.

andreafioraldi commented 4 years ago

Besides, is it possible for us to make the repository public? So that I can submit the link to this repo for the GSoC final evaluation.

Yes ofc

andreafioraldi commented 4 years ago

For the blob mutations, borrow this function https://github.com/AFLplusplus/AFLplusplus/blob/stable/examples/custom_mutators/custom_mutator_helpers.h#L24

It does not change the size of the mutated input part.
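To illustrate what "does not change the size" means in practice, here is a minimal sketch of an in-place mutation restricted to a byte range. It is not the helper from custom_mutator_helpers.h. The function name `mutate_range_in_place` and the three mutation choices are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mutate one byte inside buf[begin, end) in place. The buffer length never
 * changes, and bytes outside the range are never touched. */
void mutate_range_in_place(uint8_t *buf, size_t begin, size_t end) {
  assert(buf != NULL && begin <= end);
  if (begin == end) return; /* empty range: nothing to do */

  size_t pos = begin + (size_t)rand() % (end - begin);

  switch (rand() % 3) {
    case 0: /* flip a single random bit */
      buf[pos] ^= (uint8_t)(1u << (rand() % 8));
      break;
    case 1: /* overwrite with a random byte */
      buf[pos] = (uint8_t)(rand() & 0xff);
      break;
    default: /* small arithmetic change (wraps around at 255) */
      buf[pos] = (uint8_t)(buf[pos] + 1);
      break;
  }
}
```

Because only byte values inside the selected range change, a blob embedded in a larger structured input can be mutated without shifting the offsets of anything around it.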

andreafioraldi commented 4 years ago

The repo is now public

h1994st commented 4 years ago

For the blob mutations, borrow this function https://github.com/AFLplusplus/AFLplusplus/blob/stable/examples/custom_mutators/custom_mutator_helpers.h#L24

It does not change the size of the mutated input part.

I will try it later. Thanks!

h1994st commented 4 years ago

This issue can be closed now that GSoC has ended.