Closed: vanhauser-thc closed this issue 4 years ago
Hi Marc,
During the past week, I mainly focused on the serialization/deserialization part of the grammar mutator, because the grammar mutations require tree representations rather than the serialized string format.
I did have some questions about the fuzzing queue and the grammar mutation. In AFL++, the queue entry in the fuzzing queue only stores the file path of the test case. For each fuzzing round (i.e., `fuzz_one` / `fuzz_one_original`), AFL++ reads the test case from the file. If we want to integrate the grammar mutation, we need to keep the tree representation for each test case. Currently, I have two solutions in mind:

1. Deserialize/parse the test case at the beginning of every custom mutation (`afl_custom_fuzz`) and trimming (`afl_custom_trim`)
2. Add an `extra_data` field for the `queue_entry` so that we can save the parsing tree in memory, and modify `afl_custom_fuzz` so that we can pass `extra_data` to the grammar mutator and retrieve the mutated parsing tree (`extra_data_out`)

Solution 1 is intuitive and does not require any modifications to AFL++. The only concern is that we need to deserialize/parse the test case for every custom mutation and trimming.
Solution 2 may be more efficient, as it avoids lots of redundant deserialization/parsing operations. The drawback is that we need to modify AFL++ a little bit.
My current plan is to first implement "Solution 1" and then grammar mutations. Do you think "Solution 2" would be beneficial? If so, we can refactor the codebase after we have a prototype (Solution 1) for the grammar mutator.
Thank you!
If you have less time than expected, then please update your plan to see if you have to remove features, and when each milestone will realistically be reached. There is more in life than GSoC, but we must be able to see if you are making progress, if the plan is working, and where we can help.
Solution 1 sounds very costly and inefficient. There is a custom mutator hook that is run when a new queue entry is added. You could use that to perform the deserialize/parse step; then you would have to do it only once for each queue entry - and write the result away, e.g., into the queue filename + ".tree" or some other added extension. Just load that .tree file when the custom mutation is started.
Hi @h1994st, drop solution 1. Solution 2 is OK; btw, you can also do it by storing the data in an external way. Look at https://github.com/AFLplusplus/AFLSmart-Custom-Mutator/blob/master/custom_mutator.cpp#L7. Here I store the parsed tree in a map in which the key is the name of the test case. This has the same effect as adding an extra field to the queue entry but does not require modifications to AFL++. Modifications in afl-fuzz are not a problem, but btw it is hard to have an extra field that works with multiple custom mutators.
Modifications in afl-fuzz are not a problem, btw is hard to have an extra field that works with multiple custom mutators.
especially if you then have two custom mutators using that same field ...
so better save it in an extra file like Andrea is doing in his AFLSmart solution, but perform the deserialization in a hook for `queue_new_entry`
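The hook-based approach above could look roughly like the sketch below. Only the hook name `afl_custom_queue_new_entry` matches the real AFL++ custom-mutator API; the `my_mutator_t` layout and the parse/serialize helpers are hypothetical placeholders:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mutator state; only the hook name matches the AFL++ API. */
typedef struct my_mutator { void *unused; } my_mutator_t;

/* Build "<queue filename>.tree", the side file suggested above. */
void tree_file_path(const char *queue_fname, char *out, size_t out_size) {
  snprintf(out, out_size, "%s.tree", queue_fname);
}

/* Called by afl-fuzz once per new queue entry: parse the test case a
   single time and persist the serialized tree next to it. The actual
   parse/serialize calls are sketched as comments because they belong
   to the grammar mutator, not to AFL++. */
void afl_custom_queue_new_entry(my_mutator_t *data,
                                const uint8_t *filename_new_queue,
                                const uint8_t *filename_orig_queue) {
  (void)data; (void)filename_orig_queue;
  char path[4096];
  tree_file_path((const char *)filename_new_queue, path, sizeof(path));
  /* tree_t *t = parse_test_case(filename_new_queue);   // deserialize once
     dump_tree_to_file(t, path);                        // write the .tree file */
}
```

`afl_custom_fuzz` would then load the `.tree` file instead of re-parsing the input on every mutation.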
@vanhauser-thc @andreafioraldi OK, I see. I will try Andrea's approach. Thank you!
@h1994st please commit on a daily basis so we see the progress being made. Now the last commit was three days ago and it is unclear for me if you are working full time every minute of being awake or not at all or something in between ... :)
@vanhauser-thc Got it. I will keep it in mind.
Hi @vanhauser-thc and @andreafioraldi ,
Here is the list of completed tasks during this week:

- `json_custom_mutator.cpp`: store generated tree structures while storing interesting test cases in the queue
- `googletest` as the testing framework & add corresponding test cases for implemented functionalities

TODO for next week:

- `json_c_fuzz.c`: annotate recursive rules (prerequisite for "random recursive mutation" and "recursive minimization")
- `Valgrind` memcheck as a testing task

Keep going, good work!
Thank you! Looks like you made good progress :) Two things: 1) I think we should do a phone call again. please propose a few times that fit you. 2) when would be a good milestone for us to test the features you are working on?
Hi @vanhauser-thc and @andreafioraldi,
Thanks!
Here are my available time slots:
Please let me know if there are any conflicts.
For the milestone: after finishing all mutation and trimming strategies (i.e., after July 17), you should be able to test the grammar mutator, at least for fuzzing the JSON parser. I will notify you guys once I finish it.
I would prefer July 10 at 10am<->4pm, otherwise 11 or 12 July. @andreafioraldi ?
12 july for me
Then July 12 10am Shengtuo's and 4pm our time.
Hi @vanhauser-thc and @andreafioraldi ,
Will we use the same Google Meet link as last time?
Thanks!
Got it!
Hi @vanhauser-thc and @andreafioraldi ,
Here is the list of completed tasks during this week:

- `tree.c`: calculate recursion edges in a tree
- `f1_c_gen.py`: generate function array and node types
- `Valgrind` memcheck as a testing task

TODO for next week:
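The `tree.c` recursion-edge computation listed above can be sketched as follows. A "recursion edge" connects two nodes produced by the same grammar rule (e.g., `<json>` inside `<json>`), which is what enables random recursive mutation and recursive minimization. The types are illustrative, not `tree.c`'s real ones, and only direct parent-child recursion is counted:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative tree node; the grammar mutator's real structure differs. */
typedef struct rnode {
  uint32_t rule_id;            /* grammar rule that produced this node */
  struct rnode *children[4];   /* fixed fan-out just for the sketch */
  size_t child_count;
} rnode_t;

/* Count direct parent->child edges where both nodes come from the same
   rule: each such edge is a point where the tree re-enters a rule and
   can be expanded (recursive mutation) or collapsed (minimization). */
size_t count_recursion_edges(const rnode_t *n) {
  size_t edges = 0;
  for (size_t i = 0; i < n->child_count; i++) {
    if (n->children[i]->rule_id == n->rule_id) edges++;
    edges += count_recursion_edges(n->children[i]);
  }
  return edges;
}
```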
Hi @vanhauser-thc and @andreafioraldi ,
As we mentioned before, around July 17 you can have a try with the grammar mutator prototype. At least, it now works for the JSON parser example described in README.md. For now, the grammar mutator has:

I need to refactor my code to implement another trimming strategy, "subtree minimization". The current grammar mutator does not parse the initial test cases but generates its own test cases at the beginning. I will take care of the parsing part later.
Please let me know if you have any feedback on the performance, implementation details, etc. :) As usual, I will give you a complete progress update at the end of this week.
BTW:
While running the grammar mutator, I found that the mutated tree may represent an empty string, which makes the fuzzer throw an error. Therefore, I submitted a pull request to allow the fuzzer to run the target on an empty test case. Not sure if this is a good solution.
Hi @vanhauser-thc and @andreafioraldi ,
For this week, mainly focus on:
TODO for next week:
Hi @vanhauser-thc and @andreafioraldi ,
For this week, mainly focus on:
- `trees` directory in the output directory
- `id`, `subnode_count`, `val_len`, and `val`
- `afl-fuzz` with the grammar mutator on json-parser

TODO for next week:

- `xxx_c_fuzz.c` as a building step in the compilation
Hi @vanhauser-thc and @andreafioraldi ,
For this week, mainly focus on:
- `xxx_c_fuzz.c` as a building step in both Makefile and CMake

TODO for next week:

- `f1_c_gen.py`: replace `pool_of_strings` with `pool_of_trees`
- Replace `std::string` with C strings and related functions
- Replace `googletest` with another C-based testing framework

BTW, how do you guys often benchmark `afl-fuzz`? The current fuzzing speed is a little bit slow. I want to identify the bottleneck and then optimize the performance.
BTW, how do you guys often benchmark `afl-fuzz`? The current fuzzing speed is a little bit slow. I want to identify the bottleneck and then optimize the performance.
depends on what you want to benchmark it on - speed, coverage, ...
for speed comparison I copy the selected versions and do something like
TIME=125
afl-system-config
for fuzzer in afl-fuzz-1 afl-fuzz-2; do
for run in `seq 1 20`; do
afl-fuzz -V$TIME -i in -o out-$fuzzer-$run -d -s 123 -b 1 -- ./target
done
done
(Note that -b requires current dev; it binds to CPU #1, the second core/thread.) The first runs usually have higher exec counts than the following ones, as the CPU is still cool, so disregard them, see where the others approximately stabilize, and then compare the different fuzzers. Then compare the execs_done or execs_per_sec values in $out/fuzzer_stats.
For coverage you can do the same, but switch -V for -E (with a value that lets the fuzzer run for a useful time, e.g., 10 minutes) and do only one run. -E is for the number of execs (though it is just an approximate amount) instead of runtime. For easy comparison you can compare the paths_found values in $out/fuzzer_stats; if you want it more complete, then compare edges discovered (requires afl-clang-lto (llvm 11+) or afl-clang-fast (llvm 9+) with default instrumentation):
for file in out-$fuzzer/queue/id*; do
cat $file | afl-showmap -o - -q -- ./target
done | sed 's/:.*//' | sort -u > EDGES.$fuzzer
For performance use gprof or perf
@vanhauser-thc @andreafioraldi Thanks for your suggestions. I will try the approaches you mentioned here.
Hi @vanhauser-thc and @andreafioraldi ,
For this week, mainly focus on:
- `f1_c_gen.py`: replace `pool_of_strings` with `pool_of_trees`
- Replace `std::string` with C strings and related functions
- Replace `std::map` and `std::vector` in the chunk store with a C map (rxi/map) and a linked list

TODO for next week:

Based on `perf` results from `test_custom_mutator`, the chunk store, used in the splicing mutation, is a bottleneck. This is due to the hierarchical structure of the tree: for complex test cases, all mutation, trimming, and chunk store operations are recursive functions with very deep call stacks. Actually, `nautilus` uses a flattened array to store the tree instead of the recursive tree structure. Thus, I will refactor the tree structure next week.
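The nautilus-style flattening mentioned above can be sketched as follows; all names and fields are illustrative, not the grammar mutator's actual layout. Nodes live in one contiguous array and reference their parents by index, so queries become flat loops instead of deep recursive walks:

```c
#include <stdint.h>
#include <stdlib.h>

/* One node of the flattened tree (illustrative layout). */
typedef struct {
  uint32_t rule_id;  /* grammar rule of this node */
  uint32_t parent;   /* index of the parent; the root points at itself */
} flat_node_t;

typedef struct {
  flat_node_t *nodes;
  size_t len, cap;
} flat_tree_t;

/* Append a node and return its index. */
uint32_t flat_tree_add(flat_tree_t *t, uint32_t rule_id, uint32_t parent) {
  if (t->len == t->cap) {
    t->cap = t->cap ? t->cap * 2 : 16;
    t->nodes = realloc(t->nodes, t->cap * sizeof(flat_node_t));
  }
  t->nodes[t->len] = (flat_node_t){ rule_id, parent };
  return (uint32_t)t->len++;
}

/* Example of a flat (non-recursive) query: how many nodes use a rule. */
size_t flat_tree_count_rule(const flat_tree_t *t, uint32_t rule_id) {
  size_t n = 0;
  for (size_t i = 0; i < t->len; i++)
    if (t->nodes[i].rule_id == rule_id) n++;
  return n;
}
```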
you are working totally on your own ... which is fine however it gives me the feeling I am not doing a good job :) do you need any assistance? or have any questions? anything where I can help?
Hi @vanhauser-thc,
Thanks a lot for your kindness! You do help me a lot. Actually, for the time being, not too many technical questions. I will definitely contact you once I encounter any problems.
But it will be great, if you can give inputs about whether there are any potential improvements for the current grammar mutator, from a user's perspective.
BTW, after August 24, we need to submit the project for the final evaluation. Except for tasks in my TODO list, what else do you think I still need to do to pass the final evaluation? (e.g., tests, documents, etc.)
BTW, after August 24, we need to submit the project for the final evaluation. Except for tasks in my TODO list, what else do you think I still need to do to pass the final evaluation? (e.g., tests, documents, etc.)
@andreafioraldi what is your opinion?
IMHO it is
For me it is very important that this mutator greatly outperforms the afl generic mutator on targets like ruby or jsc. +1 also for the "for dummies" readme and for the doc. You have time, and I am confident that everything will turn out great. Don't hesitate to ask for the last difficult part that you have to code, the parsing stage. You can also easily reach the Nautilus authors for any question directly to them, just mail Cornelius and say that you are a student for AFL++ for a super-fast response.
I see. Thanks for your responses! I will try to figure out soon the parsing stage and other supporting documents for the next two weeks.
BTW, good luck with your WOOT presentation. 😏
Will give the weekly report later today, as I had something urgent yesterday. Sorry for the delay.
@vanhauser-thc and @andreafioraldi, I have three questions:

1. This one is a little bit naive. How do you often plot a graph describing the edge coverage against the time? I noticed there is some related information in `fuzzer_stats`. Do I need to monitor this file frequently while running the fuzzer? Or other better approaches?
2. Could I add one more custom mutator hook, `afl_custom_fuzz_steps`, to control the running times of the custom mutator? In `nautilus`, each mutation may run at least 100 times for a given queue entry, but our current custom mutator cannot control the running steps (i.e., `stage_max` in `fuzz_one_original`). If the running times can be controlled by the custom mutator, I can add a deterministic mutation strategy and also control the running steps of the grammar mutator.
3. I may hold off the parsing part because I noticed that `nautilus` does not parse the input test cases but generates 1000 inputs while starting the fuzzer. I want to spend more time optimizing the performance (e.g., coverage). What do you think?

1. This one is a little bit naive. How do you often plot a graph describing the edge coverage against the time? I noticed there is some related information in `fuzzer_stats`. Do I need to monitor this file frequently while running the fuzzer? Or other better approaches?
I think never :) I look at what coverage is reached after a specific time though. Not that I think it is not worth it; rather, I would want afl-plot to be able to plot several runs together so it is comparable. I never got around to coding that though. (It is an item in TODO.md.)
This is useful though (needs the dev branch): afl-showmap -C -i out -o /dev/null -- target, where -i is the -o of afl-fuzz. This will analyze all edges discovered in the queue and print the number. Requires llvm 9+ and either afl-clang-fast or afl-clang-lto in default instrumentation.
2. Could I add one more custom mutator hook, `afl_custom_fuzz_steps`, to control the running times of the custom mutator? In `nautilus`, each mutation may run at least 100 times for a given queue entry, but our current custom mutator cannot control the running steps (i.e., `stage_max` in `fuzz_one_original`). If the running times can be controlled by the custom mutator, I can add a deterministic mutation strategy and also control the running steps of the grammar mutator.
If AFL_CUSTOM_MUTATOR_ONLY is used, then this would only affect the number of inputs generated per queue entry per cycle, right? Does not sound that useful IMHO.
for runs without _ONLY I can see the benefit though.
for me it is fine if you add it; just remember to add it to the documentation.
3. I may hold off the parsing part because I noticed that `nautilus` does not parse the input test cases but generates 1000 inputs while starting the fuzzer. I want to spend more time optimizing the performance (e.g., coverage). What do you think?
for me the order you do it is not important. IMHO parsing the inputs is important though so this should be implemented within the project.
Hi @vanhauser-thc and @andreafioraldi ,
Thanks for your response!
I guess increasing the number of inputs generated per `fuzz_one_original` can help increase the coverage, but I need to confirm my hypothesis by manipulating the number of running times. If it does help, I will submit a pull request to AFL++. Will keep you updated about this.
I agree that parsing the inputs is important so that users can reuse the previous input corpus. I will try my best to implement this.
The progress of the last week is a little bit slow. I mainly focused on trying to optimize the coverage of the grammar mutator, but got a bit lost on how to improve the coverage. Specifically, I refactored `f1_c_gen.py` to maximize the size of the generated tree. Overall, the performance of the grammar mutator is still not good enough.
There are several tasks for next week, including the delayed tasks
I guess increasing the number of inputs generated per fuzz_one_original can help increase the coverage, but I need to confirm my hypothesis by manipulating the number of running times. If it does help, I will submit a pull request to AFL++. Will keep you updated about this.
The custom mutator is using the same number of iterations as havoc, so yes, maybe we should add a custom mutator API that manipulates the number of iterations. afl_custom_fuzz_steps sounds OK.
Implement the parsing part
You will get coverage from mutating good seeds IMO, not from generating from scratch, so you should prioritize this before optimizing things. Think also about how to parse inputs that are synced from other fuzzer instances running in parallel.
For the parsing part, maybe you should just use a parser generator like ANTLR or bison.
You can just define a compatibility layer (so that it will be easy for us to add other parser generators in the future) that is called by your generated parser, e.g. in ANTLR C++ (yes, Marc said C++ bad, but ANTLR does not emit C), to build your tree.
This can be helpful, I think: https://tomassetti.me/getting-started-antlr-cpp/
Note that you can easily parse an ANTLR4 grammar file using ANTLR itself (https://github.com/antlr/grammars-v4/tree/master/antlr/antlr4), so it should be easy to add ANTLR4 as a grammar format alongside JSON. For JSON, it is easy to generate an ANTLR grammar from a JSON grammar in Python.
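One possible shape for that compatibility layer is a small C API that an ANTLR-generated C++ visitor calls while walking its `ParseTree` to build the mutator's tree. All names and fields below are illustrative, not the project's actual interface:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative tree node built by the parser-side visitor. */
typedef struct gm_node {
  uint32_t rule_id;              /* grammar rule / ANTLR rule index */
  uint8_t *val; size_t val_len;  /* terminal text, if any */
  struct gm_node **children;
  size_t child_count;
} gm_node_t;

/* Allocate a node for the given rule. */
gm_node_t *gm_node_new(uint32_t rule_id) {
  gm_node_t *n = calloc(1, sizeof(gm_node_t));
  n->rule_id = rule_id;
  return n;
}

/* Copy terminal text into the node. */
void gm_node_set_val(gm_node_t *n, const uint8_t *buf, size_t len) {
  n->val = malloc(len);
  memcpy(n->val, buf, len);
  n->val_len = len;
}

/* Attach a finished child to its parent. */
void gm_node_append_child(gm_node_t *parent, gm_node_t *child) {
  parent->children = realloc(parent->children,
                             (parent->child_count + 1) * sizeof(gm_node_t *));
  parent->children[parent->child_count++] = child;
}
```

A C++ ANTLR listener could call these on enter-rule, terminal, and exit-rule events; swapping ANTLR for bison later would only change the caller, not this layer.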
@andreafioraldi Thanks for your information!
ANTLR can definitely help with the parsing part. You are right: we need a compatibility layer to convert ANTLR's `ParseTree` to our tree representation.
I will take a look at the links you recommended.
Hi @vanhauser-thc and @andreafioraldi ,
For this week, mainly focus on:
TODO for next week:

- New hook (`afl_custom_fuzz_steps`) for the custom mutator

Next week ... there is no next week :) You have to hand in everything final on Monday next week. I am away over the weekend. I would push everything as far as you can on Thursday night, so I can review and test on Friday morning before I leave at lunchtime. Then you have feedback to work on over the weekend.
Concentrate on code, documentation can still be done after Thursday evening.
I can implement the `afl_custom_fuzz_steps` hook in afl-fuzz for you. It is not part of the grammar mutator, so I think it is fine if we take care of that. What do you need as inputs? I would pass the data, len, and filename? Or the queue structure pointer?
Hi @vanhauser-thc ,
Yeah, I know there are no more weeks for GSoC. By "next" week, I mean this week (Aug 24-28), as the report was written yesterday in my local time ;)
Got it. I will try to wrap up everything and get back to you on Thursday night.
I envision `afl_custom_fuzz_steps` would have the same function signature as `afl_custom_init_trim`:
int32_t afl_custom_fuzz_steps(my_mutator_t *data, uint8_t *buf,
                              size_t buf_size);
This new hook will be used by `afl-fuzz` to control the `stage_max` for a custom mutator (afl-fuzz-one.c#L1669). It should be invoked before afl-fuzz-one.c#L1687. I think this would be enough.
It is in there. Function signature:
u32 (*afl_custom_fuzz_count)(void *data, const u8 *buf, size_t buf_size);
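A minimal implementation of this hook on the mutator side might look like the sketch below; the size-based heuristic and its constants are purely illustrative, not part of the API:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/* Return the number of times afl_custom_fuzz should run for this queue
   entry (i.e. stage_max). Heuristic: scale with input size, with a
   nautilus-style floor of 100 iterations and an arbitrary cap. */
u32 afl_custom_fuzz_count(void *data, const u8 *buf, size_t buf_size) {
  (void)data; (void)buf;          /* unused in this sketch */
  u32 count = (u32)(buf_size / 16);
  if (count < 100) count = 100;   /* at least 100 mutations per entry */
  if (count > 1024) count = 1024; /* avoid spending a whole cycle here */
  return count;
}
```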
Thank you so much! I will integrate it into the grammar mutator soon.
Maybe the coverage issue is related to the generator.
I observed some JSONs created with ./grammar_generator and I noted that the generator prefers to go deep in the grammar. You can see that most of the generated files have a lot of whitespaces and escape chars like \t \f etc.
If you look at json_grammar.json you can see that escape chars are at a deeper level in the grammar compared to other tokens that can be placed in a string:
"<string>": [["\"", "<characters>", "\""]],
"<characters>": [["<character-1>"]],
"<character>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"],
["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"],
["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"],
["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"],
["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"],
["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"],
["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["#"],
["$"], ["%"], ["&"], ["\""], ["("], [")"], ["*"], ["+"],
[","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="],
[">"], ["?"], ["@"], ["["], ["]"], ["^"], ["_"], ["`"],
["{"], ["|"], ["}"], ["~"], [" "], ["<esc>"], ["'"]],
"<esc>": [["\\","<escc>"]],
"<escc>": [["\\"],["b"],["f"], ["n"], ["r"],["t"],["\""]],
Why does the generator prefer to generate escc tokens? Is it a bug in F1?
Seems like if in a rule there is
Idk if this can be the cause, but look at the generated C: in the function gen_node_character, there is this snippet before the switch case (I added some comments):
if (rule_index < 0 || rule_index >= 95) {
  int num_rules = 0;
  if (max_len < 3) {
    num_rules = 94;
  } else {
    num_rules = 95;  // here with max_len >= 3, seems likely
  }
  val = 0;
  if (num_rules == 94) {  // likely num_rules == 95, goto else branch
    val = map_rand(94);
  } else {
    // random(95-94) -> random(1) -> always 0
    val = map_rand(num_rules - 94) + 94;  // always 94, always generates the esc node
  }
} else {
  val = rule_index;
}
...
switch (val) {
  ...
  case 94:
    ...
    subnode = gen_node_esc(subnode_max_len, &subnode_consumed, -1);
    ...
    break;
}
return node;
I don't have the full picture of the code, and maybe it is intended, but the generated seeds look strange.
In 100 seeds, if I grep for \t I get 50 lines, but just 2 if I grep for W. It seems reasonable that nonterminals have more priority (is it? maybe not), but not at this level: basically, terminals that are not in the last levels of the grammar seem almost never to be generated.
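If the bias is indeed the `map_rand(num_rules - 94) + 94` branch, one candidate fix would be a single uniform draw over all rules viable at the current length budget. This is only a sketch: `map_rand` is stubbed here, and whether F1 intends a uniform distribution at this point is an assumption:

```c
#include <stdlib.h>

/* Stub for the generated code's RNG helper (assumed semantics:
   uniform integer in [0, n)). */
static int map_rand(int n) { return rand() % n; }

/* Uniform selection over every rule viable at this length budget,
   instead of drawing only from [94, 95) when max_len >= 3. */
static int pick_character_rule(int rule_index, int max_len) {
  if (rule_index >= 0 && rule_index < 95) return rule_index;
  int num_rules = (max_len < 3) ? 94 : 95; /* rule 94 (<esc>) needs len >= 3 */
  return map_rand(num_rules);
}
```

With this, the `<esc>` expansion would be picked with probability 1/95 rather than always, which should spread the generated characters evenly across the terminals.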
Hi Shengtuo,
please put your weekly progress reports in this issue. Besides, I have not seen any progress in the last 5 days. Do you need assistance?