SEGV in afl_custom_fuzz_count

andreafioraldi commented 3 years ago

I'm trying to fuzz mruby using the testcases in mruby/test/t/ (not the testcases generated with grammar_generator) to test the ANTLR shim, and I get:

==1141== Memcheck, a memory error detector
==1141== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1141== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1141== Command: afl-fuzz -i in -o out3 -m none -- ./bin/mruby @@
==1141== 
==1141== Conditional jump or move depends on uninitialised value(s)
==1141==    at 0x12775F: bind_to_free_cpu (afl-fuzz-init.c:215)
==1141==    by 0x10F6D4: main (afl-fuzz.c:1091)
==1141== 
==1141== Invalid read of size 4
==1141==    at 0x6D29781: afl_custom_fuzz_count (grammar_mutator.c:304)
==1141==    by 0x13AFD0: fuzz_one_original (afl-fuzz-one.c:1679)
==1141==    by 0x138004: fuzz_one (afl-fuzz-one.c:4893)
==1141==    by 0x10FC14: main (afl-fuzz.c:1437)
==1141==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==1141== 
==1141== 
==1141== Process terminating with default action of signal 11 (SIGSEGV)
==1141==  Access not within mapped region at address 0x4
==1141==    at 0x6D29781: afl_custom_fuzz_count (grammar_mutator.c:304)
==1141==    by 0x13AFD0: fuzz_one_original (afl-fuzz-one.c:1679)
==1141==    by 0x138004: fuzz_one (afl-fuzz-one.c:4893)
==1141==    by 0x10FC14: main (afl-fuzz.c:1437)
==1141==  If you believe this happened as a result of a stack
==1141==  overflow in your program's main thread (unlikely but
==1141==  possible), you can try to increase the size of the
==1141==  main thread stack using the --main-stacksize= flag.
==1141==  The main thread stack size used in this run was 8388608.
==1141== 
==1141== HEAP SUMMARY:
==1141==     in use at exit: 4,641,445 bytes in 44,091 blocks
==1141==   total heap usage: 76,448 allocs, 32,357 frees, 9,150,336 bytes allocated
==1141== 
==1141== LEAK SUMMARY:
==1141==    definitely lost: 0 bytes in 0 blocks
==1141==    indirectly lost: 0 bytes in 0 blocks
==1141==      possibly lost: 16,896 bytes in 2 blocks
==1141==    still reachable: 4,624,549 bytes in 44,089 blocks
==1141==         suppressed: 0 bytes in 0 blocks
==1141== Rerun with --leak-check=full to see details of leaked memory
==1141== 
==1141== For counts of detected and suppressed errors, rerun with: -v
==1141== Use --track-origins=yes to see where uninitialised values come from
==1141== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)

andreafioraldi commented 3 years ago

It is a null pointer dereference here: https://github.com/AFLplusplus/Grammar-Mutator/blob/stable/src/grammar_mutator.c#L304 From gdb:

gef➤  p data
$1 = (my_mutator_t *) 0x555555abb640
gef➤  p data->cur_rules_mutation_node
$2 = (node_t *) 0x0

It seems that list_pop_front returned NULL: https://github.com/AFLplusplus/Grammar-Mutator/blob/stable/src/grammar_mutator.c#L288

andreafioraldi commented 3 years ago

In fact, just before the pop, the non-terminal node list is empty:

gef➤  p *data->tree_cur->non_terminal_node_list
$4 = {
  head = 0x0, 
  tail = 0x0, 
  size = 0x0
}
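To make the failure mode concrete, here is a minimal sketch of how popping from an empty list yields NULL and how a caller could guard against it. All type and function names (`list_t`, `list_node_t`, `sketch_node_t`, `sketch_list_pop_front`, `sketch_first_rule_id_or_zero`) are hypothetical stand-ins, not the actual Grammar-Mutator definitions; only the `head`/`tail`/`size` fields are taken from the gdb output above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real list and node types in Grammar-Mutator differ. */
typedef struct list_node {
  void *data;
  struct list_node *next;
} list_node_t;

typedef struct {
  list_node_t *head;
  list_node_t *tail;
  size_t size;
} list_t;

/* Illustrative payload type with a 4-byte field that a caller might read. */
typedef struct {
  unsigned int rule_id;
} sketch_node_t;

/* Pops the front payload, or returns NULL when the list is empty.
 * The list_node_t storage stays owned by the caller in this sketch. */
static void *sketch_list_pop_front(list_t *list) {
  assert(list != NULL);
  if (list->head == NULL) return NULL; /* empty list: nothing to pop */
  list_node_t *n = list->head;
  list->head = n->next;
  if (list->head == NULL) list->tail = NULL;
  list->size--;
  return n->data;
}

/* Guarded read: returns 0 for an empty list instead of dereferencing NULL. */
static unsigned int sketch_first_rule_id_or_zero(list_t *list) {
  sketch_node_t *node = sketch_list_pop_front(list);
  if (node == NULL) return 0;
  return node->rule_id;
}
```

Without the `node == NULL` check, reading a field of the popped node on an empty list is a read through a null pointer, which is consistent with the invalid read Valgrind reports.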

andreafioraldi commented 3 years ago

It seems that the parser fails to parse the testcase. If I feed the fuzzer a testcase generated with grammar_generator, but without the generated tree file so that parsing is triggered in the mutator, everything works.

andreafioraldi commented 3 years ago

We should be able to parse such simple Ruby testcases. Is it a limitation of the grammar? Could regex support in the grammar solve it?
We should handle the error and report the parser failure to the user, not segfault.

h1994st commented 3 years ago

The reason for this issue is that the total number of rules mutations is 0. I just fixed it.

In your case, the parser does parse the test case, even though parsing errors occur. The parsing capability depends on the input grammar file. The Ruby grammar file in our project is a simplified Ruby grammar, which does not cover all Ruby syntax. Even when the grammar file is inconsistent with the input test case, the ANTLR shim does not terminate on a parsing error; instead, it saves the erroneous portion as a terminal node, so that we do not lose too much information from the original test case.
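The idea of keeping an unparsable span as a terminal node can be sketched as follows. This is a simplified illustration, not the ANTLR shim's code: `sketch_tree_node_t`, `sketch_error_to_terminal`, and `sketch_node_free` are hypothetical names, and the byte offsets are assumed to come from the parser's error report.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tree node: a terminal simply carries raw bytes. */
typedef struct {
  int is_terminal;
  size_t len;
  char *bytes;
} sketch_tree_node_t;

/* Copies src[start, end) into a new terminal node so the text survives
 * even though the grammar could not parse it.
 * The caller must ensure end does not exceed the length of src. */
static sketch_tree_node_t *sketch_error_to_terminal(const char *src,
                                                    size_t start, size_t end) {
  assert(src != NULL);
  if (end < start) return NULL; /* invalid span */
  sketch_tree_node_t *n = malloc(sizeof *n);
  if (n == NULL) return NULL;
  n->len = end - start;
  n->bytes = malloc(n->len + 1);
  if (n->bytes == NULL) {
    free(n);
    return NULL;
  }
  memcpy(n->bytes, src + start, n->len);
  n->bytes[n->len] = '\0';
  n->is_terminal = 1;
  return n;
}

/* Releases a node created by sketch_error_to_terminal; NULL is accepted. */
static void sketch_node_free(sketch_tree_node_t *n) {
  if (n == NULL) return;
  free(n->bytes);
  free(n);
}
```

A tree made only of such terminal nodes has no non-terminal nodes, which matches the empty `non_terminal_node_list` seen earlier in this thread.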

Regex just gives the user more flexibility to specify their own grammar, which could be an enhancement in the future.

SEGV in afl_custom_fuzz_count #8