Closed: ssarfaty closed this issue 2 months ago
Can you share the crashes that the other fuzzers found? The issue I see with your fuzzing setup is that you limit the input sample size to 50 bytes (see `int read_max = 50;`). While this is sufficient to construct a sample that goes out of bounds, it might not be sufficient to corrupt anything on the stack and cause a crash, depending on the stack layout created by the compiler.
For example, let's take the following (optimal!) 50-byte input:
()()()()()()()()()()()()()()()()()()()()()()()()()
If I compile your target on Linux with either clang or gcc and use the above input, it isn't sufficient to cause a crash. I didn't do any testing on Windows, however I expect it to behave similarly.
However, if I remove the 50-byte limit from your harness, then Jackalope also finds a crash within several minutes. Here is an example hexdump of one of the crashes Jackalope discovered:
00000000 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 |(.'..)(.'..)(.'.|
00000010 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 |.)(.'..)(.'..)(.|
00000020 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |'..)(.'..)(.'..)|
00000030 28 f0 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 |(.(.'..)(.'..)(.|
00000040 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |'..)(.'..)(.'..)|
00000050 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 |(.'..)(.'..)(.'.|
00000060 e2 29 28 f0 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |.)(.(.'..)(.'..)|
00000070 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 |(.'..)(.'..)(.'.|
00000080 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 |.)(.'..)(.'..)(.|
00000090 27 e1 e2 29 28 f0 28 f0 27 e1 e2 29 28 f0 27 e1 |'..)(.(.'..)(.'.|
000000a0 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 |.)(.'..)(.'..)(.|
000000b0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |'..)(.'..)(.'..)|
000000c0 28 f0 27 e1 e2 29 28 f0 28 f0 27 e1 e2 29 28 f0 |(.'..)(.(.'..)(.|
000000d0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |'..)(.'..)(.'..)|
000000e0 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 |(.'..)(.'..)(.'.|
000000f0 e2 29 28 f0 27 e1 e2 29 28 f0 27 e1 e2 29 |.)(.'..)(.'..)|
Even though Jackalope already finds the bug (without the 50-byte limit), I took the opportunity to tweak the default Jackalope mutator settings in https://github.com/googleprojectzero/Jackalope/commit/ad14a3eecf3a720485566376ab722a0f2b31e950. With those changes, Jackalope finds this bug even faster.
On Linux (Ubuntu 16, 32-bit) I took today's latest version of AFL++ and compiled the same code (see the attached code, which was compiled with Linux modifications):
afl-gcc a.c -O3 -o a.elf
After ~50 sec I got the first crash. The crash file is attached to this post: crash.txt c_code.txt
I tested the crash file that AFL++ found on Linux back on Windows, and it doesn't cause a crash on Windows...
So I guess there is some mismatch between what I expected and what different OSes and compilers actually produce.
I just tried to compile on 64-bit Debian, and the way `c_code.txt` compiles (even with `afl-gcc` from the AFLplusplus repo), it does not cause a crash with `crash.txt`. I also ran afl++ for 10 minutes now without seeing the crash.
Another explanation is that your compiler or fuzzer setup includes some form of address sanitizer / ASAN (via an environment variable, perhaps?). In that case, a crash would be reported as soon as you go past the end of the buffer. I believe `crash.txt` writes only one byte past the end of the buffer, which would be difficult to detect without ASAN imo: the stack buffer is 25 bytes, and even on a 32-bit OS the stack will be at least 4-byte aligned, and possibly 8- or even 16-byte aligned to get correct alignment when reading XMM registers. So you would need to reach at least byte 28 or 32 to effectively go out of bounds.
Being geared more towards targets where only the binary is available and not the source code, Jackalope doesn't use asan by default (since it requires having source code). However, there is a mode in Jackalope that works with asan, described in https://github.com/googleprojectzero/Jackalope/blob/main/README_sancov.md. Note that for this mode, the target needs to be prepared in a different way so that it correctly communicates its status to Jackalope (see `sancovtest.cpp` for an example).
I attached your target modified for Jackalope asan mode. copy.txt
If you then do
clang++ -fsanitize=address -fsanitize-coverage=trace-pc-guard -g copy.cpp sancovclient.cpp -ocopy
./fuzzer -instrumentation sancov -in in -out out -t 1000 -delivery shmem -iterations 10000 -- ./copy -m @@
Jackalope would then be able to find the crash even when going 1 byte out of bounds / with 50 byte input limit.
I know that on x64 the sample doesn't crash. That's why I mentioned that in the examples I always compile with the gcc flag "-m32", to make sure it generates a 32-bit executable; in that form AFL++ finds the crash.
To spare myself this step, I like to keep a VM that is natively 32-bit, which is why I used an old version of Ubuntu 16. I have verified that the issue still triggers on Ubuntu 22 (x86_64) when passing the "-m32" flag to afl-gcc.
So I don't have any form of "ASAN" included in my ELF file, just the "-m32" target. I have attached it to this thread (the txt extension is just to make sure the upload wouldn't be blocked).
I specifically wanted to compile it as an example of a binary to try out without ASAN, so the ASAN example you gave is less relevant for me: I mainly experiment with binaries, and the fact that I have the source wasn't the point.
Got it, thanks for the additional info. Interestingly, I can't get it to compile in a way that crashes on my machine, even with `afl-fuzz -m32`. This is how the stack layout looks for me (from IDA):
char lbuf[25]; // [esp+1Bh] [ebp-29h] BYREF
int v49; // [esp+34h] [ebp-10h]
int v50; // [esp+38h] [ebp-Ch]
int v51; // [esp+3Ch] [ebp-8h]
int v52; // [esp+40h] [ebp-4h]
Interestingly, lbuf is not 4-byte aligned (at `ebp-29h`), so indeed going even one byte past the end of the buffer should corrupt the next value on the stack. However, in my case the compiler inserts 16 additional bytes on the stack (variables v49 - v52), which appear to be used only at the start of the context to save registers when calling `_afl_maybe_log(19948);`.
Unfortunately, I can't run your binary to see how it looks there.
But if you have a binary you can run on linux that crashes, then Jackalope running on Linux should find the crash :)
i don't know if it's just a typo from your side when you said:
afl-fuzz -m32
the compilation was with afl-gcc and not afl-fuzz since afl-fuzz doesn't need any -m32 flag.
As for running the code with Jackalope on Linux, I built Jackalope on Ubuntu 24 x64 and compiled the target source code as follows:
gcc a.c -fstack-protector-strong -O3 -o a.elf -m32
Then I ran the fuzzer as follows:
./fuzzer -in ../../in -out ../../out -t 1000 -delivery file -instrument_module a.elf -cmp_coverage -- ../../a.elf @@
But Jackalope gave me the following error:
Fuzzer version 1.00
1 input files read
Running input sample ../../in/sample.txt
[!] WARNING: 32-bit Linux target detected. -patch_return_addresses flag might be needed.
Instrumented module a.elf, code size: 4096
Exception at address (nil)
Access address: (nil)
[!] WARNING: Input sample resulted in a crash
[-] PROGRAM ABORT : No interesting input files
Location : SynchronizeAndGetJob(), /home/toor/Downloads/Jackalope/fuzzer.cpp:631
So I added the needed argument:
./fuzzer -in ../../in -out ../../out -t 1000 -delivery file -instrument_module a.elf -cmp_coverage -patch_return_addresses -- ../../a.elf @@
After about 10 min it found the input that triggers the crash, with your latest mutator changes...
I have one last question, regarding the target offset argument that can be passed on the command line: how do I get the offset? I know the function's address when I load the binary in IDA, but when I try to use it I always get a breakpoint error on read (last tried on Windows).
Yep, `afl-fuzz -m32` was a typo, meant to be `afl-gcc -m32`.
When fuzzing 32-bit programs on Linux, Jackalope requires `-patch_return_addresses` most of the time (it would depend on the compiler, but so far all programs I've seen required it). Unfortunately, there is a performance penalty with `-patch_return_addresses`, but it shouldn't matter too much for a small target like this.
For the target offset, I usually look it up in the debugger: look up the address of the function, look up the base address of the module and subtract. IDA also doesn't necessarily load the module at base address 0, so even there subtracting the base might be required. If the function is in your own code, however, it's always easier to export it (see https://github.com/googleprojectzero/Jackalope/blob/main/test.cpp#L106) and reference it using `-target_method` instead.
Speaking of which, in your Jackalope command line, you don't take advantage of persistence so your performance is probably much lower than expected. Try this instead:
./fuzzer -in in -out out -t 1000 -delivery file -instrument_module a.elf -target_module a.elf -target_method main -nargs 2 -iterations 10000 -persist -loop -cmp_coverage -patch_return_addresses -- ./a.elf @@
ok great.
Indeed, now it found and triggered the issue after ~2 min.
I used a PE info tool (CFF Explorer) to read the EXE's Optional Header, found the ImageBase, subtracted it from the VA that IDA shows, and got the right offset.
Now, in general, since I haven't found documentation on this: what address does Jackalope expect, and how does it correlate with the "-nargs" number? What are the requirements for the function that I point the offset to?
Is it the function that already holds the data? Or is it the function that holds the argv with the file path and does the file open and read?
And what does "-nargs" stand for? The position of argv in that function, or the number of args the function has?
If there is documentation on this that I have missed, I would love to learn more about it.
thanks.
The requirements for the target function are the same as in WinAFL, see this section in the README: https://github.com/googleprojectzero/winafl?tab=readme-ov-file#how-to-select-a-target-function
`-nargs` is the number of arguments of the target function. This is needed because, in persistent mode, Jackalope needs to restore the function arguments before each iteration.
thanks!
i highly appreciate your help and information sharing.
i think we can close this ..
Hi, I have tested the following code of the known "copy_it" function from the paper: https://www.usenix.org/system/files/conference/woot12/woot12-final26.pdf
With AFL and AFL++ (takes about 1-3 min) and libFuzzer (found in the first seconds) this bug is found easily (compiled with GCC targeting -m32 on Linux). In the input folder I created one file containing the text "hello" and that's it: no real input, but it was enough.
I haven't been able to find the issue with Jackalope, building with Visual Studio 2019 (Release, x86) and giving the fuzzer the following command:
C:\Jackalope-main\Jackalope-main\build\Release>fuzzer.exe -in input -out output -t 1000 -delivery file -instrument_module copy_it.exe -target_module copy_it.exe -target_method main -nargs 1 -cmp_coverage -- copy_it.exe @@
I used the "main" function since copy_it gets inlined. I ran the fuzzer for more than a day without any luck. Can you think of why your technique / mutators would be ineffective here?
#define BUFFERSIZE 25
#define TRUE 1
#define FALSE 0
#include /* Needed only for _O_RDWR definition */
#include
#include
#include
#include
#include
#pragma warning(disable : 4996)
int copy_it(char *input) {
    char lbuf[BUFFERSIZE];
    char c, *p = input, *d = &lbuf[0];
    char *ulimit = &lbuf[BUFFERSIZE - 10];
    int quotation = FALSE;
    int rquote = FALSE;
}
// function taken from: https://gist.github.com/leonid-ed/5b9161531afdafe65bca
void hexDump(char *desc, void *addr, int len) {
    int i;
    unsigned char buff[17];
    unsigned char *pc = (unsigned char *)addr;
}
int main(int argc, char* argv[]) {
    int read_max = 50;
    unsigned int bytes_read = 0;
    char data[50];
}
Thanks, shai