meme / hotwax

Coverage-guided binary fuzzing powered by Frida Stalker
The Unlicense

Handle ASLR #1

Open meme opened 4 years ago

meme commented 4 years ago

Currently the executable is compiled with -fno-pie which is NOT AT ALL the way this should be handled. Because coverage data is sent across processes, the same binaries should have the same ASLR slide so AFL does not confuse the slide for new basic block coverage.

WorksButNotTested commented 4 years ago

For comparison, the approach used by afl-clang is to add instructions to the code at each branch and to have each update the bitmap using a random number as its input. This can be seen here.
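As a rough illustration of the scheme described above, here is a minimal sketch modelled on the edge-coverage update documented for AFL. The names `coverage_map`, `prev_loc` and `record_edge` are hypothetical, and a local array stands in for AFL's shared memory region.

```c
#include <stdint.h>

#define MAP_SIZE (1u << 16)      /* AFL's default bitmap size */

uint8_t  coverage_map[MAP_SIZE]; /* stand-in for the shared memory bitmap */
uint32_t prev_loc;               /* ID of the previous block, already shifted */

/* Called at the start of each instrumented block. cur_loc is a random
 * constant chosen per block at compile time, so it does not depend on
 * where the loader places the binary. */
void record_edge(uint32_t cur_loc) {
    coverage_map[(cur_loc ^ prev_loc) & (MAP_SIZE - 1)]++;
    prev_loc = cur_loc >> 1;     /* shift so A->B and B->A use different slots */
}
```

Because the IDs are baked in at compile time, ASLR never enters the picture for this style of instrumentation.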

The current approach of hotwax seems to be to use the relative virtual address (RVA) of the current instruction pointer. Seen here. This is guaranteed to remain stable irrespective of the location the loader places the binary. Indeed, even if the virtual address were used, then this should not change when the process is forked, but this may cause a problem if the bitmap is stored and expected to be the same if the test case is re-run, or if bitmaps are compared between multiple AFL instances when performing parallel fuzzing. The current RVA approach addresses this, but may lead to collisions between modules. Perhaps a non-variable element from the module could be added to the RVA to account for this, e.g. the index of the module in the module list, or a hash of its name etc.

However, since the RVA is used, this is likely to have a much less random distribution than afl-clang and therefore may result in worse branch detection performance.
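One possible way to combine the two ideas above (a module-specific, slide-independent element plus some mixing to spread the RVAs out) is sketched below. This is an assumption for illustration, not what hotwax does: `coverage_key` and `coverage_slot` are hypothetical names, RVAs are assumed to fit in 48 bits, and the multiplier is borrowed from the MurmurHash3 finalizer.

```c
#include <stdint.h>

#define MAP_SIZE (1u << 16)

/* module_index: position of the module in the module list.
 * rva: instruction address minus the module's load base (assumed < 2^48). */
uint64_t coverage_key(uint32_t module_index, uint64_t rva) {
    uint64_t key = rva ^ ((uint64_t)module_index << 48); /* keep modules apart */
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;  /* odd multiplier, so the mix is reversible */
    key ^= key >> 33;
    return key;
}

/* Fold the mixed key down to a bitmap index. */
uint32_t coverage_slot(uint32_t module_index, uint64_t rva) {
    return (uint32_t)(coverage_key(module_index, rva) & (MAP_SIZE - 1));
}
```

Distinct (module, RVA) pairs always give distinct 64-bit keys here; collisions can still happen once the key is folded down to `MAP_SIZE` slots, as with any bitmap of that size.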

qemu-AFL uses the full virtual address as its input to the bitmap. Seen here. Although, I suspect that it doesn't perform randomization of module load addresses. So perhaps true randomization is not that important.

vanhauser-thc commented 4 years ago

afl-clang/afl-gcc are outdated, the worst options for instrumentation, and a bad example for everything :)

for a better comparison you should take a look at qemu_mode:

  cur_loc = regs->pc;
  cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
  cur_loc &= MAP_SIZE - 1;

for ASLR you can disable that in the harness with personality(ADDR_NO_RANDOMIZE); but that only affects dlopen'ed libraries (as you can see in afl++ in examples/afl_frida/afl-frida.c)
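A minimal sketch of the call sequence described in this comment, assuming Linux and glibc. `disable_aslr_in_process` is a hypothetical helper name; it only wraps the `personality` call and reports whether the flag ended up set.

```c
#include <sys/personality.h>

/* Returns 0 if ADDR_NO_RANDOMIZE is set for this process afterwards,
 * -1 if the flag could not be read or changed. */
int disable_aslr_in_process(void) {
    int persona = personality(0xffffffff);      /* query without changing */
    if (persona == -1)
        return -1;
    if (persona & ADDR_NO_RANDOMIZE)
        return 0;                               /* already disabled */
    if (personality((unsigned long)persona | ADDR_NO_RANDOMIZE) == -1)
        return -1;                              /* e.g. blocked by a sandbox */
    return 0;
}
```

A harness would call this at the top of `main()` and only then `dlopen` the library under test. Anything mapped before the call keeps the address it already has.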

> qemu-AFL uses the full virtual address as its input to the bitmap. Seen here. Although, I suspect that it doesn't perform randomization of module load addresses. So perhaps true randomization is not that important.

qemu does its own loading and therefore is not affected by ASLR. it always loads to the same addresses.

meme commented 4 years ago

For personality(ADDR_NO_RANDOMIZE);, we can have a program that simply triggers a personality change and then forks, no? This would mean that both the target program and all dlopen'ed binaries have no slide.
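A small launcher along these lines could look like the sketch below. Note one swap: it uses `execv` rather than a bare `fork`, which is the same approach `setarch -R` takes, since personality flags are kept across `execve` and the new image is then laid out without a slide. The function names are hypothetical.

```c
#include <sys/personality.h>
#include <unistd.h>

/* Pure helper so the flag arithmetic can be checked on its own. */
unsigned long without_aslr(unsigned long persona) {
    return persona | ADDR_NO_RANDOMIZE;
}

/* Replace the current process with `path`, with ASLR disabled.
 * Only returns (with -1) if the personality change or the exec failed. */
int exec_without_aslr(const char *path, char *const argv[]) {
    int persona = personality(0xffffffff);      /* query current flags */
    if (persona == -1)
        return -1;
    if (personality(without_aslr((unsigned long)persona)) == -1)
        return -1;
    execv(path, argv);                          /* flags survive the exec */
    return -1;
}
```

A wrapper's `main` could call `exec_without_aslr(argv[1], &argv[1])` and report an error if it returns.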

vanhauser-thc commented 4 years ago

yes/no. if you do set the personality flag in the main() it only affects following dlopen calls.

if you fork() the address space of the linked libraries and main() etc. stay the same as they are.

meme commented 4 years ago

OK, I think that forcing compilation with -fno-pie, then calling personality(ADDR_NO_RANDOMIZE) (perhaps this is not necessary with -fno-pie?) in main and then kicking off the fuzzing loop should accomplish this then. I do not want to have to add overhead for ASLR handling so I think it is in our best interest to just disable it entirely. Does this seem reasonable?

WorksButNotTested commented 4 years ago

I guess you wouldn't want to rely upon being able to rebuild the original binary with -fno-pie though? I guess best case would be that the process can make use of ASLR, but the value used to update the bitmap is derived from the result of removing the slide from the current instruction pointer? Obviously if just this was used, it would cause collisions between the same offset within different modules and hence perhaps you might want to replace the module base with something else module specific, but not influenced by ASLR, e.g. the index of the module in the module list, or a checksum of its path etc?
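To make the suggestion concrete, here is a sketch that removes the slide and substitutes a checksum of the module path for the module base. The `struct module` layout and the function names are assumptions for illustration; in practice the module list would come from whatever the instrumentation framework reports. The checksum is 64-bit FNV-1a.

```c
#include <stddef.h>
#include <stdint.h>

struct module {
    const char *path;  /* e.g. "/usr/lib/libfoo.so" */
    uint64_t    base;  /* load address in this run (affected by ASLR) */
    uint64_t    size;  /* size of the mapping in bytes */
};

/* FNV-1a over the path: depends on the name only, not on the load address. */
uint64_t path_hash(const char *s) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Writes a slide-independent ID for pc and returns 1, or returns 0 if pc
 * is not inside any known module. */
int stable_id(const struct module *mods, size_t n, uint64_t pc, uint64_t *out) {
    for (size_t i = 0; i < n; i++) {
        if (pc >= mods[i].base && pc - mods[i].base < mods[i].size) {
            *out = path_hash(mods[i].path) + (pc - mods[i].base);
            return 1;
        }
    }
    return 0;
}
```

The same instruction yields the same ID in every run regardless of where the module was loaded, and the same offset in two different modules yields different IDs unless the path hashes happen to line up.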

If it is necessary to remove ASLR from the equation entirely though, perhaps this could be done by using LD_PRELOAD to load a binary which simply uses personality to disable it. By preloading, it should load before the loader loads any subsequent modules and in theory cause the change to take effect before the actual program is loaded.
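A sketch of what such a preloaded object might contain, with some assumptions stated up front. By the time a constructor in a preloaded object runs, the executable and its linked libraries are already mapped, so this sketch re-executes the program once after setting the flag rather than relying on the change taking effect in place. It assumes Linux and glibc (glibc passes `argc`/`argv`/`envp` to ELF constructors, which is an extension), and `HOTWAX_NOASLR_REEXEC` is a made-up guard variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/personality.h>
#include <unistd.h>

#define REEXEC_GUARD "HOTWAX_NOASLR_REEXEC"  /* hypothetical variable name */

/* Pure helper: does this personality value already have ASLR disabled? */
int aslr_disabled(int persona) {
    return persona != -1 && (persona & ADDR_NO_RANDOMIZE) != 0;
}

/* Runs when the object is loaded. Receiving argv here is a glibc
 * extension, not something ISO C or POSIX promises. */
__attribute__((constructor))
static void reexec_without_aslr(int argc, char **argv, char **envp) {
    (void)argc;
    (void)envp;
    int persona = personality(0xffffffff);      /* query only */
    if (aslr_disabled(persona) || getenv(REEXEC_GUARD) != NULL)
        return;                                 /* done, or already tried once */
    if (persona == -1 ||
        personality((unsigned long)persona | ADDR_NO_RANDOMIZE) == -1)
        return;                                 /* not permitted: keep ASLR */
    setenv(REEXEC_GUARD, "1", 1);               /* guard against an exec loop */
    execv("/proc/self/exe", argv);              /* restart with a fresh layout */
    /* execv returns only on failure; carry on with the current layout. */
}
```

Built as a shared object (for example with `gcc -shared -fPIC`) and named in `LD_PRELOAD`, this would restart the target once with the flag set. If the `personality` call is refused, the program simply runs with ASLR as before.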

If we want to remove the need to compile the target to be fuzzed (which may be quite useful in some circumstances) then it won't be statically linked to the hotwax and FRIDA devkit stuff and this would then be a separate shared object. Since this portion wouldn't want to be fuzzed anyway, then whether it is loaded using ASLR or not should not impact the coverage information and hence this separate shared object could call personality and would also be suitable for preloading?

vanhauser-thc commented 4 years ago

if you fuzz shared libraries then you don't need it that complex, just do personality + dlopen

Fuzzing binary only programs ... that you compile yourself ... does not make sense IMHO. if you have source instrumentation it will always be better because you can instrument cmp, string compares, etc. with little overhead.

WorksButNotTested commented 4 years ago

Yeah of course. Just you can't dlopen an executable, but you can use LD_PRELOAD on them. PIE "executables" may be ok as they are actually shared libraries anyway. Just not 100% sure that if you use personality in the preloaded shared object it will take effect in time.

Makes sense like you say to do source instrumentation if you can compile the source into a representative binary.

meme commented 4 years ago

To clarify on my -fno-pie comment: I was wondering more if enabling -fno-pie on hotwax itself implied that all libraries loaded should also not use ASLR. The point of hotwax is to fuzz closed-source software that you cannot compile-time instrument. Regardless, I will investigate with personality(ADDR_NO_RANDOMIZE), but I imagine that we will end up using the module map to calculate some "fixed" slide that is calculated per module to ensure that modules do not overlap but also do not slide each time we fork().
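One way the "fixed" per-module slide could be computed from a module map is sketched below. This is only an illustration of the idea, not hotwax code: the struct and function names are hypothetical, a 4 KiB page size is assumed, and the result is only stable if the module list comes back in the same order on every run.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 0x1000u  /* assumption: 4 KiB pages */

struct mapped_module {
    uint64_t runtime_base;  /* where the loader put the module in this run */
    uint64_t size;          /* size of the mapping in bytes */
    uint64_t fixed_base;    /* slide-free base assigned below */
};

/* Lay the modules out back to back, page aligned, in module-list order,
 * so that no two modules overlap in the fixed layout. */
void assign_fixed_bases(struct mapped_module *mods, size_t n) {
    uint64_t next = 0;
    for (size_t i = 0; i < n; i++) {
        mods[i].fixed_base = next;
        next += (mods[i].size + PAGE_SIZE_ASSUMED - 1)
                & ~(uint64_t)(PAGE_SIZE_ASSUMED - 1);
    }
}

/* Translate a runtime address into the fixed layout.
 * Returns UINT64_MAX if pc is not inside any known module. */
uint64_t to_fixed(const struct mapped_module *mods, size_t n, uint64_t pc) {
    for (size_t i = 0; i < n; i++) {
        if (pc >= mods[i].runtime_base &&
            pc - mods[i].runtime_base < mods[i].size)
            return mods[i].fixed_base + (pc - mods[i].runtime_base);
    }
    return UINT64_MAX;
}
```

The translated address could then be fed into the bitmap hash in place of the raw instruction pointer, so the coverage map no longer depends on where each module happened to load.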