LibAFL does not poison over-allocated memory for an input

landaire commented 2 years ago

IMPORTANT

You have verified that the issue to be present in the current main branch

Yes

Describe the bug

LibAFL uses Vec<u8> as a backing structure for BytesInput which will over-allocate memory when inserting/appending bytes, and will not deallocate memory when removing bytes. LibAFL does not poison the delta between the vector's len() and capacity(), which may reduce ASAN's effectiveness in detecting small out-of-bounds reads since memory between the vector's len() and capacity() offsets are considered addressable.

Expected behavior

LibAFL should call into ASAN to poison the delta bytes, or provide an API to get the number of delta bytes, when handing off the BytesInput to the harness.

If this does not work, LibAFL should try to guarantee the minimum number of bytes are allocated by calling input.bytes_mut().reserve_exact() when inserting bytes.

Additional context

I noted this issue when I was evaluating using LibAFL as a backend for a LibFuzzer replacement I'm working on. I tried to see if LibAFL provides non-mutable access to the backing vector that would work in the harness callback to do this myself and couldn't find a way to do so. I then looked to see if LibAFL takes this into account on its own, and it appears as though it may be a gap in the current implementation.

The way I approach this problem is very similar to libFuzzer/honggfuzz's approach. I weak link the __{m,a}san_{un}poison_memory_region functions, and poison/unpoison the appropriate memory range at the start and end of the fuzzing loop (respectively).

landaire commented 2 years ago

As an aside, I wouldn't mind making a PR that either:

Exposes an immutable reference to the backing data structure so that harness implementations can choose how to poison/unpoison memory
or
Automatically poisons/unpoisons memory similar to my own approach when using InProcessExecutor

Let me know if this is indeed a valid issue first and what approach is preferable.

tokatoka commented 2 years ago

Maybe shrink or reserve_exact is not a solution here..? as the doc says

allocator may give the collection more space than it requests. Therefore, capacity can not be relied upon to be precisely minima

tokatoka commented 2 years ago

I think we can make a feature flag that are used combined with asan with that flag InProcessExecutor poisons/unpoisons the input region

andreafioraldi commented 2 years ago

I'm aware of this issue, we already approached in AFL++. The user atm can simply call the asan user poisoning API in the harness, but I guess if would be better to do it in the libfuzzer compatibility interface in libafl_targets I agree

andreafioraldi commented 2 years ago

I think we can make a feature flag that are used combined with asan with that flag InProcessExecutor poisons/unpoisons the input region

weak symbols will do the trick without any flag, maybe a bit a PITA on windows

andreafioraldi commented 2 years ago

@landaire isn't #[linkage = "extern_weak"] unstable? We have most of this code in C due to it, but maybe they stabilized it and I missed it. Coll project btw, if you need pointers to integrate libafl for scaling feel free to ping me on the fuzzing discord.

landaire commented 2 years ago

@landaire isn't #[linkage = "extern_weak"] unstable? We have most of this code in C due to it, but maybe they stabilized it and I missed it

Yes, it's a nightly-only feature.

The user atm can simply call the asan user poisoning API in the harness

Could you explain (even at a high level is fine) how that would be done with the current API? I didn't see an easy way to grab the necessary info from the &BytesInput in the harness callback function, but I was looking at the baby_fuzzer example so maybe there's a more advanced technique I missed.

landaire commented 2 years ago

Coll project btw, if you need pointers to integrate libafl for scaling feel free to ping me on the fuzzing discord.

Thanks! I may just take you up on that

AFLplusplus / LibAFL

LibAFL does not poison over-allocated memory for an input #598