Closed Ben-Lichtman closed 2 years ago
I strongly suggest setting up a fuzzing harness for this library - My suggestion would be to use cargo fuzz
- this makes the process very easy https://rust-fuzz.github.io/book/introduction.html
For your convenience, I have included 11 crashing files and 5665 infinite loops / exhaustion from my quick fuzzing run with cargo fuzz
And this is the harness I used with it:
#![no_main]
use cfb::CompoundFile;
use libfuzzer_sys::fuzz_target;
use std::io::{Cursor, Read};
fuzz_target!(|data: &[u8]| {
let cursor = Cursor::new(data);
let mut cfb = match CompoundFile::open(cursor) {
Ok(cfb) => cfb,
Err(_) => return,
};
let stream_paths = cfb
.walk()
.filter(|e| e.is_stream())
.map(|e| e.path().to_path_buf())
.collect::<Vec<_>>();
let paths = stream_paths
.into_iter()
.map(|s| {
let mut data = Vec::new();
cfb.open_stream(&s)?.read_to_end(&mut data)?;
Ok((s, data))
})
.collect::<Result<std::collections::HashMap<_, _>, std::io::Error>>()
.unwrap();
});
Thanks. Totally agreed—it is the intention that the debug_assert
s in this crate are only for checking internal invariants, and that invalid/untrusted input should never cause a panic. There's some amount of up-front validation when opening a CFB file (e.g. Allocator::validate()
) that attempts to establish these invariants on untrusted input, so that we can freely debug_assert!
thereafter. However, as we saw in issue #16, there is a significant gap between what validation is currently done (not enough) and what invariants the code currently assumes (too many).
Fuzzing would definitely help find and close more of these gaps. Unfortunately, rust-cfb
is a spare-time project for me, and it seems like my development environment is not well set up for using cargo fuzz
. So PRs are always appreciated. (-: In the meantime, I'll look into fixing some of the issues revealed by the fuzz examples you found.
I'm going to close this issue now since our fuzzing has not uncovered more of these panics so far
This is related to https://github.com/mdsteele/rust-cfb/issues/16
Reading thorough the source code, there seems to be many places where
debug_assert
s are used as guards for things like indexing, or places where indexing is used but the potential for indexing outside of the slice bounds is possible - especially through adversarial or fuzzed inputs.This poses problems for library consumers since this library may now panic when fed arbitrary data rather than returning errors politely, and so any usage of this library on untrusted input must be enclosed in a
catch_unwind
to handle panics and prevent denial-of-service attacks.Please could all potential invalid indexing (indexing based off file contents) be properly error-handled so the library can not crash when give arbitrary inputs.
debug_assert
should really only be used for internal sanity checking rather than making any claims about the input data.