mdsteele / rust-cfb

Rust library for reading/writing Compound File Binary (structured storage) files
MIT License
46 stars 20 forks source link

High potential for panicking through indexing #18

Closed Ben-Lichtman closed 2 years ago

Ben-Lichtman commented 2 years ago

This is related to https://github.com/mdsteele/rust-cfb/issues/16

Reading thorough the source code, there seems to be many places where debug_asserts are used as guards for things like indexing, or places where indexing is used but the potential for indexing outside of the slice bounds is possible - especially through adversarial or fuzzed inputs.

This poses problems for library consumers since this library may now panic when fed arbitrary data rather than returning errors politely, and so any usage of this library on untrusted input must be enclosed in a catch_unwind to handle panics and prevent denial-of-service attacks.

Please could all potential invalid indexing (indexing based off file contents) be properly error-handled so the library can not crash when give arbitrary inputs. debug_assert should really only be used for internal sanity checking rather than making any claims about the input data.

Ben-Lichtman commented 2 years ago

I strongly suggest setting up a fuzzing harness for this library - My suggestion would be to use cargo fuzz - this makes the process very easy https://rust-fuzz.github.io/book/introduction.html

Ben-Lichtman commented 2 years ago

For your convenience, I have included 11 crashing files and 5665 infinite loops / exhaustion from my quick fuzzing run with cargo fuzz

fuzz_total.zip

And this is the harness I used with it:

#![no_main]
use cfb::CompoundFile;
use libfuzzer_sys::fuzz_target;
use std::io::{Cursor, Read};

fuzz_target!(|data: &[u8]| {
    let cursor = Cursor::new(data);
    let mut cfb = match CompoundFile::open(cursor) {
        Ok(cfb) => cfb,
        Err(_) => return,
    };
    let stream_paths = cfb
        .walk()
        .filter(|e| e.is_stream())
        .map(|e| e.path().to_path_buf())
        .collect::<Vec<_>>();

    let paths = stream_paths
        .into_iter()
        .map(|s| {
            let mut data = Vec::new();
            cfb.open_stream(&s)?.read_to_end(&mut data)?;
            Ok((s, data))
        })
        .collect::<Result<std::collections::HashMap<_, _>, std::io::Error>>()
        .unwrap();
});
mdsteele commented 2 years ago

Thanks. Totally agreed—it is the intention that the debug_asserts in this crate are only for checking internal invariants, and that invalid/untrusted input should never cause a panic. There's some amount of up-front validation when opening a CFB file (e.g. Allocator::validate()) that attempts to establish these invariants on untrusted input, so that we can freely debug_assert! thereafter. However, as we saw in issue #16, there is a significant gap between what validation is currently done (not enough) and what invariants the code currently assumes (too many).

Fuzzing would definitely help find and close more of these gaps. Unfortunately, rust-cfb is a spare-time project for me, and it seems like my development environment is not well set up for using cargo fuzz. So PRs are always appreciated. (-: In the meantime, I'll look into fixing some of the issues revealed by the fuzz examples you found.

Ben-Lichtman commented 2 years ago

I'm going to close this issue now since our fuzzing has not uncovered more of these panics so far