mdsteele / rust-cfb

Rust library for reading/writing Compound File Binary (structured storage) files
MIT License
44 stars 20 forks source link

Reading is very slow for larger files #57

Open jon-zu opened 1 week ago

jon-zu commented 1 week ago

I'm using the msi crate, but after some profiling, I think this crate might be the better place to report that.

Code:

        let setup_file = BufReader::new(File::open("out_is/Setup.msi").unwrap());
        let mut msi = msi::Package::open(setup_file).unwrap();
        let mut stream = msi.read_stream("Data1.cab").unwrap();
        let mut max = stream.take(1024 * 1024 * 50); 
        let mut out_dir = BufWriter::new(File::create("out_is/Data1.cab").unwrap());
        std::io::copy(&mut max, &mut out_dir).unwrap();

Reading 50 Mb of that cab stream takes roughly one minute on my pc. I've also created a flame graph, but I'm not really sure where the problem lies. Is the loop in Chain::new taking a lot of time maybe? flamegraph

mdsteele commented 1 week ago

Just to check, is this timing coming from a --release build?

jon-zu commented 6 days ago

Just to check, is this timing coming from a --release build?

Yes, I can see later today If I can make a small, reproducible example. I'm making a small tool to extract InstallShield installer, where I extract the .msi from and exe.

jon-zu commented 5 days ago

I've built a small example here. 7z can extract this in a few seconds, It seems the StreamReader is slower than 1mb/s.

MSI file(~1gb): https://drive.google.com/file/d/1TqfK1UDhhv1pbfKZe0yWDAHZ-_azVsoT/view?usp=drive_link

Flamegraph for the example(I've disabled printing for that one): flamegraph

It seems the svg is not really working here, I've also added It to the repo.