Closed philippeitis closed 4 years ago
So with this PR I was getting index out of bounds when I tried this sample code with the book you sent me or with any book i tried to open.
use mobi::Mobi;
use std::env;
fn main() -> Result<(), std::io::Error> {
let m = Mobi::from_path(env::args().skip(1).take(1).next().expect("Missing argument"))?;
Ok(())
}
❯ RUST_BACKTRACE=1 ./target/release/mobi ~/dev/rust/mobi-rs/example.mobi
thread 'main' panicked at 'index out of bounds: the len is 30 but the index is 33', src/lz77.rs:30:58
stack backtrace:
0: rust_begin_unwind
at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/std/src/panicking.rs:475
1: core::panicking::panic_fmt
at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/panicking.rs:85
2: core::panicking::panic_bounds_check
at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/core/src/panicking.rs:62
3: mobi::lz77::decompress_lz77
4: mobi::record::Record::parse_records
5: mobi::Mobi::from_reader
6: mobi::Mobi::from_path
7: mobix::main
What fixed it is checking if the offset is within the bounds of text not the data length. I will display it below.
Thanks for all the work :) I'll try to release a fixed version soon. I want to include some more changes.
Its possible for the mobi header to be a length other than 232, and have padding bytes of 0xFFFF_FFFF at the end. The previous assumption that this wasn't possible caused a bug where parsing the exth data would read these padding bytes 0xFFFF_FFFF - and then the program would try to read 4GB of data from the mobi.
This PR changes ExtHeader::parse to take the header_len and ExtHeader::get_records to take an offset to find the EXTH header. A few other stylistic changes are included (which should hopefully make things easier to work with).
I have also simplified lz77.rs to avoid using Cursor - which allows us to return a
Vec<u8>
instead of anio::Error<Vec<u8>>
and prevents unexpected errors.This resolves #5, and at least for me, makes all of my mobi files open correctly.
As an additional note, since this crate is provided as a library, it would be nice if parsing the headers returned errors for unexpected / problematic values, since this would help make identifying malformed headers a lot easier (and would make this library a lot safer to use in other applications). For instance, testing that the record lengths are correct (no longer than the file length and at least 8 elements long), or that the EXTH identifier is always EXTH would have been helpful when debugging this specific problem, and I imagine it would make future issues a lot easier to resolve.