pdf-rs / pdf

Rust library to read, manipulate and write PDF files.

Failure to read PDF with image #198

Open nadenf opened 11 months ago

nadenf commented 11 months ago

I'm using this code to extract image data:

    for (i, o) in images.iter().enumerate() {
        // Skip anything that is not an image XObject.
        let img = match **o {
            XObject::Image(ref im) => im,
            _ => continue,
        };

        let data = img.image_data(&resolver)?;

        // Panics here: the buffer expects 3 bytes per pixel.
        let mut rgb_img: RgbImage = ImageBuffer::new(img.width, img.height);
        rgb_img.copy_from_slice(data.as_bytes());
    }

Fails with this error:

thread 'indexer_pdf::tests::test_indexing' panicked at 'source slice length (5250) does not match destination slice length (15750)', src/indexer/document/src/indexer_pdf.rs:127:21
stack backtrace:
   0: rust_begin_unwind
             at /rustc/903e279f468590fa3425f8aff7f3d61a5a873dbb/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/903e279f468590fa3425f8aff7f3d61a5a873dbb/library/core/src/panicking.rs:67:14
   2: core::slice::<impl [T]>::copy_from_slice::len_mismatch_fail
             at /rustc/903e279f468590fa3425f8aff7f3d61a5a873dbb/library/core/src/slice/mod.rs:3603:13
   3: core::slice::<impl [T]>::copy_from_slice
             at /rustc/903e279f468590fa3425f8aff7f3d61a5a873dbb/library/core/src/slice/mod.rs:3610:13

With this PDF: Sample1.pdf

s3bk commented 11 months ago

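Not all images are RGB images. An RgbImage needs 3 bytes per pixel, and your destination is 15750 bytes, so the image has 5250 pixels; the decoded data is 5250 bytes, i.e. one byte per pixel rather than three. You have to look at color_space to find out what the image actually is.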
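
For illustration, here is a minimal sketch of handling the data by layout instead of assuming RGB. It does not use the pdf crate's API: `PixelLayout` and `to_rgb8` are hypothetical names, and `PixelLayout` stands in for whatever you determine from the image's color_space. It only covers 8-bit grayscale and 8-bit RGB; other color spaces and bit depths would need their own handling.

```rust
/// Hypothetical stand-in for the result of inspecting the image's color space.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PixelLayout {
    /// One byte per pixel (8-bit grayscale).
    Gray8,
    /// Three bytes per pixel (8-bit RGB).
    Rgb8,
}

impl PixelLayout {
    fn bytes_per_pixel(self) -> usize {
        match self {
            PixelLayout::Gray8 => 1,
            PixelLayout::Rgb8 => 3,
        }
    }
}

/// Convert decoded samples to tightly packed RGB8.
/// Returns an error instead of panicking when the length does not match.
fn to_rgb8(
    data: &[u8],
    width: u32,
    height: u32,
    layout: PixelLayout,
) -> Result<Vec<u8>, String> {
    let pixels = width as usize * height as usize;
    let expected = pixels * layout.bytes_per_pixel();
    if data.len() != expected {
        return Err(format!(
            "expected {} bytes for {:?}, got {}",
            expected,
            layout,
            data.len()
        ));
    }
    Ok(match layout {
        // Already RGB: copy as-is.
        PixelLayout::Rgb8 => data.to_vec(),
        // Grayscale: repeat each sample for the R, G and B channels.
        PixelLayout::Gray8 => data.iter().flat_map(|&g| [g, g, g]).collect(),
    })
}
```

With a hypothetical 75x70 grayscale image (5250 bytes), `to_rgb8` returns 15750 bytes, which is the length an RgbImage of that size expects. The resulting Vec can then be handed to the image crate's `ImageBuffer::from_raw(width, height, rgb)` rather than going through `copy_from_slice`.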