Source slice length (1548) does not match destination slice length (774) in JpegDecoder

anfedotoff commented 1 year ago

This happens when I do some fuzzing with AFL++.

Expected

I suppose we don't want to panic

Actual behaviour

I was doing fuzzing with AFL++ using this wrapper. And I found this error: source slice length (1548) does not match destination slice length (774). Here is the stacktrace:

#0  0x00007ffff7c7700b in raise () from /lib/x86_64-linux-gnu/libc.so.6                                                                                                                                     
#1  0x00007ffff7c56859 in abort () from /lib/x86_64-linux-gnu/libc.so.6                                                                                                                                     
#2  0x0000555555695d77 in panic_abort::__rust_start_panic::abort () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/panic_abort/src/lib.rs:42                                                    
#3  0x0000555555695d66 in panic_abort::__rust_start_panic () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/panic_abort/src/lib.rs:37                                                           
#4  0x000055555568a76c in std::panicking::rust_panic () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/std/src/panicking.rs:736                                                                 
#5  0x000055555568a61a in std::panicking::rust_panic_with_hook () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/std/src/panicking.rs:706,                                                       
#6  0x000055555568a359 in std::panicking::begin_panic_handler::{closure#0} () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/std/src/panicking.rs:579",                                           
#7  0x0000555555688fec in std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/std/src/sys_common/backtrace.rs:137                                                                                                                                                                                    
#8  0x000055555568a062 in std::panicking::begin_panic_handler () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/std/src/panicking.rs:575                                                        
#9  0x0000555555579f43 in core::panicking::panic_fmt () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/core/src/panicking.rs:64                                                                 
#10 0x000055555557a4d2 in core::slice::{impl#0}::copy_from_slice::len_mismatch_fail () at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/core/src/slice/mod.rs:3289                                
#11 0x00005555555dbe95 in core::slice::{impl#0}::copy_from_slice<u8> (self=..., src=...) at /rustc/270c94e484e19764a2832ef918c95224eb3f17c7/library/core/src/slice/mod.rs:3296                              
#12 image::codecs::jpeg::decoder::{impl#2}::read_image<std::io::cursor::Cursor<&[u8]>> (self=..., buf=...) at /image/src/codecs/jpeg/decoder.rs:115
#13 0x00005555555f31cf in image::image::decoder_to_vec<u8, image::codecs::jpeg::decoder::JpegDecoder<std::io::cursor::Cursor<&[u8]>>> (decoder=...) at /image/src/image.rs:608
#14 0x00005555555d8f9d in image::dynimage::decoder_to_image<image::codecs::jpeg::decoder::JpegDecoder<std::io::cursor::Cursor<&[u8]>>> (decoder=...) at /image/src/dynimage.rs:1030
#15 image::dynimage::DynamicImage::from_decoder<image::codecs::jpeg::decoder::JpegDecoder<std::io::cursor::Cursor<&[u8]>>> (decoder=...) at /image/src/dynimage.rs:177
#16 0x00005555555e3147 in image::io::free_functions::load_inner::{impl#0}::visit_decoder<image::codecs::jpeg::decoder::JpegDecoder<std::io::cursor::Cursor<&[u8]>>> (decoder=..., self=<error reading variable: Cannot access memory at address 0x10>) at /image/src/io/free_functions.rs:109
#17 image::io::free_functions::load_decoder<std::io::cursor::Cursor<&[u8]>, image::io::free_functions::load_inner::LoadVisitor> (format=<optimized out>, limits=..., r=..., visitor=<error reading variable: Cannot access memory at address 0x10>) at /image/src/io/free_functions.rs:61
#18 image::io::free_functions::load_inner<std::io::cursor::Cursor<&[u8]>> (limits=..., format=<optimized out>, r=...) at /image/src/io/free_functions.rs:113
#19 image::io::free_functions::load<std::io::cursor::Cursor<&[u8]>> (r=..., format=<optimized out>) at /image/src/io/free_functions.rs:37
#20 0x00005555555d41a9 in image::dynimage::load_from_memory_with_format () at /image/src/dynimage.rs:1215
#21 sydr_script_jpeg::main () at /image/fuzz/fuzzers/sydr_script_jpeg.rs:15

I've also done a small investigation. As we could see crash is occurred at decoder.rs:115. But before we've got a vector with wrong size at decoder.rs:109. This buffer is constructed here:

 let mut decoded: Vec<u16> =
            vec![0u16; ncomp * output_size.width as usize * output_size.height as usize];

some gdb output at this point:

(gdb) p ncomp
$4 = 3
(gdb) p frame.output_size 
$5 = jpeg_decoder::parser::Dimensions {width: 86, height: 3}

We could see here that decoded.len() is equal to 774, but it hasVec<u16> type. So, after conversion to Vec<u8> here we've got the wrong 1548 size of buffer . Maybe we could add some size checks to in read_image for that? I could do a PR.

Reproduction steps

If you want to reproduce this, I could provide a crash input and you could follow this instructions.

fintelia commented 1 year ago

A PR would be appreciated! Though ideally we'd correct the underlying reason for the size mismatch (and fix it to either return success or point out how the input jpeg was corrupt), rather than returning a vague "unsupported input" error or something

anfedotoff commented 1 year ago

A PR would be appreciated! Though ideally we'd correct the underlying reason for the size mismatch (and fix it to either return success or point out how the input jpeg was corrupt), rather than returning a vague "unsupported input" error or something

I try to understand the root cause of this issue. The components of source buffer to copy we've got here. It looks legal, we just do some parsing of some image part and get these values. Here we construct destination buffer according to total_bytes(). Is the value of total bytes the maximum size of the buffer that could be? For now I don't have a clue where the best place to check the error... The easiest way is right before the copying.

fintelia commented 1 year ago

The total_bytes() is meant to be the exact number of bytes required to store the image.

IonImpulse commented 1 year ago

This also happens with webp images!

fintelia commented 1 year ago

@IonImpulse That's probably a separate bug. Could you open an issue with an example webp that fails?

anfedotoff commented 1 year ago

Here is some more debugging:

(gdb) p pixel_format
$3 = jpeg_decoder::decoder::PixelFormat::RGB24 // bytes_per_pixel = 3
(gdb) p total_pixels
$10 = 258

total_bytes() // 774

(gdb) p self.frame 
$2 = core::option::Option<jpeg_decoder::parser::FrameInfo>::Some(jpeg_decoder::parser::FrameInfo {is_baseline: false, is_differential: false,
 coding_process: jpeg_decoder::parser::CodingProcess::Lossless, entropy_coding: jpeg_decoder::parser::EntropyCoding::Huffman, precision: 16, 
image_size: jpeg_decoder::parser::Dimensions {width: 86, height: 3}, 
output_size: jpeg_decoder::parser::Dimensions {width: 86, height: 3}, 
mcu_size: jpeg_decoder::parser::Dimensions {width: 6, height: 1}, 
components: alloc::vec::Vec<jpeg_decoder::parser::Component, alloc::alloc::Global> {buf: alloc::raw_vec::RawVec<jpeg_decoder::parser::Component, alloc::alloc::Global> {ptr: core::ptr::unique::Unique<jpeg_decoder::parser::Component> {pointer: core::ptr::non_null::NonNull<jpeg_decoder::parser::Component> {pointer: 0x555555734230}, _marker: core::marker::PhantomData<jpeg_decoder::parser::Component>}, cap: 3, alloc: alloc::alloc::Global}, len: 3}})

So, because of coding_process: jpeg_decoder::parser::CodingProcess::Lossless the output buffer is enlarged by two times here. Could we rely on total_bytes() when we have CodingProcess::Lossless?

anfedotoff commented 1 year ago

Hm, it becomes a little clearer:

identify  -verbose ./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c
identify-im6.q16: Unsupported JPEG process: SOF type 0xc3 `./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c' @ error/jpeg.c/JPEGErrorHandler/335.

https://github.com/image-rs/jpeg-decoder/blob/9fb38adb7168e6cfd9d3516ce072b157b3ba8e0b/src/marker.rs#L73 Is it Ok?

fintelia commented 1 year ago

Total_bytes is calculated as width * height * bytes_per_pixel, so if it is wrong, then the decoder must be reporting one of those values incorrectly. Alternatively, those three could be right and the decode call itself is producing the wrong number of bytes. Do you have a sense of which one it might be?

anfedotoff commented 1 year ago

I suppose, decoder produces the wrong amount of bytes. Because of coding_process: jpeg_decoder::parser::CodingProcess::Lossless. It increases size by two (from 774 the total_bytes value, to 1548). But I also might be wrong. Maybe the problem is in such combination of decoder and total_bytes. This values are calculated independently and could be incompatible. Identify from imagemagick stops parsing this image and returns an error.

HeroicKatora commented 1 year ago

Is it clear yet which of them is correct? Lossless means the decoder will emit a very particular color representation. Maybe it's not using the expected one (i.e. wrong depth).

anfedotoff commented 1 year ago

Is it clear yet which of them is correct? Lossless means the decoder will emit a very particular color representation. Maybe it's not using the expected one (i.e. wrong depth).

I suppose, the problem is somewhere here: https://github.com/image-rs/jpeg-decoder/blob/3f85d497495ecbc47abf4ad0db275767e58c8565/src/parser.rs#L158

identify  -verbose ./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c
identify-im6.q16: Unsupported JPEG process: SOF type 0xc3 `./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c' @ error/jpeg.c/JPEGErrorHandler/335.

Because imagemagick stops parsing at this marker. Jpeginfo tool also stops parsing there. I'll try to find the difference between jpeg-decoder and jpeginfo. I also could provide a crash input. This input panics on libFuzzer fuzz target too.

anfedotoff commented 1 year ago

Is it clear yet which of them is correct? Lossless means the decoder will emit a very particular color representation. Maybe it's not using the expected one (i.e. wrong depth).

I suppose, the problem is somewhere here: https://github.com/image-rs/jpeg-decoder/blob/3f85d497495ecbc47abf4ad0db275767e58c8565/src/parser.rs#L158
identify  -verbose ./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c
identify-im6.q16: Unsupported JPEG process: SOF type 0xc3 `./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c' @ error/jpeg.c/JPEGErrorHandler/335.
Because imagemagick stops parsing at this marker. Jpeginfo tool also stops parsing there. I'll try to find the difference between jpeg-decoder and jpeginfo. I also could provide a crash input. This input panics on libFuzzer fuzz target too.

I think, I'm wrong... libjpeg just doesn't support this marker.

root@splash:~/jpeginfo# LD_LIBRARY_PATH=/jpeg-9e/release/lib/  ./release/bin/jpeginfo -V                                                                                 
jpeginfo v1.7.1beta  x86_64-unknown-linux-gnu (Feb 13 2023)
Copyright (C) 1996-2023 Timo Kokkonen

This program comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
See the GNU General Public License for more details.

libjpeg version: 9e  16-Jan-2022
Copyright (C) 2022, Thomas G. Lane, Guido Vollbeding
root@splash:~/jpeginfo# grep -rn "case M_SOF3:" -B 2 -A 9 /jpeg-9e/jdmarker.c

1139-
1140-    /* Currently unsupported SOFn types */
1141:    case M_SOF3:           /* Lossless, Huffman */
1142-    case M_SOF5:           /* Differential sequential, Huffman */
1143-    case M_SOF6:           /* Differential progressive, Huffman */
1144-    case M_SOF7:           /* Differential lossless, Huffman */
1145-    case M_JPG:                    /* Reserved for JPEG extensions */
1146-    case M_SOF11:          /* Lossless, arithmetic */
1147-    case M_SOF13:          /* Differential sequential, arithmetic */
1148-    case M_SOF14:          /* Differential progressive, arithmetic */
1149-    case M_SOF15:          /* Differential lossless, arithmetic */
1150-      ERREXIT1(cinfo, JERR_SOF_UNSUPPORTED, cinfo->unread_marker);

Shnatsel commented 7 months ago

This is fixed on the next-version-0.25 branch that switches to zune-jpeg as the JPEG decoding backend.

I've already extensively fuzzed the zune-jpeg crate in isolation. However, the integration might have other issues we are not yet aware of, so fuzzing it through the image interface would be much appreciated.

image-rs / image