Closed: anfedotoff closed this issue 7 months ago
A PR would be appreciated! Though ideally we'd correct the underlying reason for the size mismatch (and fix it to either return success or point out how the input jpeg was corrupt), rather than returning a vague "unsupported input" error or something
I'm trying to understand the root cause of this issue. Here we get the components of the source buffer to copy. They look legal: we just parse some part of the image and get these values. Here we construct the destination buffer according to total_bytes(). Is the value of total_bytes() the maximum size the buffer could have? For now I don't have a clue where the best place to check for the error is... The easiest way is to check right before the copying.
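A minimal sketch of what a check "right before the copying" could look like. The names checked_copy and SizeMismatch are hypothetical and exist only for this illustration; they are not part of image or jpeg-decoder, and this is not necessarily how the crate would report the error.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not an `image` crate type).
#[derive(Debug, PartialEq)]
pub struct SizeMismatch {
    pub expected: usize,
    pub actual: usize,
}

impl fmt::Display for SizeMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "decoder produced {} bytes, but the output buffer holds {}",
            self.actual, self.expected
        )
    }
}

impl std::error::Error for SizeMismatch {}

/// Copies `decoded` into `dst`, returning an error instead of panicking
/// when the lengths differ (`copy_from_slice` panics in that case).
pub fn checked_copy(dst: &mut [u8], decoded: &[u8]) -> Result<(), SizeMismatch> {
    if dst.len() != decoded.len() {
        return Err(SizeMismatch {
            expected: dst.len(),
            actual: decoded.len(),
        });
    }
    dst.copy_from_slice(decoded);
    Ok(())
}
```

With a 774-byte destination and a 1548-byte source, checked_copy returns an Err carrying both lengths rather than aborting. This only turns the panic into an error; it does not address why the two sizes disagree.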
The total_bytes() is meant to be the exact number of bytes required to store the image.
This also happens with webp images!
@IonImpulse That's probably a separate bug. Could you open an issue with an example webp that fails?
Here is some more debugging:
(gdb) p pixel_format
$3 = jpeg_decoder::decoder::PixelFormat::RGB24 // bytes_per_pixel = 3
(gdb) p total_pixels
$10 = 258
total_bytes() // 774
(gdb) p self.frame
$2 = core::option::Option<jpeg_decoder::parser::FrameInfo>::Some(jpeg_decoder::parser::FrameInfo {is_baseline: false, is_differential: false,
coding_process: jpeg_decoder::parser::CodingProcess::Lossless, entropy_coding: jpeg_decoder::parser::EntropyCoding::Huffman, precision: 16,
image_size: jpeg_decoder::parser::Dimensions {width: 86, height: 3},
output_size: jpeg_decoder::parser::Dimensions {width: 86, height: 3},
mcu_size: jpeg_decoder::parser::Dimensions {width: 6, height: 1},
components: alloc::vec::Vec<jpeg_decoder::parser::Component, alloc::alloc::Global> {buf: alloc::raw_vec::RawVec<jpeg_decoder::parser::Component, alloc::alloc::Global> {ptr: core::ptr::unique::Unique<jpeg_decoder::parser::Component> {pointer: core::ptr::non_null::NonNull<jpeg_decoder::parser::Component> {pointer: 0x555555734230}, _marker: core::marker::PhantomData<jpeg_decoder::parser::Component>}, cap: 3, alloc: alloc::alloc::Global}, len: 3}})
So, because of coding_process: jpeg_decoder::parser::CodingProcess::Lossless, the output buffer is doubled in size here. Can we rely on total_bytes() when we have CodingProcess::Lossless?
Hm, it becomes a little clearer:
identify -verbose ./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c
identify-im6.q16: Unsupported JPEG process: SOF type 0xc3 `./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c' @ error/jpeg.c/JPEGErrorHandler/335.
total_bytes() is calculated as width * height * bytes_per_pixel, so if it is wrong, then the decoder must be reporting one of those values incorrectly. Alternatively, those three could be right and the decode call itself is producing the wrong number of bytes. Do you have a sense of which one it might be?
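To make the arithmetic concrete, here is a small sketch of the width * height * bytes_per_pixel formula applied to the values from the gdb session above. The function name expected_bytes is hypothetical; it mirrors the formula as described in this thread, not the crate's actual code.

```rust
/// Expected buffer size in bytes, or `None` if the product overflows `u64`.
/// Hypothetical helper mirroring width * height * bytes_per_pixel.
pub fn expected_bytes(width: u32, height: u32, bytes_per_pixel: u8) -> Option<u64> {
    u64::from(width)
        .checked_mul(u64::from(height))?
        .checked_mul(u64::from(bytes_per_pixel))
}
```

For the 86 x 3 frame, expected_bytes(86, 3, 3) gives Some(774), matching RGB24. With 6 bytes per pixel (three 16-bit samples) the same frame gives Some(1548), which is the source length in the panic message.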
I suppose the decoder produces the wrong number of bytes because of coding_process: jpeg_decoder::parser::CodingProcess::Lossless. It doubles the size (from 774, the total_bytes() value, to 1548). But I might also be wrong. Maybe the problem is in this particular combination of decoder output and total_bytes(). These values are calculated independently and could be incompatible. identify from ImageMagick stops parsing this image and returns an error.
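The doubling can be reproduced in isolation. The sketch below assumes the decoder hands back 16-bit samples that are then flattened into bytes; samples_to_bytes is a hypothetical helper for illustration, not the decoder's actual conversion code.

```rust
/// Flattens 16-bit samples into bytes in native byte order.
/// Each `u16` sample becomes two `u8` values, so the length doubles.
pub fn samples_to_bytes(samples: &[u16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_ne_bytes()).collect()
}
```

Passing 774 samples (258 pixels with 3 components each) yields 1548 bytes, which no longer fits a destination sized for 774 bytes.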
Is it clear yet which of them is correct? Lossless means the decoder will emit a very particular color representation. Maybe it's not using the expected one (i.e. wrong depth).
I suppose, the problem is somewhere here: https://github.com/image-rs/jpeg-decoder/blob/3f85d497495ecbc47abf4ad0db275767e58c8565/src/parser.rs#L158
identify -verbose ./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c
identify-im6.q16: Unsupported JPEG process: SOF type 0xc3 `./jpeg-afl++-out/casr/cl2/crash-01e1720db079eac4301b592e924d97b022f1639c' @ error/jpeg.c/JPEGErrorHandler/335.
I suspect this because ImageMagick stops parsing at this marker. The jpeginfo tool also stops parsing there. I'll try to find the difference between jpeg-decoder and jpeginfo. I could also provide a crash input. This input panics on a libFuzzer fuzz target too.
I think I was wrong... libjpeg just doesn't support this marker.
root@splash:~/jpeginfo# LD_LIBRARY_PATH=/jpeg-9e/release/lib/ ./release/bin/jpeginfo -V
jpeginfo v1.7.1beta x86_64-unknown-linux-gnu (Feb 13 2023)
Copyright (C) 1996-2023 Timo Kokkonen
This program comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
See the GNU General Public License for more details.
libjpeg version: 9e 16-Jan-2022
Copyright (C) 2022, Thomas G. Lane, Guido Vollbeding
root@splash:~/jpeginfo# grep -rn "case M_SOF3:" -B 2 -A 9 /jpeg-9e/jdmarker.c
1139-
1140- /* Currently unsupported SOFn types */
1141: case M_SOF3: /* Lossless, Huffman */
1142- case M_SOF5: /* Differential sequential, Huffman */
1143- case M_SOF6: /* Differential progressive, Huffman */
1144- case M_SOF7: /* Differential lossless, Huffman */
1145- case M_JPG: /* Reserved for JPEG extensions */
1146- case M_SOF11: /* Lossless, arithmetic */
1147- case M_SOF13: /* Differential sequential, arithmetic */
1148- case M_SOF14: /* Differential progressive, arithmetic */
1149- case M_SOF15: /* Differential lossless, arithmetic */
1150- ERREXIT1(cinfo, JERR_SOF_UNSUPPORTED, cinfo->unread_marker);
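For reference, the SOF type 0xc3 reported by ImageMagick is the SOF3 case in the listing above. The sketch below maps the second byte of a start-of-frame marker to its coding process, following the standard JPEG marker assignments; the function names sof_process and is_lossless are hypothetical and not taken from libjpeg or jpeg-decoder.

```rust
/// Describes a JPEG start-of-frame marker, given the byte that follows 0xFF.
/// Returns `None` for bytes that are not SOF markers (e.g. 0xC4 is DHT).
pub fn sof_process(marker: u8) -> Option<&'static str> {
    match marker {
        0xC0 => Some("Baseline DCT, Huffman"),
        0xC1 => Some("Extended sequential DCT, Huffman"),
        0xC2 => Some("Progressive DCT, Huffman"),
        0xC3 => Some("Lossless, Huffman"),
        0xC5 => Some("Differential sequential, Huffman"),
        0xC6 => Some("Differential progressive, Huffman"),
        0xC7 => Some("Differential lossless, Huffman"),
        0xC9 => Some("Extended sequential DCT, arithmetic"),
        0xCA => Some("Progressive DCT, arithmetic"),
        0xCB => Some("Lossless, arithmetic"),
        0xCD => Some("Differential sequential, arithmetic"),
        0xCE => Some("Differential progressive, arithmetic"),
        0xCF => Some("Differential lossless, arithmetic"),
        _ => None,
    }
}

/// True for the four lossless SOF markers (SOF3, SOF7, SOF11, SOF15).
pub fn is_lossless(marker: u8) -> bool {
    matches!(marker, 0xC3 | 0xC7 | 0xCB | 0xCF)
}
```

Here sof_process(0xC3) returns Some("Lossless, Huffman"), the process libjpeg 9e rejects with JERR_SOF_UNSUPPORTED.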
This is fixed on the next-version-0.25 branch that switches to zune-jpeg as the JPEG decoding backend.
I've already extensively fuzzed the zune-jpeg crate in isolation. However, the integration might have other issues we are not yet aware of, so fuzzing it through the image interface would be much appreciated.
This happens when I do some fuzzing with AFL++.
Expected
I suppose we don't want to panic
Actual behaviour
I was fuzzing with AFL++ using this wrapper, and I found this error: source slice length (1548) does not match destination slice length (774). Here is the stacktrace:
I've also done a small investigation. As we can see, the crash occurs at decoder.rs:115. But before that, we get a vector with the wrong size at decoder.rs:109. This buffer is constructed here:
Some gdb output at this point:
We can see here that decoded.len() is equal to 774, but it has type Vec<u16>. So, after the conversion to Vec<u8> here, we get the wrong buffer size of 1548. Maybe we could add some size checks to read_image for that? I could do a PR.
Reproduction steps
If you want to reproduce this, I can provide a crash input and you can follow these instructions.