strukturag / libheif

libheif is an HEIF and AVIF file format decoder and encoder.
Other
1.66k stars 294 forks source link

poc of HEIC crash #784

Open hackerfactor opened 1 year ago

hackerfactor commented 1 year ago

A user just uploaded to my site a file called "poc.heic" that causes libheif to crash.

The HEIC file structure is valid. The error is coming from somewhere deeper.

The HEIC file contains 4 images. Primary, thumb1, and thumb2 will crash. Thumb3 renders as really corrupted.

I have temporarily placed the file at https://hackerfactor.com/private/poc-heic.zip. The zip contains poc.heic.

farindk commented 1 year ago

Thanks, but I could not reproduce this. Even with older versions of libheif and libde265. Can you give more details? Which versions of libheif and libde265? Which OS? Stack-trace?

hackerfactor commented 1 year ago

@farindk I'm seeing the crash with old and new libraries.

Tested: Set1: "libaom.so.3.0.0", "libde265.so.0.1.1", "libx265.so.200", "libheif.so.1.11.0" (really old) Set2: "libaom.so.3.3.0", "libde265.so.0.1.1", "libx265.so.201", "libheif.so.1.12.0" (old) Set3: "libaom.so.3.6.0", "libde265.so.0.1.4", "libx265.so.207", "libheif.so.1.15.1" (new/current)

OS: Ubuntu 20.04.5 LTS NOTE: I'm compiling from source, not using the repository libraries.

Compiling method: aom: rm -rf aom git clone https://aomedia.googlesource.com/aom (cd aom/build ; cmake -DBUILD_SHARED_LIBS=1 .. ; make)

libde265: rm -rf libde265 git clone https://github.com/strukturag/libde265.git (cd libde265 ; ./autogen.sh ; ./configure --disable-sherlock265 ; make)

x265_git: rm -rf x265_git git clone https://bitbucket.org/multicoreware/x265_git.git (cd x265_git/build ; cmake ../source ; make)

libheif: rm -rf libheif git clone https://github.com/strukturag/libheif.git

Use gcc-4.8 for maximum cross-platform compatibility; not all Linux are binary compatible

    (cd libheif ; ./autogen.sh ; CC=gcc-4.8 CXX=g++-4.8 ./configure ; make)

I haven't tried a stack trace yet.

hackerfactor commented 1 year ago

Oh interesting... libheif/examples/ heif-info, heif-convert both work (no crash). Looks like the problem may be in how my code is calling the library. (Diving deeper.)

hackerfactor commented 1 year ago

Oh, I found the problem. The ispe says the image should be 1280x4439. However, the decoded image from heif_decode_image() is 1280x854. (I'm trying to copy 1280x4439, so it fails.)

I'm not seeing any decode errors. Shouldn't there be something that denotes a truncated image?

farindk commented 1 year ago

In my opinion, the ispe specification is bad. It (informally) specifies the raw image size before it is being transformed (rotated, cropped, ...). I don't see why this should be of any value to the user, who is only interested in the final image size. Moreover, the ispe size might vary from the actual encoded size. But we have to live with the standard as it is.

My advice would be to completely ignore the ispe content.

I'm cross linking this to the related #773.

hackerfactor commented 1 year ago

Without ispe, how do you recommend/propose tools like ExifTool identify the image size without rendering the image?

silverbacknet commented 1 year ago

As far as I can tell, if there are any transforms beyond a simple crop, you basically have to run the entire compositing engine anyway to obtain the final canvas size, even if you don't perform the pixel manipulation part of it.

bigcat88 commented 1 year ago

@hackerfactor can I take this corrupted image in my test's suite?

hackerfactor commented 1 year ago

@bigcat88 The corrupt image isn't mine, so I cannot give permission. It was an upload to FotoForensics.com. The ToS at FotoForensics says that all images can be used for research. As long as your test suite is research-only and private/personal (not distributed, not sold), then that's fine. But if it's part of an open source set, then that's a no.

hackerfactor commented 1 year ago

@silverbacknet wrote:

As far as I can tell, if there are any transforms beyond a simple crop, you basically have to run the entire compositing engine anyway to obtain the final canvas size, even if you don't perform the pixel manipulation part of it.

That's disappointing. Especially given the time and resources needed to decode a HEIC file. That's a lot of overhead to just find out the dimensions.

silverbacknet commented 1 year ago

It's all calculable from the metadata alone, so the full thing doesn't have to be parsed. If there are no derived images, then just start at the ispe box, and keep walking, recalculating w & h for every transformative box you encounter. The only ones that would really affect it are clap (crop), irot (rotation), and iscl (scaling). I was thinking about overlays too, but I forgot that's derived only.

If there are derived images (tiled or overlaid) then those will have the actual final size in their structs.

If there are multiple base or derived images, you have to pick which you want to show, then.

kmilos commented 1 year ago

I don't see why this should be of any value to the user, who is only interested in the final image size.

I disagree. See also https://github.com/strukturag/libheif/pull/555

hackerfactor commented 1 year ago

start at the ispe box

That's the problem. In the crash poc file, the ispe does not match the decoded image dimensions.

bigcat88 commented 1 year ago

in the python bindings i made the decision to mark such files as corrupted (when ispe shows less size than the decoded size) and later on will just add a new api that always first decodes the image and then get the image info as farindk suggests - for those who are interested in corrupted files.

Edited(19 March): it was a bad idea, some cameras store in ispe value that will be after applied rotation. So the only way to get true height and width is to get them after decoding.

bigcat88 commented 1 year ago

in the python bindings i made the decision to mark such files as corrupted (when ispe shows less size than the decoded size) and later on will just add a new api that always first decodes the image and then get the image info as farindk suggests - for those who are interested in corrupted files.

Edited(19 March): it was a bad idea, some cameras store in ispe value that will be after applied rotation. So the only way to get true height and width is to get them after decoding.

Just curious: Is this image satisfy HEIF standard?
This is an example that was submitted to me, when ispe values differs from encoded in image. Sony ILCE-7M4

hackerfactor commented 1 year ago

Is this [image](https://github.com/bigcat88/pillow_heif/raw/master/tests/images/heif_special/guitar_cw90.hif) satisfy HEIF standard? Cool! I haven't seen a HEIX file before. And 16-bit depth! Looks valid to me.

The ispe matches the image dimensions before transformations (rotation).

leo-barnes commented 1 year ago

@farindk

In my opinion, the ispe specification is bad. It (informally) specifies the raw image size before it is being transformed (rotated, cropped, ...). I don't see why this should be of any value to the user, who is only interested in the final image size. Moreover, the ispe size might vary from the actual encoded size. But we have to live with the standard as it is.

@hackerfactor

That's the problem. In the crash poc file, the ispe does not match the decoded image dimensions.

This kind of misses the point of descriptive item properties. The point of the descriptive item properties is to allow a parser to quickly determine basic properties about a file without actually having to call into specific codecs.

All transform properties and derived items tell you what the output is, assuming you know the input. So as long as you know the input, you can fully reason about all existing properties and item types.

But for a coded item you would have to call into a codec to get information about the input. This is problematic for multiple reasons:

Because of this it was decided that all image items shall have an ispe property that tells you what the dimensions of the input reconstructions are, since that is the part that is not codec-agnostic. With the ispe, a parser can reason about things like buffer dimensions needed, which jobs can be done by dedicated HW/GPU and so on, without actually having to parse the codec bitstream first.

At decode time the HEIF parser will need to validate that the output of the decoder matches the ispe and the pixi. There may very well be a mismatch, at which point the file is non-compliant and should be rejected. But these types of checks are required for all image formats. The point of the ispe and pixi is to give you a kind of contract between the container and the codec levels that can be verified and reasoned about per item.

silverbacknet commented 1 year ago

Admittedly it would have been nice if all images had to end up in with an auxC box, even if composed of just one source image with no transforms. That would simplify parsing even more, just look for the auxC (or identical clone for the master image), and it would also be useful as yet another check for validity across the entire chain.

leo-barnes commented 1 year ago

@bigcat88

Just curious: Is this image satisfy HEIF standard? This is an example that was submitted to me, when ispe values differs from encoded in image. Sony ILCE-7M4

ispe comes before transformative properties, which is technically not valid. But apart from that it seems fine. (Although having a hidden JPEG thumb is slightly weird.)