Closed zmroczek closed 11 months ago
Hi zmroczek,
Thank you for the bug report. decoding-broken.avif has more than one item reference box ('iref'). The ISO BMFF standard says that in each metadata box there can be zero or one item reference box.
Box Type: 'iref' Container: MetaBox Mandatory: No Quantity: Zero or one
This is why libavif 0.9.0 fails with AVIF_RESULT_BMFF_PARSE_FAILED
.
The uniqueness check was added in this commit:
commit 8f43909cf815d86220a35c7c0a12e8aa666de4fc Author: Joe Drago jdrago@netflix.com Date: Fri Aug 28 15:05:17 2020 -0700
Fixes for oss-fuzz 25266 * Ensure only one of each mandatory-unique box in a meta box exists * Ensure each item ID is cited once in an iloc box * Sanity check merged extents item size against the file size
Which program did you use to encode this image?
Thanks for the quick reply @wantehchang.
Providing more details, I just confirmed that the photo was encoded using libavif 0.6.2 from this JPEG: broken-avif.jpg.zip
When encoding the same file using libavif 0.8.1, the decoding works in 0.9.0.
edit: Do you know why in that old encoder version this specific image had multiple 'iref' boxes? It doesn't seem to apply to other images when I test, so there's something very specific about this image.
I believe this is due to this issue with metadata https://github.com/AOMediaCodec/libavif/issues/247
Hi Zuzanna,
I agree that this looks like issue https://github.com/AOMediaCodec/libavif/issues/247.
I found that the avifenc
program in libavif v0.6.2 - v0.6.4 does not support JPEG files as input. This seems to imply decoding-broken.avif was not encoded using the avifenc
program.
I tested the avifenc
program in libavif v0.7.0, which still has the bug https://github.com/AOMediaCodec/libavif/issues/247.
I found that the command line
avifenc -s 6 --min 18 --max 18 broken-avif.jpg output.avif
does not encode any 'iref' boxes in the output AVIF file. Also, Although I found the string "Exif" in broken-avif.jpg, I don't see any sign of XMP metadata in broken-avif.jpg.
To get to the bottom of this, we will need to know exactly how decoding-broken.avif was encoded.
I also decoded decoding-broken.avif using the avifdec
program in libavif v0.7.0 and confirmed that decoding-broken.avif has two 'iref' boxes and they contain Exif and XMP metadata.
@wantehchang thanks for your response.
I just verified that the XMP metadata in the output file comes from a logic outside of the encoder on copying that field from IPTC metadata in the JPEG file (since IPTC metadata is not supported for AVIF). The bitmap with metadata got passed to the encoder then and that's why the resulting image had XMP metadata.
So the problem comes from encoding images when both EXIF and XMP metadata is specified at the encoding time.
But if you choose a different image that has both XMP and EXIF and encode it with the old libavif version it will have two iref boxes and it will fail to decode on 0.9.0 - wondering if this wouldn't be a breaking change then?
Hi Zuzanna,
Yes, commit https://github.com/AOMediaCodec/libavif/commit/8f43909cf815d86220a35c7c0a12e8aa666de4fc is a breaking change. I apologize for the inconvenience.
Is it possible for you to re-encode the images that have more than one 'iref' box? Note that the alpha plane also needs to be linked to the primage image via an 'iref' box, so an image encoded with libavif v0.8.0 or older that has an alpha plane and Exif or XMP metadata is also problematic. (Commit https://github.com/AOMediaCodec/libavif/commit/3818cf4d5d29257650f9ea7e2153f52df536c88f first appeared in libavif v0.8.1.)
It should be possible to write a migration tool that merge 'iref' boxes and insert a free space box to keep file offsets unchanged. But that tool could potentially introduce errors if not done correctly. So re-encoding the images would be the safest fix.
Thanks for explaining Wan-Teh, I also noticed in the code the alpha plane to be affecting this issue but it shouldn't be applicable to the images we encoded.
For fixing the photos, ideally, if we could just modify the header without a need to re-encode those images (as it's only a header issue), that would be ideal.
I was initially thinking about just creating a patch to modify the header for those broken files (as I was able to successfully modify the broken file, based on the output of 0.6.2 with a modification I added where no two iref boxes could be added). With that additional iref box removal the offsets also get modified so it seemed complicated to create a generic solution here.
Do you think this idea with free space box you suggested would be really that error-prone? Or possibly it would be easy to only regenerate headers using 0.6.2 version with a modification I mentioned to create a file compliant with the format but still use the same encoded content from the previous broken file - or would that also be error-prone?
@zmroczek I've just committed a script (cited above) which is a proof-of-concept that shows how you could merge adjacent iref boxes and inject a free box in the reclaimed space. It is only lightly tested, but it appears to fix the file you supplied and is benign on files that don't need a fix. Instructions are written in a comment at the top of the script, and hopefully the script is simple enough to follow that you could mimic it in any fixup implementation you add to your own pipeline.
The general idea of the script is:
-n
)Obviously many of these steps can be skipped if it is your own pipeline/tool, as you know in advance that some of these things are true, or perhaps you don't need a backup, etc. The actual merge part is only a handful of lines of code, and I'm happy to explain it if it is unclear. Try it on a few of your files and let me know if it works for you.
Thank you very much @joedrago and sorry for the delay in response, I was on a holiday break.
I will check out your script and confirm all is clear and working. I really appreciate your help!
Hi @joedrago and @wantehchang
Thanks again for the help with that issue, for the time being we just currently focused on images encoded using libavif above version 0.8.1 instead but we will look into fixing those later on if needed.
In the meantime we just discovered that after switching from 0.9.0 version on this commit 6ce2d81653878e4bef5304495a2876232b1c515c to this commit 6898f303d709ae10c951d834ec43ba2b1e239052 the image published here from Netflix (link below) is also failing to decode with a header parsing error. Do you know what change in spec might be causing that? We would like to verify how this image was created and if this is going to be an issue for any images created with some specific libavif version or using some specific settings.
Link to the failing image: http://download.opencontent.netflix.com.s3.amazonaws.com/AV1/Chimera/AVIF/Chimera-AV1-8bit-1280x720-3363kbps-100.avif
All Chimera files pre-date the existence of libavif, and I believe are missing pixi
boxes. I have an open issue here:
Thanks @joedrago for the quick reply. This boosts our confidence in using the new libavif version then, thanks for clarifying!
As a quick sanity check:
If you run avifdec --no-strict --info file.avif
on the files you're having an issue with, do they decode? If so, and you're shooting for maximum compatibility with slightly-offspec files, you can always set strictFlags = AVIF_STRICT_DISABLED
on your avifDecoder
instance. This will avoid testing for clap
box validity and the presence of the pixi
box.
I believe all broken files have found an explanation and we can close this issue now.
Hi,
We have identified an AVIF image that fails to decode using libavif 0.9.0, however it was working when using 0.8.1 version.
I can dig into more information about what encoding options were used for that specific image if needed (it was generated a few months ago so it's not straightforward) but possibly you will be able to quickly identify what is so specific about this image that it fails to decode on the new version? I'm attaching the image below: decoding-broken.avif.zip