gpac / ComplianceWarden

A pluggable compliance checker (ISOBMFF, HEIF/MIAF/AVIF, AV1 HDR10+)
https://gpac.github.io/ComplianceWarden-wasm/
Other
16 stars 7 forks source link

Possible false positive for heif file extension #81

Open bradh opened 2 weeks ago

bradh commented 2 weeks ago

Describe the issue I am trying to validate a file that (I hope) is close to compliant with ISO/IEC 23008-12:2022.

The CW results include some things that are likely actual issues, but this one might be a false positive:

./bin/cw.exe -s heif ~/eofff/goprotools/parafield_gimi.heif 
+--------------------------------------+
|           heif validation            |
+--------------------------------------+

Specification description: HEIF - ISO/IEC 23008-12 - 2nd Edition N18310

[heif][Rule #21] Warning: File extension for "parafield_gimi.heif" doesn't match: expecting one of 'heic hif ', got 'heif'

To Reproduce

Steps to reproduce the behavior:

  1. CW version: Compliance Warden, version v34-master-rev25-g612a2cb
  2. I can provide the file on request, but its about ~75MB, so might need to reduce it for transfer.

Detailed analysis

I think C.2 allows heif: "File extension(s): heif (for subtype heif), heic (for subtype heic), hif (for subtypes heif and heic)"

Subtype heif should be valid for "High efficiency image file containing one or more image items using any coding format"

Potential patch

TBA

bradh commented 2 weeks ago

Possibly we could just do something like:

diff --git a/src/specs/heif/heif.cpp b/src/specs/heif/heif.cpp
index 4ccda2f..9471f43 100644
--- a/src/specs/heif/heif.cpp
+++ b/src/specs/heif/heif.cpp
@@ -1093,14 +1093,14 @@ static const SpecDesc specHeif = {
                 if(!strcmp(sym.name, "compatible_brand"))
                   switch(sym.value) {
                   case FOURCC("avci"):
-                    return { "avci" };
+                    return { "heif", "avci" };
                   case FOURCC("avcs"):
                     return { "avcs" };
                   case FOURCC("heic"):
                   case FOURCC("heix"):
                   case FOURCC("heim"):
                   case FOURCC("heis"):
-                    return { "heic", "hif" };
+                    return { "heif", "heic", "hif" };
                   case FOURCC("hevc"):
                   case FOURCC("hevx"):
                   case FOURCC("hevm"):

I'm happy to propose that as a PR if it looks generally OK.

rbouqueau commented 2 weeks ago

I remember having questions about the text at the time I wrote the code. I ended up interpreting section C as HEVC only.

First section C is named "High efficiency image ..." which looked like a reference to HEVC. In addition the references to C.1 in 1) the major brand and 2) the reference to B.2 which is HEVC specific.

And finally Section E is specifically about AVC and doesn't reference "heif" at all (NB: my spec is not the latest one).

The only element that goes in your direction is the "any coding format" in C.2.

What do you think?

@cconcolato An opinion on this?

bradh commented 2 weeks ago

I agree E does not mention heif at all (and I have the final ed 2, 2022 version).

It was the "any coding format" that made me think .heif was an acceptable suffix for this case (e.g. if you have both an AVC and a HEVC encoded image in the same file).