felixc / rexiv2

Rust library for read/write access to media-file metadata (Exif, XMP, and IPTC)
GNU General Public License v3.0

HEIC (BMFF) files have no EXIF tags without calling `gexiv2_initialize` #48

Closed NoxHarmonium closed 2 years ago

NoxHarmonium commented 2 years ago

Hi there,

First, thanks for the library! It's a great help when I want to write scripts in Rust to process my photo tags.

I've spent a while working out why I couldn't read tags from HEIC files that I copied off my iPhone and I think I've figured it out, but I want to make sure that I'm not just confused. It might be a good idea to update the documentation to make some things clearer (if I am understanding things correctly) so other people don't fall into the same trap.

The first thing I checked was that the versions of gexiv2 and exiv2 on my machine supported HEIC (ISOBMFF) files. It turns out they did. So the next thing I tried was writing a little bit of hacky C code to triple-check that the right versions were getting picked up, etc.

#include <stdio.h>
#include <gexiv2/gexiv2.h>

int main(void)
{
    GExiv2Metadata *metadata = gexiv2_metadata_new();
    gboolean success = gexiv2_metadata_open_path(metadata, "/tmp/IMG_0187.HEIC", NULL);
    printf("did it work? %i \n", success);

    /* NULL-terminated array of tag names; the caller must free it. */
    char **tags = gexiv2_metadata_get_exif_tags(metadata);
    for (int i = 0; tags != NULL && tags[i] != NULL; i++)
    {
        printf("Tag: %s\n", tags[i]);
    }

    g_strfreev(tags);
    g_object_unref(metadata);
    return 0;
}

It still wasn't working. There was no error code, but the tags array was empty, which was the same issue I was getting with rexiv2. After banging my head against a wall for a few hours, I came across the function gexiv2_initialize in the gexiv2 API documentation (https://gnome.pages.gitlab.gnome.org/gexiv2/docs/gexiv2-Library-initialisation.html). After adding a call to that, I could suddenly read EXIF tags from my HEIC files.

I had the same behaviour with the Rust code too:

    // Without this line, get_exif_tags returns an empty array.
    rexiv2::initialize().expect("Unable to initialize rexiv2");
    let exif = rexiv2::Metadata::new_from_path("/tmp/IMG_0187.HEIC")
        .expect("Unable to read metadata");
    println!(
        "{:#?}",
        exif.get_exif_tags(),
    );

According to the gexiv2 documentation it should always be called before calling any of the other functions. However, in the rexiv2 codebase, the comments mention that it only needs to be called in a multithreaded context. None of the example documentation I can see references it either.

https://github.com/felixc/rexiv2/blob/93ac3fcf3824e49f3fcf60f45df6859e64421707/src/lib.rs#L1022-L1040

I find it pretty confusing that the gexiv2 library works at all without calling that function, considering that the gexiv2 API docs say "not to use any Gexiv2 code until [gexiv2_initialize] has returned", which makes me think I'm doing something wrong.

If it makes sense that gexiv2_initialize should be called before running any code, I'd be happy to raise a PR to amend some of the documentation.

Let me know what you think.

Thanks,

Sean

felixc commented 2 years ago

Hi!

Thank you for this fantastic report! It's extremely clear and helpful, and the example code and debugging that you did are impressive.

I was really glad to read it, because it cleared up a big mystery for me: I never actually knew what that initialize function was for, and had frankly forgotten about it! Like you also discovered, I had found that all my testing and use worked just fine without ever calling initialize, so I never knew why it was important or useful. I don't have any HEIC images so this scenario never came up for me. That's probably why the docs are unclear and why the example code forgets to use it 😅

Looking at the source for gexiv2_initialize it appears that it calls Exiv2::enableBMFF. I don't know what "BMFF" is, but the Exiv2 docs indicate it is indeed what's needed for HEIC files: https://github.com/Exiv2/exiv2#support-for-bmff-files-cr3-heif-heic-and-avif

    Support for bmff files (CR3, HEIF, HEIC, and AVIF)

    [...] the application must enable bmff support at run-time by calling the following function.

    EXIV2API bool enableBMFF(bool enable);

So, I guess that's what makes it work, and why initialize is useful!

I definitely agree that the current docs are inadequate. I'm not sure if when I wrote them I believed that it was only relevant to multithreaded applications, or if I meant it to be read as "you need to call this — and if in a multithreaded environment, the call should be done in a threadsafe manner". Either way, the latter is definitely what it sounds like, which is wrong. It should presumably say something similar to what the gexiv2 docs you linked to say, and the example and test code should be updated to include the call.

If you're up for sending a PR, that would be very welcome — otherwise I'll do it in the next week or so.

I also wonder if any Rust wizards out there know of a way to make that call happen automagically/transparently the first time (and only the first time) any of the new* methods are called...
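One possible shape for this, as a minimal sketch rather than anything taken from the rexiv2 codebase, is to guard the call with `std::sync::Once` and have every constructor go through a small helper first. Everything named below (`ffi_initialize`, `ensure_initialized`, the `INIT_CALLS` counter, and the stand-in `Metadata` type) is hypothetical and exists only for illustration. The real initialize call returns a result that can fail, which this sketch does not model.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Guards the one-time setup; `Once` is safe to use from multiple threads.
static INIT: Once = Once::new();

// Counts how often the stand-in initializer runs (for demonstration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the real FFI call to gexiv2_initialize.
fn ffi_initialize() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

// Hypothetical helper that every constructor would call first.
// Only the first call runs the initializer; later calls are no-ops.
fn ensure_initialized() {
    INIT.call_once(ffi_initialize);
}

// Hypothetical stand-in for the library's Metadata type.
struct Metadata;

impl Metadata {
    fn new_from_path(_path: &str) -> Metadata {
        ensure_initialized();
        Metadata
    }
}

fn main() {
    let _a = Metadata::new_from_path("/tmp/a.heic");
    let _b = Metadata::new_from_path("/tmp/b.heic");
    println!("initializer ran {} time(s)", INIT_CALLS.load(Ordering::SeqCst));
}
```

With this pattern callers would never need to remember the call themselves. The open question is how to surface an initialization failure, since `call_once` takes a closure that returns nothing.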

Thanks again!

NoxHarmonium commented 2 years ago

No worries! I'm on the road at the moment so I might not get a chance until next week, but definitely happy to get a PR up myself.

NoxHarmonium commented 2 years ago

Sorry for the delay! I finally got the PR up 😅

https://github.com/felixc/rexiv2/pull/49

Let me know if I've missed anything.

felixc commented 2 years ago

That's awesome, thank you for the contribution!