Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/
Other
930 stars 281 forks source link

XmpParser::encode() swallowing exceptions #1401

Open moon6969 opened 3 years ago

moon6969 commented 3 years ago

When setting an invalid Xmp key name with a unicode character like this... xmpData["Xmp.xmp.Rating↕"] = 69; No exception is thrown at the above line, but the subsequent image->writeMetadata(); is giving some stderr (since debug build)...

Error: XMP Toolkit error 102: Bad XML name
Error: Failed to encode XMP metadata.

Looking at exiv2-0.27.3 xmp.cpp, the exception is being caught but not re-thrown here or in the caller JpegBase::doWriteMetadata().

Is this intentional? How can I catch "XMP_Error"s in my application?

Contrast this with setting a badly formed Xmp key name which get a 'kerInvalidKey' exception thrown by the assignment... xmpData["Rating"] = 69;

clanmills commented 3 years ago

Nobody can answer the question "Is this intentional" because the person who implemented that left the project before I joined in 2008.

However I think the exception should be caught in xmp.cpp and converted to an Exiv2::Error just as done here.

    void XmpParser::registeredNamespaces(Exiv2::Dictionary& dict)
    {
        bool bInit = !initialized_;
        try {
            if (bInit) initialize();
            SXMPMeta::DumpNamespaces(nsDumper,&dict);
            if (bInit) terminate();
        } catch (const XMP_Error& e) {
            throw Error(kerXMPToolkitError, e.GetID(), e.GetErrMsg());
        }
    }

I don't think you can catch exceptions thrown by XMP as they are usually "absorbed" (caught) in xmp.cpp.

When your call image->writeMetadata(), your image handler calls his writeMetadata() (in JpegBase) and the different metadata families (Exif, IPTC and XMP) are serialised and written to temporary storage. If nothing bad happens, the temporary storage is then copied into your valuable image.

I believe the serialization of XMP happens in XmpParser::encode() which should have caught the 102 (from XMPsdk) and thrown an Exiv2::Error(). However looking at the code, I see:

int XmpParser::encode(      std::string& xmpPacket,
                          const XmpData&     xmpData,
                                uint16_t     formatFlags,
                                uint32_t     padding)
    { try {
 ........ 
    catch (const XMP_Error& e) {
        EXV_ERROR << Error(kerXMPToolkitError, e.GetID(), e.GetErrMsg()) << "\n";
        return 3;
    }
 }

Looks to me as though he is returning 3 when there's serious trouble down below. I don't remember what's happening in the image handler's writeMetadata(). He might understand the 3, or there again, he might ignore it.

A good first step would be to turn on the more debugging by putting the following near the front of src/xmp.cpp

#ifdef SUPPRESS_WARNINGS
#undef SUPPRESS_WARNINGS
#endif 

We're in for a little searching and debugging here. However I'm 100% confident of success. Together, we'll defeat this puppy. (no insult intended to dogs, seals or other creatures: dead or alive).

piponazo commented 3 years ago

I agree with you analysis @clanmills . The documentation for writeMetadata() says:

        /*!
          @brief Write metadata back to the image.

          All existing metadata sections in the image are either created,
          replaced, or erased. If values for a given metadata type have been
          assigned, a section for that metadata type will either be created or
          replaced. If no values have been assigned to a given metadata type,
          any exists section for that metadata type will be removed from the
          image.

          @throw Error if the operation fails
         */
        virtual void writeMetadata() =0;

At the moment we are not checking the error codes in many of the internal calls from writeMetadata(), so it would be better to propagate the Exiv2::Error

moon6969 commented 3 years ago

Thanks both - consistently propagating Exiv2::Error definitely seems the way to go.

I suspect it will make some users of the library squirm though if exceptions start popping up where it used to fail silently.

I'll get a Exiv2 dev environment setup and poke about a bit further.

clanmills commented 3 years ago

I agree with @moon6969 that we should be cautious about changing this on 0.27-maintenance. This issue has been in the code for about 15 years without discussion. So doing nothing isn't likely to upset anybody. However if we break a user's working code, we will seriously upset somebody.

So let's fix this on 'master' and add something to the test suite (based on your report). If another 0.27-maintenance user reports something about we can:

  1. Refer them to this discussion.
  2. Invite them to apply the 'master' fix/patch to their local copy of library.

I look forward to your PR!

moon6969 commented 3 years ago

Exiv2::XmpParser::decode() API (and similarly for encode) states:

Returns 0 if successful; 1 if XMP support has not been compiled-in; 2 if the XMP toolkit failed to initialize; 3 if the XMP toolkit failed and raised an XMP_Error

So xmp.cpp is mainly adhering to the spec, but losing valuable diagnostic information.

Looking at Exiv2 code, the main readMetadata & writeMetadata implementations ignore the return code (other than logging a warning if enabled). Other calls from things like samples/xmpparse.cpp raise an exception if non-zero is returned.

I've not had much experience with the intricacies of API design, but I thought to return the actual XMP_Error code instead of just '3', but it would require some offset or remap cheese since {0, 1, 2} are defined error codes... From exiv2/xmpsdk/include/XMP_Const.h...

    /// Generic unknown error
    kXMPErr_Unknown          =   0,
    /// Generic undefined error
    kXMPErr_TBD              =   1,
    /// Generic unavailable error
    kXMPErr_Unavailable      =   2,
    /// Generic bad object error
    kXMPErr_BadObject        =   3,
    /// Generic bad parameter error
    kXMPErr_BadParam         =   4,
    /// Generic bad value error

So how about adding an "ignoreErrors" (default true) parameter to XmpParser::decode & XmpParser::encode ? Although the readMetadata & writeMetadata API already indicate that exceptions may occur, it may worth considering an "ignoreXMPerrors" parameter for those too.

clanmills commented 3 years ago

Adding bool ignoreErrors=true will change the API. That'll be OK for 0.28 for which we have declared that the API will change. I discuss API/ABI management in my book: https://clanmills.com/exiv2/book/#9

For 0.27-maintenance, it would be better to add a new API Exiv2::XmpParser::decodeOrThrow(), however I think a safer course in 0.27-maintenance is to do nothing about this.

What samples/xmpparse.cpp decides to do with the return != 0 is his concern. Curious, however not of much importance.

So, how about on master we:

  1. Retain the existing API (returns 0..3)
  2. Add the API Exiv2::XmpParser::decodeOrThrow()

Could you investigate the other functions in XmpParser and consider what else can/should be done.

761 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ nm -g --demangle build/lib/libexiv2.dylib | grep XmpParser | cut '-d ' -f 3- 
Exiv2::XmpParser::initialize(void (*)(void*, bool), void*)
Exiv2::XmpParser::pLockData_
Exiv2::XmpParser::registerNs(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
Exiv2::XmpParser::xmpLockFct_
Exiv2::XmpParser::initialized_
Exiv2::XmpParser::unregisterNs(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
Exiv2::XmpParser::registeredNamespaces(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > >&)
Exiv2::XmpParser::decode(Exiv2::XmpData&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
Exiv2::XmpParser::encode(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, Exiv2::XmpData const&, unsigned short, unsigned int)
Exiv2::XmpParser::terminate()
762 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ 

@piponazo Thoughts?


@moon6969 I saw you entered our "Elements" room yesterday. As you noticed, you could easily die of boredom in there.

It's Friday afternoon. I'm going to update README-CONAN.md this afternoon and solve a puzzle in the book's tvisitor.cpp code concerning AVIF/ISOBMFF for a user in Prague. And then it's time for Friday dinner (steak and red wine).

moon6969 commented 3 years ago

@clanmills Like your thinking - will have a stab at it over the weekend.

Exiv2::XmpParser::encode needs the same encodeOrThrow() treatment as decode.

The only other xmp.cpp function that needs some error propagation love... Exiv2::XmpParser::registerNs has a catch but the throw appears to have been commented out by @clanmills in 2017... "MacOSX --with-adobe link issue" (https://github.com/Exiv2/exiv2/commit/aefa3e3d341392c1d3472034e7f4187b9b896917) Any recollection of this!? registerNs is not listed in the Exiv2::XmpParser API, and only appears to be called from XmpProperties::registeredNamespaces and XmpParser::encode... both of which could handle the error if required. So I propose to uncomment it, and ensure these two calls handle it as required.


RE: Elements. It might be worth plugging the Matrix chatroom specifically in the old Exiv2 forum Welcome to the Exiv2 forum! sticky... it indicates issues should be raised on Github, but doesn't indicate if the Exiv2 forum is alive or not.

Similarly, section 3.2 Support of this respositories README.md could have a link. Initially I didn't notice the Matrix link at the end of Travis | AppVeyor |  .... etc.


Right. Now I'm off to M&S to get some :cut_of_meat: & :beer:

clanmills commented 3 years ago

I might have commented off the throw in registerNs() because it was causing trouble in the test suite. I can't remember. Restore it and see what happens.

Having returns and throws as different APIs feels good to me:

API with return API with throw
XmpParser::encode() XmpParser::encodeOrThrow()
XmpParser::decode() XmpParser::decodeOrThrow()
XmpParser::registerNs() XmpParser::registerNsOrThrow()

I know what the beer is. Is that a Santa Hat? M&S? You're also in England?

Somebody else opened the Element Chat Room and we've hardly used it. I really don't want to encourage the chat room for bugs and issues. Chat rooms are for trivial chatter (family, hobbies).

Dan and Luis and I used to talk almost every day on Slack. I retired from Exiv2 in October 2019. I decided to return to the project in March 2020 to do v0.27.3. Slack had been replaced by Element (Matrix/Rottweiler) and nobody was ever home.

After I did v0.27.3, I decided to write the book. When that's finished at the end of 2020, I can't do anything more for Exiv2. I'll be 70 in January. https://github.com/Exiv2/exiv2/issues/1018#issuecomment-694383754

moon6969 commented 3 years ago

I'm stuck in London for the duration. Your steak comment struck a chord... 🥩 | :cut_of_meat:

clanmills commented 3 years ago

Cut of meat. Ah right. :wine_glass:

moon6969 commented 3 years ago

Just noticed in xmp_exiv2.hpp that XmpParser::registerNs is private and returns void...

private:
        ...
        static void registerNs(const std::string& ns,
                               const std::string& prefix);

So I think no need for registerNsOrThrow().

moon6969 commented 3 years ago

I think it makes sense to update all the different writeMetadata() calls to use encodeOrThrow() - if you are trying to save some information and it fails you need to know.

I'm not so sure for readMetadata()... if reading the xmp data fails, is there not a case to be made to ignore the exception so any other types of metadata are returned?

clanmills commented 3 years ago

I don't think there's an answer to your question. For sure readMetadata() will throw if you give him a fuzzed file. So when the data is really nasty, we throw. However it's common (especially in Nikon images) to encounter illegal Exif metadata to issue a warning and continue.

https://exiv2.org/doc/classExiv2_1_1Image.html#a198b8d5924d6441748aa162130c96a5f

If we encounter an error on writeMetadata(), we should throw. It's never a good idea to risk corrupting the users image.

All of this applies to Exiv2 'master' v0.28 and I'm not involved in that project.

moon6969 commented 3 years ago

Thanks Robin - appreciate the input. I'll work on it.

moon6969 commented 3 years ago

I've raised #1405 for readMetadata as I believe it needs much more design work than the writeMetadata changes discussed above.

clanmills commented 3 years ago

This looks like a tough puzzle. I'll set the milestone to v1.00. Setting the milestone means it will be considered for fixing. It doesn't guarantee that a contributor will take on the challenge.