Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/
Other
914 stars 279 forks source link

Memory leak issue with exifprint 0.27.2 #976

Closed tester0077 closed 4 years ago

tester0077 commented 5 years ago

As part of one of my projects, a fork of the Sourcforge vvvP project which seems almost dormant, I need to use exiv2lib to display images. I am using an updated version MSVC 2017, Community edition, wxWidgets 3.0.1 all statically linked - except for zlib1.dll.

The original code was compiled with the DLL version of exiv2 0.27.1. But I was getting memory leaks - as reported both by my own MSVC code and Visual Leak Detector (VLD). Unfortunately, neither was able to give me details as to where exactly the memory was allocated, though I was pretty sure it was somewhere in the exiv2lib code from running test which did not invoke exiv2lib - though that meant I could not display the images. Therefore I downloaded the latest exiv2 0.27.2 code and compiled a static 32-bit debug version of exiv2lib. (Kudos to the developers because, by following the instructions, it all compiled without a hitch.)

After some fiddling to reverse my patches necessary for the 0.27.1 version to avoid using the boost libraries, I got my forked project to compile, link and run. As before, I was getting a lot of memory leaks if I displayed any one of the images, but by instrumenting the exiv2lib source files with my memory detection code (from MSVC) quickly identified the issue: the places identified directly were all within XMP::Initialize(), though this function seemed to be called in a very round about way (as part of other initializations, I presume/guess).

A review of the exiv2lib documentation and exvi2 samples shows lots of use of the XMP SDK, but as far as I can tell, none of them definitely and directly initialized the XMP SDK code - and I cannot find any references to any cleanup for the SDK.

After compiling a debug version of exifprint (as a test & with VLD installed and an #include "vld.h" line at the top of exifprint.cpp), handing it a test image from https://en.wikipedia.org/wiki/File:An_exif_test_image_-_for_test_use_(not_article_space)_use_only.jpg and setting a breakpoint at XMPMeta::Initialize() in XMPMeta.cpp, it ended up at this initialization point. At exit, however, VLD printed out a long list of leaked memory allocations - some of which included information about where the memory was allocated.

Setting another breakpoint at XMPMeta::Terminate() - also in XMPMeta.cpp - showed that this cleanup routine was never called on exit.

For now, I will have to live with this issue, but I would dearly see it resolved. If I missed anything about where the cleanup might happen, or if I can provide further details, please let me know

clanmills commented 5 years ago

Good Morning, Arnold. I was thinking about you yesterday. I released v0.27.2 on Monday. Yesterday I closed the v0.27.2 project on GitHub and looked at what's waiting for me in 0.27.3. I thought "I wonder if Arnold will let me have a day off before asking me about #918"!

Using our samples (for example samples/exifprint.cpp) for testing is good practice. It would help if you shared your code with me, then we'll be "on the same page".

Are you calling XMPInitialize() directly? Do you call XMPInitiaize() more than once? Exiv2 doesn't has neither library Initiaize() nor Terminate() functions, so we can't call XMPTerminate() for you as we don't know you are terminating! If your code is samples/exifprint.cpp you may have to define an atexit() function and call XMPTerminate() from there. Libraries should never use atexit() as this is a process wide function which should be managed directly from main() (or one of his descendants).

tester0077 commented 5 years ago

Hello Robin,

I had been following the exiv2 progress with some interest although porting the new project to my environment kept me pretty busy.

As I said, partly because of previous experience, but also from changes made by the developers, I found the installation for the latest version much more straight forward.

On 2019-07-30 11:06 p.m., Robin Mills wrote:

Good Morning, Arnold. I was thinking about you yesterday. I released v0.27.2 on Monday. Yesterday I closed the v0.27.2 project on GitHub and looked at what's waiting for me in 0.27.3. I thought "I wonder if Arnold will let me have a day off before asking me about #918 https://github.com/Exiv2/exiv2/issues/918"!

Well, you know what they say: "No rest ..... " ;-)

918 is still an issue, but I resolved it the same way as before,

replacing  all relevant places with free_

D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\include\exiv2\types.hpp(248): void free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\image.cpp(700): iccProfile.free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\pngimage.cpp(138): result.free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\pngimage.cpp(145): result.free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\pngimage.cpp(166): result.free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\pngimage.cpp(169): result.free(); D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\src\types.cpp(205): void DataBuf::free()

Using our samples (for example /samples/exifprint.cpp/) for testing is good practice. It would help if you shared your code with me, then we'll be "on the same page".

The sample code is simply exifprint with the addition of the vld header

// ***** -- C++ -- // exifprint.cpp // Sample program to print the Exif metadata of an image

include <exiv2/exiv2.hpp>

include

include

include

include "vld.h" //<<<<<<<<<<<<<<<<<<<<<<<<<<

.......

and an installation of the Visual Leak Detector from https://kinddragon.github.io/vld/

my version is 2.5.1 - the current one from the above repository.

I forced it to install to a path without spaces to avoid issues because of the spaces, hence it resides on my work PC at: C:\VisualLeakDetector

The sample I used is the one installed from the current 0.27.2 installation following the instructions given, using conan and CMake to build the MSVC .sln file then compiling the works and instrumenting exifprint as mentioned above

Are you calling |XMPInitialize()| directly? Do you call |XMPInitiaize()| more than once? Exiv2 doesn't has neither library |Initiaize()| nor |Terminate()| functions, so we can't call XMPTerminate() for you as we don't know you are terminating! If your code is |samples/exifprint.cpp| you may have to define an |atexit()| function and call |XMPTerminate()| from there. Libraries should never use |atexit()| as this is a process wide function which should be managed directly from |main()| (or one of his descendants).

My current issue is simply with exifprint, though I (we :-)) will have to eventually sort out how to handle this problem for exiv2lib to make it usable.

In one of my projects I had worked with the Adobe XMP SDK directly, hence I was aware of the need to both initialize & cleanup when using the SDK, but since I saw no mention of either for the exiv2 lib code, I assumed it was all done 'behind the scenes', just as I found the initialization code was called. When I could not figure out how the cleanup was called, I simply went back to one of the exiv2 samples and used it to check.

Including the VLD header file and recompiling, showed me that there was leakage. After that, even removing the VLD header and simply placing a breakpoint at the cleanup routine, verified for me that that code was not being called.

For some apps, such as the exifprint sample, any leakage is not very serious, but if the library is used and called often enough, it eventually might be a problem. I have not been able to verify whether the leakage is confined to the items expected to be released by the cleanup code, or if there might be other blocks not freed.

If it is simply the 'overhead' released in the cleanup code, then it might possible be a relatively small and fixed amount of memory. Still, I would feel much better if I could resolve all leakages.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Exiv2/exiv2/issues/976?email_source=notifications&email_token=ACFCLPE5IUA5KS6TDFE4G23QCETVVA5CNFSM4IIBJ73KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD3GF3WQ#issuecomment-516709850, or mute the thread https://github.com/notifications/unsubscribe-auth/ACFCLPDG2EMO2VS7WIT2LW3QCETVVANCNFSM4IIBJ73A.

clanmills commented 5 years ago

Arnold. You have my attention and this matter (and #918) will be investigated.

Thank you for your nice words of "kudos" about the installation procedure. I aim to make our documentation comprehensive, correct and not too long. I may have recruited a tech writer to take ownership of them going forward.

tester0077 commented 5 years ago

Although it made some work for myself to revert the changes to not use the boot libraries, I was very pleased to see that the need for boost had been eliminated. Also, I just now installed/setup a similar setup for a 32-bit statically linked release version and it too worked like a charm. As a side note: My interest in exiv2 is not confined to making my current project use exiv2 lib, but also because of my interest in Gramps and its use/display of image metadata using pyexiv2, which I understand is also based on libexiv2

clanmills commented 5 years ago

Exiv2 has never used boost, Arnold. There's a contrib item contrib/organize/organize.cpp that uses boost. However none of the "official" Exiv2 code requires boost.

Have you investigated using atexit() to invoke XMPTerminate() in exifprint.cpp ?

pyexiv2 is no longer supported by Oliver and has never been supported by Exiv2. There is a py3exiv2 which is 'pyexiv2 for python3'. A couple of years ago, I helped a user in NZ build py3exiv2 on CentOS. https://dev.exiv2.org/projects/exiv2/wiki/BuildingPy3Exiv2onCentos

py3exiv2 would not build on MacOS-X or Visual Studio. I wrote to the author without response. I know why it doesn't build on MacOS-X and Visual Studio and it's non-trivial to fix it, otherwise I would have sent a patch.

Incidentally, I started working with Exiv2 in 2008 and discovered pyexiv2. I ported both Exiv2 and pyexiv2 to Visual Studio (2003/XP at the time) and MacOS-X "Leopard" (10.5) My reason to be interested in Exiv2 was to geotag photos. I've subsequently ported my python code to samples/geotag.cpp and no longer use pyexiv2.

There are notes on my web-site which were written in 2008. You may find them of some help. However I don't have time to support this and/or answer questions about pyexiv2 nor py3exiv2. https://clanmills.com/articles/gpsexiftags/default-2011.shtml

tester0077 commented 5 years ago

Yes I have had a go at atexit() with exifprint, but I cannot find any variation of calling XMPMeta::Terminate that the compiler will accept.

As for pyexiv & Gramps, I have been sidetracked by sorting out collecting and annotating my data to be used by Gramps - eventually. Also, had brief look at the link - there is a mention of 'boost' libraries needed for even pyexiv2 :-( :-( I just checked and the latest edition still uses pyexiv2 - guess that will be another bridge to cross when I get to it :-(

clanmills commented 5 years ago

Ah, yes. You need boost to build pyexiv2. However that's not part of Exiv2.

I've started/terminated XMPsdk with:

#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::initialize();
#endif
#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::terminate();  
#endif

Can you test this code please and see if this fixes the leak.

// ***************************************************************** -*- C++ -*-
// exifprint.cpp
// Sample program to print the Exif metadata of an image

#include <exiv2/exiv2.hpp>

#include <iostream>
#include <iomanip>
#include <cassert>

// https://github.com/Exiv2/exiv2/issues/468
#if defined(EXV_UNICODE_PATH) && defined(__MINGW__)
#undef  EXV_UNICODE_PATH
#endif

#ifdef  EXV_UNICODE_PATH
#define _tchar      wchar_t
#define _tstrcmp    wcscmp
#define _t(s)       L##s
#define _tcout      wcout
#define _tmain      wmain
#else
#define _tchar      char
#define _tstrcmp    strcmp
#define _t(s)       s
#define _tcout      cout
#define _tmain      main
#endif

int _tmain(int argc, _tchar* const argv[])
try {
    const _tchar* prog = argv[0];
    const _tchar* file = argv[1];

#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::initialize();
#endif

    if (argc != 2) {
        std::_tcout << _t("Usage: ") << prog << _t(" [ file | --version || --version-test ]") << std::endl;
        return 1;
    }

    if ( _tstrcmp(file,_t("--version")) == 0 ) {
        exv_grep_keys_t keys;
        Exiv2::dumpLibraryInfo(std::cout,keys);
        return 0;
    } else if ( _tstrcmp(file,_t("--version-test")) == 0 ) {
        // verifies/test macro EXIV2_TEST_VERSION
        // described in include/exiv2/version.hpp
        std::cout << "EXV_PACKAGE_VERSION             " << EXV_PACKAGE_VERSION             << std::endl
                  << "Exiv2::version()                " << Exiv2::version()                << std::endl
                  << "strlen(Exiv2::version())        " << ::strlen(Exiv2::version())      << std::endl
                  << "Exiv2::versionNumber()          " << Exiv2::versionNumber()          << std::endl
                  << "Exiv2::versionString()          " << Exiv2::versionString()          << std::endl
                  << "Exiv2::versionNumberHexString() " << Exiv2::versionNumberHexString() << std::endl
                  ;

        // Test the Exiv2 version available at runtime but compile the if-clause only if
        // the compile-time version is at least 0.15. Earlier versions didn't have a
        // testVersion() function:
        #if EXIV2_TEST_VERSION(0,15,0)
            if (Exiv2::testVersion(0,13,0)) {
              std::cout << "Available Exiv2 version is equal to or greater than 0.13\n";
            } else {
              std::cout << "Installed Exiv2 version is less than 0.13\n";
            }
        #else
              std::cout << "Compile-time Exiv2 version doesn't have Exiv2::testVersion()\n";
        #endif
        return 0;
    }

    Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open(file);
    assert(image.get() != 0);
    image->readMetadata();

    Exiv2::ExifData &exifData = image->exifData();
    if (exifData.empty()) {
        std::string error("No Exif data found in file");
        throw Exiv2::Error(Exiv2::kerErrorMessage, error);
    }

    Exiv2::ExifData::const_iterator end = exifData.end();
    for (Exiv2::ExifData::const_iterator i = exifData.begin(); i != end; ++i) {
        const char* tn = i->typeName();
        std::cout << std::setw(44) << std::setfill(' ') << std::left
                  << i->key() << " "
                  << "0x" << std::setw(4) << std::setfill('0') << std::right
                  << std::hex << i->tag() << " "
                  << std::setw(9) << std::setfill(' ') << std::left
                  << (tn ? tn : "Unknown") << " "
                  << std::dec << std::setw(3)
                  << std::setfill(' ') << std::right
                  << i->count() << "  "
                  << std::dec << i->value()
                  << "\n";
    }

#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::terminate();
#endif

    return 0;
}
//catch (std::exception& e) {
//catch (Exiv2::AnyError& e) {
catch (Exiv2::Error& e) {
    std::cout << "Caught Exiv2 exception '" << e.what() << "'\n";
#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::terminate();
#endif
    return -1;
}
tester0077 commented 5 years ago

Thank you, Robin :-) Including some of the code at the end of exifprint , namely

.........
#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::terminate();
#endif
    return 0;
}
//catch (std::exception& e) {
//catch (Exiv2::AnyError& e) {
catch (Exiv2::Error& e) {
    std::cout << "Caught Exiv2 exception '" << e.what() << "'\n";
#ifdef EXV_HAVE_XMP_TOOLKIT
    Exiv2::XmpParser::terminate();
#endif
    return -1;
}

Cleaned up all of the leaks and VLD reports

Visual Leak Detector read settings from: C:\VisualLeakDetector\vld.ini Visual Leak Detector Version 2.5.1 installed. Exif.Image.XResolution 0x011a Rational 1 72/1 Exif.Image.YResolution 0x011b Rational 1 72/1 Exif.Image.ResolutionUnit 0x0128 Short 1 2 Exif.Image.YCbCrPositioning 0x0213 Short 1 1 Exif.Image.Copyright 0x8298 Ascii 18 Copyright Cthulhu No memory leaks detected. Visual Leak Detector is now exiting.

D:\pkg\C++\MSVC2017\exiv2-master-0.27.2\exiv2-0.27.2-Source\build32DebugStatic\bin\exifprint.exe (process 15128) exited with code 0.

The initial section does not hurt, but is not strictly necessary, but without the last entry, just before return 0; VLD reports - for the small image I had mentioned - at the tail end of a long list: ......

---------- Block 1061 at 0x00061488: 8 bytes ---------- Leak Hash: 0x647C6D80, Count: 1, Total 8 bytes Call Stack (TID 15652): Data: CC FC 05 00 00 00 00 00 ........ ........

---------- Block 1052 at 0x00061568: 8 bytes ---------- Leak Hash: 0xCC60D2D1, Count: 1, Total 8 bytes Call Stack (TID 15652): Data: 08 F0 05 00 00 00 00 00 ........ ........

Visual Leak Detector detected 588 memory leaks (40024 bytes). Largest number used: 81051 bytes. Total allocations: 116352 bytes. Visual Leak Detector is now exiting. The program '[0x22EC] exifprint.exe' has exited with code 0 (0x0).

tester0077 commented 5 years ago

And I just tried the same code in my application and I am more than pleased to say that it cleaned up the memory leaks!

clanmills commented 5 years ago

OK. I'm going to submit a PR to add this code to all our sample applications. And I'll add a section to README.md and README-SAMPLES.md to explain what/why this is used.

Andreas (the founder of Exiv2) wanted to avoid init/term code in the Exiv2 library and indeed it's not needed by his Exif code.

XmpParser::initialize() is called by readMetadata()/xmpDecode() to ensure that the XMPsdk is initialised. However, as I mentioned, libexiv2 has no way to know that the process is terminating.

As a matter of style, if you call Exiv2::XmpParser:: initialize(), you should call Exiv2::XmpParser::terminate(). It's ugly to rely on automagic to call initialize() then explicitly call terminate().

When I submit the PR for this change, I'm sure Dan and Luis will comment. For v0.27, we are avoiding API changes, so the above code is fine. For v0.28, they may prefer to add APIs Exiv2::initialize() and Exiv2::terminate() to mask this while providing a code location to deal with other process-wide resources such as locks.

I believe Xmp::Parser::initialize() is not thread-safe and should be called before spinning additional threads. Similarily, Xmp::Parser::terminate() should be called after all threads have terminated.

tester0077 commented 5 years ago

Some of those reservations where at the back of my mind, but being very unfamiliar with the rest of the code and the design ideas behind it I was even surprised that calling the XmpParser terminate would do the trick. Still, it seems to me that other libraries, such as libcurl, I have used do insist on bracketing their use with some calls to both initialize and cleanup when they are used. For now, I am very happy :-)

clanmills commented 5 years ago

I'm glad that you're happy. Remember, I'm only the messenger and most of the code in Exiv2 was written by other people. I don't have any concerns about library init()/term() functions. However Andreas thought otherwise.

There's an elegant balance in Exiv2::XmpParser:: initialize() and Exiv2::XmpParser::terminate() and requires no API changes.

I'll investigate your _free() #918 issue in the next few days. Don't forget I'm a volunteer. I have other matters in life on which to spend my time! Exiv2 v0.27.2 had three RCs. So Exiv2 took up a lot of time in June and July.

tester0077 commented 5 years ago

I quite understand the time constraints. I saw the flurry of messages back and forth for 0.27.2 and was very happy to not have to really reflect on any of the decisions behind it all - and I love the results :-) The only reason I am recording my results right now is because by tomorrow I may be off to work on other issues and will have forgotten many of the details that seemed important today :-)

918 is not that important for me because I expect to be using 0.27.2 for the next little while until a new release and my code here is patched and working thanks to your help.

And the 'work-around' for exiv2lib fits into that same category.

clanmills commented 5 years ago

Thanks for your comments and closing this issue.

I'm going to reopen it. When I submit the PR for the code and document changes, it'll be automatically closed when the PR is merged into 0.27-maintenance.

I'd also like to keep this open as the Team will probably want to merge this into 'master' which will be 0.28 eventually. There's no solid schedule for 0.28 at the moment.

tester0077 commented 5 years ago

Sounds good. Thank you