LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0

read_xmp Crashes python without exception #3

Closed auphofBSF closed 4 years ago

auphofBSF commented 4 years ago

A very good little module that works well, but it crashes Python on read_xmp() with a .jpg file straight from digiKam. Occasionally it runs through successfully on this file, and which rerun succeeds appears totally random. No exception is raised; Python just terminates. This feels similar to issue #2, which was raised and closed.

Environment details are below, and the jpg exhibiting the issue is attached. Note: modify_xmp() is commented out so that the jpg is not altered while debugging read_xmp().

from pyexiv2 import Image
imageFile = r"D:\tmp\19W38\pyexiv2\pyexiv2ISSUE19w38d1aMod2.jpg"
image = Image(imageFile)
xmpDict = image.read_xmp()

print('Image xmp was successfully read -----------------')
print(xmpDict['Xmp.digiKam.TagsList'])
print(xmpDict['Xmp.dc.subject'])

xmpDictNew = xmpDict.copy()
xmpDictNew['Xmp.dc.subject'] = xmpDict['Xmp.digiKam.TagsList']
#image.modify_xmp(xmpDictNew)

The jpg exhibiting this issue is in a branch of my fork. https://github.com/auphofBSF/pyexiv2/commit/3890a83a3fa2c415aba23bbc2d4dbfcdc8325ef5

Environment:
Windows 10
Python 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:05:27) [MSC v.1900 64 bit (AMD64)] on win32
pyexiv2 commit e78a78afc30e38e5ebab69a8bfe02209c1670241 (Date: Fri Sep 6 20:37:58 2019 +0800, "updated test cases")

I am not sure how to debug the API, but I have successfully rebuilt api.dll against exiv2 https://github.com/Exiv2/exiv2/releases/tag/v0.27.2, and that did not fix the problem.

I also have additional instructions for building under VS2017, which I can raise as a separate pull request. I was unable to build under VS2015.

LeoHsiao1 commented 4 years ago

Thanks for your report. Using your images https://github.com/auphofBSF/pyexiv2/commit/3890a83a3fa2c415aba23bbc2d4dbfcdc8325ef5, I can call .read_exif() and .read_iptc() normally, but calling .read_xmp() always causes Python to crash. The good news is that the problem is reproducible. I'll start debugging it now.

LeoHsiao1 commented 4 years ago

By the way, I used to compile exiv2 with VS2015, but the latest exiv2 project can only be compiled with VS2017.

LeoHsiao1 commented 4 years ago

I changed a little code, rebuilt, and then read the XMP successfully (commit 715ce1d). Please confirm whether the problem has been solved.

LeoHsiao1 commented 4 years ago

Yesterday I tried this on Linux, and it read the metadata normally. Doing the same on Windows causes Python to crash.

When a read_*() method is called, Python calls the C++ API, and the C++ API reads the metadata and returns it to Python. Since many items need to be returned, I use an EOL symbol to separate them.
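
As a rough illustration of that separator scheme, here is a minimal Python sketch of how a delimited buffer could be split back into key/value pairs. It is not the actual pyexiv2 code: the `SEP` and `EOL` marker strings, the record layout, and the `parse_buffer` name are all assumptions made for this example.

```python
# Hypothetical wire format, for illustration only (not pyexiv2's real markers).
SEP = "\t"          # assumed separator between a key and its value
EOL = "<<EOL>>\n"   # assumed end-of-record marker


def parse_buffer(buffer):
    """Split a delimited text buffer into a {key: value} dict."""
    result = {}
    for record in buffer.split(EOL):
        if not record:
            # The text after the final EOL marker is empty; skip it.
            continue
        # Split on the first separator only, so values may contain SEP.
        key, value = record.split(SEP, 1)
        result[key] = value
    return result
```

Using a multi-character end-of-record marker rather than a bare newline means a value that itself contains line breaks is not cut in half, which matters for free-text XMP fields.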

What confuses me is that:

auphofBSF commented 4 years ago

Sounds like a tricky bug. Your commit makes it work at least sometimes for more than one read, but unfortunately it still fails for me with a Python crash after a seemingly unpredictable number of reads. Sorry I can't contribute more to the C++ debugging.

In my test I have a folder of 43 Nikon images processed by digiKam. The following code fails after a random number of iterations: 7, 16, 19, 12, 6, 10, 3 (no pattern evident). It has never processed all 43.

import os
from pyexiv2 import Image

# mypath is the folder holding the test images; it is set elsewhere
files = [f for f in os.listdir(mypath) if os.path.isfile(os.path.join(mypath, f))]

for i, f in enumerate(files):
    image = Image(os.path.join(mypath, f))
    try:
        xmpDict = image.read_xmp()
        print('{} | Image: {}  Xmp.digiKam.TagsList: {}  Xmp.dc.subject: {}   -----------------'.format(i, image.filename,
                xmpDict['Xmp.digiKam.TagsList'],
                xmpDict['Xmp.dc.subject']))

        # xmpDictNew = xmpDict.copy()
        # xmpDictNew['Xmp.dc.subject'] = xmpDict['Xmp.digiKam.TagsList']
        # xmpDictNew['Xmp.dc.subject'] = "TEST"
        # image.modify_xmp(xmpDictNew)
    except Exception as e:
        # Catches Python-level errors only; a crash in the native library ends the process
        print("Error:{}".format(e))
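
Because the crash kills the interpreter before the except clause can run, one way to find out which files trigger it is to do each read in a separate Python process and inspect the exit code. The sketch below assumes that approach; `run_isolated`, `find_crashing_files`, and the `READ_XMP` child script are names invented for this example, and only the `Image(...).read_xmp()` call comes from the code above.

```python
import subprocess
import sys

# Child script (hypothetical helper): open the file named in argv[1] and read its XMP.
READ_XMP = "import sys; from pyexiv2 import Image; Image(sys.argv[1]).read_xmp()"


def run_isolated(code, *args, timeout=60):
    """Run `code` in a fresh interpreter and return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def find_crashing_files(paths):
    """Return (path, returncode) for every file whose child process failed."""
    failed = []
    for path in paths:
        returncode, _, _ = run_isolated(READ_XMP, path)
        # Non-zero covers both an uncaught Python exception and a native crash.
        if returncode != 0:
            failed.append((path, returncode))
    return failed
```

Calling `find_crashing_files(os.path.join(mypath, f) for f in files)` would then keep going past a crash instead of stopping at the first one, at the cost of starting one interpreter per image.
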

LeoHsiao1 commented 4 years ago

Random crashes are usually caused by memory errors. This problem seems to be caused by using a local variable after it has gone out of scope, like this:

std::stringstream data;
data << "some data...";
make_buffer(data.str());
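// data.str() returns a temporary std::string that is destroyed at the end of this statement;
// if make_buffer() keeps a pointer into it (e.g. from c_str()), that pointer dangles when the buffer is read later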

I have corrected the use of stringstream.str(), and none of the previous problems can be reproduced on my computer any more (commit 028f6418).

Please confirm whether the problem has been solved.

auphofBSF commented 4 years ago

Fantastic, @LeoHsiao1, I've learnt a lot and this now works well for the test case of 42 images. Well done. I am happy to mark this issue closed

github-actions[bot] commented 3 years ago

This issue has been automatically closed because there has been no activity for a month.