LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0
201 stars 39 forks source link

Current master is failing build test on Ubuntu 18.0.4.5 #41

Closed dgtlmoon closed 3 years ago

dgtlmoon commented 3 years ago

Python3.6 on ubuntu here

git clone https://github.com/LeoHsiao1/pyexiv2.git
cd pyexiv2
cd pyexiv2/lib/
g++ exiv2api.cpp -o py36-linux/exiv2api.so -O3 -Wall -std=c++11 -shared -fPIC `python3 -m pybind11 --includes` -I /usr/include/exiv2 -I /usr/include/python3 -L ./lib -l exiv2
cd ../..
python3 setup.py install
pytest -v

pyexiv2/tests/test_func.py::test_read_exif PASSED                                                                                                                                                          [  3%]
pyexiv2/tests/test_func.py::test_read_iptc PASSED                                                                                                                                                          [  7%]
pyexiv2/tests/test_func.py::test_read_xmp PASSED                                                                                                                                                           [ 11%]
pyexiv2/tests/test_func.py::test_read_raw_xmp FAILED                                                                                                                                                       [ 15%]
pyexiv2/tests/test_func.py::test_modify_exif PASSED                                                                                                                                                        [ 19%]
pyexiv2/tests/test_func.py::test_modify_iptc PASSED                                                                                                                                                        [ 23%]
pyexiv2/tests/test_func.py::test_modify_xmp PASSED                                                                                                                                                         [ 26%]
pyexiv2/tests/test_func.py::test_clear_exif PASSED                                                                                                                                                         [ 30%]
pyexiv2/tests/test_func.py::test_clear_iptc PASSED                                                                                                                                                         [ 34%]
pyexiv2/tests/test_func.py::test_clear_xmp PASSED                                                                                                                                                          [ 38%]
pyexiv2/tests/test_func.py::test_nonexistent_path PASSED                                                                                                                                                   [ 42%]
pyexiv2/tests/test_func.py::test_not_image_path PASSED                                                                                                                                                     [ 46%]
pyexiv2/tests/test_func.py::test_error_log PASSED                                                                                                                                                          [ 50%]
pyexiv2/tests/test_func_on_ImageData.py::test_read_all FAILED                                                                                                                                              [ 53%]
pyexiv2/tests/test_func_on_ImageData.py::test_modify_exif PASSED                                                                                                                                           [ 57%]
pyexiv2/tests/test_func_on_ImageData.py::test_modify_iptc PASSED                                                                                                                                           [ 61%]
pyexiv2/tests/test_func_on_ImageData.py::test_modify_xmp PASSED                                                                                                                                            [ 65%]
pyexiv2/tests/test_func_on_ImageData.py::test_clear_all PASSED                                                                                                                                             [ 69%]
pyexiv2/tests/test_func_on_ImageData.py::test_error_log PASSED                                                                                                                                             [ 73%]
pyexiv2/tests/test_perf.py::test_memory_leak_when_reading FAILED                                                                                                                                           [ 76%]
pyexiv2/tests/test_perf.py::test_memory_leak_when_writing PASSED                                                                                                                                           [ 80%]
pyexiv2/tests/test_perf.py::test_stack_overflow PASSED                                                                                                                                                     [ 84%]
pyexiv2/tests/test_perf.py::test_transmit_various_characters PASSED                                                                                                                                        [ 88%]
pyexiv2/tests/test_perf_on_ImageData.py::test_memory_leak_when_reading FAILED                                                                                                                              [ 92%]
pyexiv2/tests/test_perf_on_ImageData.py::test_memory_leak_when_writing PASSED                                                                                                                              [ 96%]
pyexiv2/tests/test_perf_on_ImageData.py::test_stack_overflow PASSED                                                                                                                                        [100%]

==================================================================================================== FAILURES ====================================================================================================
_______________________________________________________________________________________________ test_read_raw_xmp ________________________________________________________________________________________________

    @check_md5
    def test_read_raw_xmp():
        with Image(path) as img:
>           assert len(img.read_raw_xmp()) == 4593
E           assert 4635 == 4593
E             +4635
E             -4593

pyexiv2/tests/test_func.py:28: AssertionError
_________________________________________________________________________________________________ test_read_all __________________________________________________________________________________________________

    @check_md5
    def test_read_all():
        with open(path, 'rb') as f:
            with ImageData(f.read()) as img:
                compare_dict(testdata.EXIF, img.read_exif())
                compare_dict(testdata.IPTC, img.read_iptc())
                compare_dict(testdata.XMP, img.read_xmp())
>               assert len(img.read_raw_xmp()) == 4593
E               assert 4635 == 4593
E                 +4635
E                 -4593

pyexiv2/tests/test_func_on_ImageData.py:12: AssertionError
_________________________________________________________________________________________ test_memory_leak_when_reading __________________________________________________________________________________________

    @check_md5
    def test_memory_leak_when_reading():
        p = psutil.Process(os.getpid())
        m0 = p.memory_info().rss
        for _ in range(1000):
            test_func.test_read_exif()
            test_func.test_read_iptc()
            test_func.test_read_xmp()
>           test_func.test_read_raw_xmp()

pyexiv2/tests/test_perf.py:16: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyexiv2/tests/base.py:37: in wrapper
    ret = func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    @check_md5
    def test_read_raw_xmp():
        with Image(path) as img:
>           assert len(img.read_raw_xmp()) == 4593
E           assert 4635 == 4593
E             +4635
E             -4593

pyexiv2/tests/test_func.py:28: AssertionError
_________________________________________________________________________________________ test_memory_leak_when_reading __________________________________________________________________________________________

    @check_md5
    def test_memory_leak_when_reading():
        p = psutil.Process(os.getpid())
        m0 = p.memory_info().rss
        for _ in range(1000):
>           test_func_on_ImageData.test_read_all()

pyexiv2/tests/test_perf_on_ImageData.py:13: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyexiv2/tests/base.py:37: in wrapper
    ret = func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    @check_md5
    def test_read_all():
        with open(path, 'rb') as f:
            with ImageData(f.read()) as img:
                compare_dict(testdata.EXIF, img.read_exif())
                compare_dict(testdata.IPTC, img.read_iptc())
                compare_dict(testdata.XMP, img.read_xmp())
>               assert len(img.read_raw_xmp()) == 4593
E               assert 4635 == 4593
E                 +4635
E                 -4593

pyexiv2/tests/test_func_on_ImageData.py:12: AssertionError
============================================================================================ short test summary info =============================================================================================
FAILED pyexiv2/tests/test_func.py::test_read_raw_xmp - assert 4635 == 4593
FAILED pyexiv2/tests/test_func_on_ImageData.py::test_read_all - assert 4635 == 4593
FAILED pyexiv2/tests/test_perf.py::test_memory_leak_when_reading - assert 4635 == 4593
FAILED pyexiv2/tests/test_perf_on_ImageData.py::test_memory_leak_when_reading - assert 4635 == 4593
========================================================================================= 4 failed, 22 passed in 10.42s ==========================================================================================
d````
LeoHsiao1 commented 3 years ago

These errors are due to the failure of the Test case test_read_raw_xmp(). The raw XMP length read is different than expected.

LeoHsiao1 commented 3 years ago

I need to find out what's different. Please manually read the raw XMP of the test image and copy it to this issue, including the space character.

$ cd pyexiv2/
$ ls pyexiv2/tests/1.jpg              # The test image is here
pyexiv2/tests/1.jpg
$ python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyexiv2
>>> img = pyexiv2.Image('pyexiv2/tests/1.jpg')
>>> img.read_raw_xmp().encode()
b'<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?> ...
dgtlmoon commented 3 years ago

Screenshot_2020-12-27_13-08-05 Output attached.. lots of whitespace in the end (approximately 2050bytes of 0x20h) output.txt

LeoHsiao1 commented 3 years ago

I found that even the XMP ID was not what I expected. Please check if your test images have been modified. Whether the hash value is consistent with this:

$ md5sum pyexiv2/tests/1.jpg
81760130b2ed488646c2bf9fd485cb03  1.jpg
dgtlmoon commented 3 years ago

After test..

dgtlmoon@acer:/tmp/x/pyexiv2$ md5sum pyexiv2/tests/1.jpg
81760130b2ed488646c2bf9fd485cb03  pyexiv2/tests/1.jpg

After build

dgtlmoon@acer:/tmp/x/pyexiv2$ git status .
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   pyexiv2/lib/py36-linux/exiv2api.so

repo reset..

dgtlmoon@acer:/tmp/x/pyexiv2$ git reset --hard
HEAD is now at 3b18062 Update the defect description
dgtlmoon@acer:/tmp/x/pyexiv2$ md5sum pyexiv2/tests/1.jpg
81760130b2ed488646c2bf9fd485cb03  pyexiv2/tests/1.jpg
dgtlmoon@acer:/tmp/x/pyexiv2$ ls -al pyexiv2/tests/1.jpg
-rw-rw-r-- 1 dgtlmoon dgtlmoon 18031 Dec 27 13:03 pyexiv2/tests/1.jpg

I just ran it a second time.. and i got a pass

image

After test..

dgtlmoon@acer:/tmp/x/pyexiv2$ md5sum pyexiv2/tests/1.jpg
81760130b2ed488646c2bf9fd485cb03  pyexiv2/tests/1.jpg
dgtlmoon commented 3 years ago

So on checkout master build and test it fails.. but sometimes on the second time it will pass

LeoHsiao1 commented 3 years ago

The test case has been executed on multiple platforms:

      matrix:
        os: [ubuntu-18.04, macOS-latest, windows-2019]
        python-version: [3.5, 3.6, 3.7, 3.8, 3.9]

I haven't had this problem before. In a similar situation, Windows handles space characters differently than Linux.

dgtlmoon commented 3 years ago

Well.. it's still important to figure out why I think...

dgtlmoon commented 3 years ago

Can you explain this comment "Windows handles space characters differently than Linux."

What exactly, how exactly does this affect the tests? I'm not a mind reader :)

LeoHsiao1 commented 3 years ago

For example, when Exiv2 outputs image metadata to a terminal on Windows, it shows the newline character as \r\n and the path separator as \. However, pyexiv2 does not output data to the terminal, so it has not encountered this problem.

LeoHsiao1 commented 3 years ago

The probability that Windows is related to your issue is too low to go into detail.

dgtlmoon commented 3 years ago

@LeoHsiao1 So what you are saying, if i understand your assumption correctly, is that you've written the tests in a way that they will only pass on windows? that's a horrible idea!


@check_md5
def test_read_raw_xmp():
    with Image(path) as img:
        assert len(img.read_raw_xmp()) == 4593

so that 4593 will only work in windows ?!?!

LeoHsiao1 commented 3 years ago

No, my test cases have been executed under 15 circumstances: https://github.com/LeoHsiao1/pyexiv2/actions/runs/448497877

      matrix:
        os: [ubuntu-18.04, macOS-latest, windows-2019]
        python-version: [3.5, 3.6, 3.7, 3.8, 3.9]

As you can see from the GitHub Actions page, most of the recent test results have passed.

dgtlmoon commented 3 years ago

According to the specifications, the whitespace padding is part of the XMP standard... so any change here means that something has changed and we need to know exactly what..

to set an arbitrary integer for comparison if your test passed without explanation or the original information on how you got that number is really not good enough for other programmers...

https://archimedespalimpsest.net/Documents/External/XMP/XMPSpecificationPart3.pdf

image

dgtlmoon commented 3 years ago

Can you paste in a hexdump of what you are seeing? I'm using..


@check_md5
def test_read_raw_xmp():
    with Image(path) as img:
        x= ":".join("{:02x}".format(ord(c)) for c in img.read_raw_xmp())
        print( x)
        assert len(img.read_raw_xmp()) == 4593
dgtlmoon commented 3 years ago

Also... maybe the libexiv is different? which version are you running under Ubuntu 18?

$ dpkg -l|grep -i libexiv
ii  libexiv2-14:amd64                             0.25-3.1ubuntu0.18.04.5                           amd64        EXIF/IPTC/XMP metadata manipulation library
ii  libexiv2-dev                                  0.25-3.1ubuntu0.18.04.5                           amd64        EXIF/IPTC/XMP metadata manipulation library - development files
dgtlmoon commented 3 years ago

https://wwwimages2.adobe.com/content/dam/acom/en/devnet/xmp/pdfs/XMP%20SDK%20Release%20cc-2016-08/XMPSpecificationPart1.pdf

NOTE 3 When XMP is embedded within digital files, including white-space padding is sometimes helpful. Doing sofacilitates modification of the XMP packet in-place. The rest of the file is unaffected, which could eliminate a need to rewritethe entire file if the XMP changes in size. Appropriate padding is SPACE characters placed anywhere white space isallowed by the general XML syntax and XMP serialization rules, with a linefeed (U+000A) every 100 characters or so toimprove human display. The amount of padding is workflow-dependent; around 2000 bytes is often a reasonable amount.

So something is wrong... if that padding changes under any libexiv2 version and your library... we should know about it..

LeoHsiao1 commented 3 years ago

Happy New Year! I uploaded a file containing the expected RAW XMP. raw_xmp.txt

Do you still have test_read_raw_xmp() fail? The current version of Exiv2 I am using is 0.27.2.

dgtlmoon commented 3 years ago

My output in the text file..

@check_md5
def test_read_raw_xmp():
    with Image(path) as img:
        f = open("2.5-output.txt", "w")
        f.write(img.read_raw_xmp())
        f.close()
        assert len(img.read_raw_xmp()) == 4593

So here's what I have..

 <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55        ">

And what you have.....

 <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">

Checking the result based on an integer of the total length is a horrible idea.. better is to hardcode what values in the XML (parse the XML in Python, it is easy) and check for the strings..

Here is the output of what I'm seeing.. hope it helps..

output.txt

dgtlmoon commented 3 years ago

Here's the story.. 4593 is written in your test, 4635 is what I get...

>>> 4635- 4593
42
>>> s='<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55        ">'
>>> len(s)
109
>>> 
>>> s='<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">'
>>> len(s)
67
>>> 109-67
42
LeoHsiao1 commented 3 years ago

The actual value stored in the image is:

x:xmptk="Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55        "

But after Exiv2 parses it, it returns:

x:xmptk="XMP Core 4.4.0-Exiv2"

Maybe you're using a different version of Exiv2.

In summary, test_read_xmp() is used to check that the content of XMP are correct. test_read_raw_xmp() is just used to check that raw XMP can be read out, not to check the content.

dgtlmoon commented 3 years ago

@LeoHsiao1 You seem to be just repeating exactly what I'm writing :'(

LeoHsiao1 commented 3 years ago

Well, on the one hand, I'll update the test case so that it can locate the differences. On the other hand, your test case test_read_raw_xmp() didn't pass, is it because you're using version 0.25 of Exiv2 ?

dgtlmoon commented 3 years ago

That's right... that's what I said :) but... how do we know it's because of 0.25 and not 0.26.4 etc?

dgtlmoon commented 3 years ago

I need to find out where "Adobe XMP Core 5.6-c128 79.159124" is coming from exactly in my system, because I'm on Ubuntu 18 also.. @LeoHsiao1 i'm on ubuntu buster (18.04)

$ cat /etc/debian_version 
buster/sid
dgtlmoon@acer:/tmp/pyexiv2$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS
Release:    18.04
Codename:   bionic
LeoHsiao1 commented 3 years ago

Test cases always pass on Ubuntu 18.04 . See Github Actions And my WSL is also Ubuntu 18.04 :

leo@Leo:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"
dgtlmoon commented 3 years ago

@LeoHsiao1 I know this, you already said that.. that comment does not add much

But I'm running 18.04.5.... i think it is important that this passes (I'm repeating myself too on this point).. otherwise other programmers cannot test/build locally and will waste hours of their life (like I have already) trying to figure out what is wrong

dgtlmoon commented 3 years ago

So it works on

DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS" (you)

but not

DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS" (me)

.... not good enough I think..

dgtlmoon commented 3 years ago

I dont have "Adobe XMP Core" installed on my machine.. which is what is in the original test image (1.jpg).. And maybe "Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55" is not installed on the github actions test machine..

so there for the test is false-positive ?

LeoHsiao1 commented 3 years ago

No, the raw data is stored in the image, regardless of whether you have Adobe installed or not. I just updated the test case, and now it checks if each character of RAW_XMP is different.

dgtlmoon commented 3 years ago

"if each character of RAW_XMP is different." why?

LeoHsiao1 commented 3 years ago

Check if each character of RAW_XMP is the same...

dgtlmoon commented 3 years ago

Are you trying to ignore problems with whitespace? I dont understand your assumption

read_xmp() does not give me "XMP Toolkit "

"XMP toolkit" is not available in read_xmp().. that is interesting... it's the one variable that is changing and breaking the tests on 18.0.4.5

@check_md5
def test_read_raw_xmp():
    with Image(path) as img:
        p=img.read_xmp()
LeoHsiao1 commented 3 years ago

No, test case test_read_raw_xmp() is more rigorous now, checking if each character of RAW_XMP is the same.

@check_md5
def test_read_raw_xmp():
    with Image(test_img) as img:
        diff_text(reference_data.RAW_XMP, img.read_raw_xmp())
LeoHsiao1 commented 3 years ago

test_read_raw_xmp now does not report an error, indicating that the XMP Toolkit is not a variable.

dgtlmoon commented 3 years ago

Yes, OK, like I said, looking for an integer to compare isn't the best solution..

let me know if this in dev and I'll try it

@LeoHsiao1 Final thing.. I beg you, please be in the habit of including comments in your code.. you will save a lot of programmers in the future a lot of hours of wasted time...

@check_md5
def test_read_raw_xmp():

  # We compare only the headers available, XMP headers such as "XMP Toolkit" are not available in the read_raw_xmp()
  # information and could affect the output (So we no longer check for an absolute length/integer size)

    with Image(test_img) as img:
        diff_text(reference_data.RAW_XMP, img.read_raw_xmp())
LeoHsiao1 commented 3 years ago

This test case is very simple. You can understand what it does by the name of the function diff_text. This test case is rigorous and checks each character, not just the XMP Toolkit.

dgtlmoon commented 3 years ago

@LeoHsiao1 I'm begging you.. it's important to write such a comment... trust me, please... PLEASE... please.... please add this kind of documentation

// Because the length of XMP padding can affect the final output, we compare the actual strings instead of comparing byte lengths
LeoHsiao1 commented 3 years ago

I don't understand your comment. compare the actual strings and comparing byte lengths are not in conflict. If compare the actual strings test passes, then comparing byte lengths test will also pass.

LeoHsiao1 commented 3 years ago

I used to check:

len(excepted_raw_xmp) == len(test_raw_xmp)

Now I'm checking:

excepted_raw_xmp == test_raw_xmp
LeoHsiao1 commented 3 years ago

Instead of ignoring this issue, I made the test cases more rigorous. If you didn't pass the test case test_read_raw_xmp() before, you're even less likely to pass it now. But the error message is more specific.

dgtlmoon commented 3 years ago
_______________________________________________________________________________ test_read_raw_xmp ________________________________________________________________________________

    @check_md5
    def test_read_raw_xmp():
        with Image(test_img) as img:
>           diff_text(reference_data.RAW_XMP, img.read_raw_xmp())

pyexiv2/tests/test_func.py:28: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

text1 = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-...                                                                                                   <?xpacket end="w"?>'
text2 = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core ...                                                                                                   <?xpacket end="w"?>'

    def diff_text(text1: (str, bytes), text2: (str, bytes)):
        max_len = max(len(text1), len(text2))
        for i in range(max_len):
>           assert text1[i:i+1] == text2[i:i+1], "The two text is different at index {} :\n< {}\n> {}".format(i, text1[i:i+10], text2[i:i+10])
E           AssertionError: The two text is different at index 97 :
E           < XMP Core 4
E           > Adobe XMP

pyexiv2/tests/base.py:46: AssertionError

I dont understand why the EXIF4 header is being re-written.. I think you have not helped to find out the real cause of the problem

dgtlmoon commented 3 years ago

I'm going crazy here, it's like i say something, and then you just repeat what I've already said and we go back to the start :(

dgtlmoon commented 3 years ago

When I look at tmp.jpg I can see that still says "Adobe XMP".. (the original test file 1.jpg)

but in the test output, it fails... that is the problem

LeoHsiao1 commented 3 years ago

I think the problem is very simple:

  1. Your test case reads raw XMP differently.
  2. I found that the MD5 value of test image was not changed, but Exiv2 returned raw XMP incorrectly.
  3. We need to find out the cause of this problem. In the meantime, I've made the test cases more rigorous.
LeoHsiao1 commented 3 years ago

If you've been using different versions of EXIV2, how do you find out what's causing the problem?

That's right... that's what I said :) but... how do we know it's because of 0.25 and not 0.26.4 etc?

dgtlmoon commented 3 years ago

In your RAW_XMP you are looking for XMP Core 4.4.0-Exiv2, but in the YOUR image (.jpg), you have..

root@acer:/tmp/pyexiv2# exiftool ./pyexiv2/tests/1.jpg |grep Toolki
XMP Toolkit                     : Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55

Proof...


@check_md5
def test_read_raw_xmp():

    import subprocess
    import time
    output = subprocess.check_output("/usr/bin/exiftool {}|grep -i 'xmp toolkit'".format(test_img), shell=True)
    print("Looking at {}, I see {} ---->".format(test_img, output))  
    time.sleep(1) # so we dont fork to background and possibily overwrite it
    with Image(test_img) as img:
        diff_text(reference_data.RAW_XMP, img.read_raw_xmp())
text1 = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-...                                                                                                   <?xpacket end="w"?>'
text2 = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core ...                                                                                                   <?xpacket end="w"?>'

    def diff_text(text1: (str, bytes), text2: (str, bytes)):
        max_len = max(len(text1), len(text2))
        for i in range(max_len):
>           assert text1[i:i+1] == text2[i:i+1], "The two text is different at index {} :\n< {}\n> {}".format(i, text1[i:i+10], text2[i:i+10])
E           AssertionError: The two text is different at index 97 :
E           < XMP Core 4
E           > Adobe XMP

pyexiv2/tests/base.py:46: AssertionError
------------------------------------------------------------------------------ Captured stdout call ------------------------------------------------------------------------------
Looking at /tmp/pyexiv2/pyexiv2/tests/test.jpg, I see b'XMP Toolkit                     : Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55\n' ---->

This test will only pass if something horrible is wrong

dgtlmoon commented 3 years ago

I found that the MD5 value of test image was not changed, but Exiv2 returned raw XMP incorrectly.

OK.. getting somewhere... but I think your test is weird (https://github.com/LeoHsiao1/pyexiv2/issues/41#issuecomment-753612040)

dgtlmoon commented 3 years ago

Please change RAW_XMP in reference_data.py to be the same as what is in 1.jpg

$ strings ./pyexiv2/tests/1.jpg |grep Adobe
" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c128 79.159124, 2016/03/18-14:01:55        ">

It is wrong and should not beXMP Core 4.4.0-Exiv2

dgtlmoon commented 3 years ago

You will need to include the whitespace padding in that change btw..