LeoHsiao1 / pyexiv2

Read and write image metadata, including EXIF, IPTC, XMP, ICC Profile.
GNU General Public License v3.0

Cannot read/write image in a folder with cyrillic name, on windows #21

Closed Fedik closed 4 years ago

Fedik commented 4 years ago

The library cannot read or write an image in a folder whose name contains Cyrillic characters on Windows; on Linux it works very well.

A small example:

import pyexiv2

# read (raw string, so the backslashes are not treated as escape sequences)
image_path = r"C:\Users\TheUser\Desktop\test\тест призначення\electromecánico\good bike.JPG"
img = pyexiv2.Image(image_path)
metadata = img.read_xmp()
print(metadata)

# write
keywords = ["foo", "bar"]
new_meta = {'Xmp.dc.subject': keywords}
img.modify_xmp(new_meta)

This produces the error: Failed to open the data source: No such file or directory (errno = 2)

Traceback:

Traceback (most recent call last):
  File "C:\py-dev\test-meta\edit-xmp.py", line 15
    img = pyexiv2.Image(image_path)
  File "C:\py-dev\test-meta\env\lib\site-packages\pyexiv2\core.py", line 16, in __init__
    self.img = api.open_image(filename.encode(encoding))
RuntimeError: C:\Users\TheUser\Desktop\test\тест призначення\electromecánico\good bike.JPG: Failed to open the data source: No such file or directory (errno = 2)

api.open_image() does not see the file for some reason. If I use a path without Cyrillic characters, everything works.

LeoHsiao1 commented 4 years ago

This error, No such file or directory, is usually reported for the following reasons:

What you're dealing with is obviously the second case.

Most of the functions provided by pyexiv2 include a default parameter: encoding='utf-8'. If you encounter an error because the data contains non-ASCII characters, please try changing the encoding parameter. For example:

img = pyexiv2.Image(image_path, encoding='utf-8')
img = pyexiv2.Image(image_path, encoding='ISO 8859-5')

In addition, the default encoding for your Windows system may not be utf-8, so you cannot decode in utf-8.

Fedik commented 4 years ago

What you're dealing with is obviously the second case.

Yeah, I suspected that, but I don't get why :wink: because

file = open(image_path, "r")

works very well, without any encoding (it is detected internally).

Is there a way to pass the file object to a pyexiv2.Image instance? That could serve as a workaround.

In addition, the default encoding for your Windows system may not be utf-8, so you cannot decode in utf-8.

That is also an idea. I will try locale.getpreferredencoding(), see what it does, and write back.

LeoHsiao1 commented 4 years ago

In fact, pyexiv2 calls the C++ API of exiv2. Therefore, only one string representing the path of the file can be passed. Only with the correct encoding can the string be successfully identified.

Fedik commented 4 years ago

Okay, I have tried locale.getpreferredencoding() but it does not help. It says the default is cp1252, but if I use that, the encoding fails with EncoderError.

In fact, pyexiv2 calls the C++ API of exiv2

I understood,

However, how does the native open() work with the exact same path without any path encoding? I even tried to use the path from it:

file = open(path, 'r')
img = pyexiv2.Image(file.name)

but still no luck

LeoHsiao1 commented 4 years ago

As far as I know, executing chcp in Windows CMD will display a number, which represents some kind of encoding format. As for Python's built-in function open(), it might guess the encoding format automatically. The Python3 interpreter supports Unicode characters very well, but C++ can only recognize ASCII code characters after encoding.
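
To look at what Python itself does with such a path, here is a minimal sketch using only the standard library. The helper name `describe_path_encodings` is made up for illustration. One fact worth knowing here: since Python 3.6 the filesystem encoding on Windows is UTF-8 (PEP 529), whatever code page chcp prints, so the locale encoding and the filesystem encoding can differ.

```python
import locale
import os
import sys
import tempfile


def describe_path_encodings(path):
    """Return the encodings Python reports, plus the bytes form of `path`."""
    return {
        # Encoding Python uses to convert between str and bytes file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of the user's locale (for example cp1252 on some Windows setups).
        "locale": locale.getpreferredencoding(False),
        # The bytes Python itself would use for this path.
        "path_bytes": os.fsencode(path),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        folder = os.path.join(root, "тест призначення")
        os.mkdir(folder)
        path = os.path.join(folder, "good bike.JPG")
        with open(path, "wb") as f:
            f.write(b"placeholder, not a real image")

        info = describe_path_encodings(path)
        print(info["filesystem"], info["locale"])
        # The str form and the fsencode()d bytes form name the same file.
        print(os.path.exists(path), os.path.exists(info["path_bytes"]))
```

If the two reported encodings differ, a path encoded with the locale encoding is not necessarily the byte string that a C or C++ library expects, which may be part of what is happening in this thread.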

LeoHsiao1 commented 4 years ago

On Linux systems the default encoding is usually utf-8, so you can use utf-8 to encode file paths.

leo@Leo:~$ echo $LANG
C.UTF-8

Fedik commented 4 years ago

Hm, with os.path.exists(path.encode('utf-8')) it says True.

I tried the default encoding cp1252, but it is not able to encode Cyrillic. If I use cp1251, it is able to encode Cyrillic, but pyexiv2.Image still fails with an exception.

well, okay
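
The codec behaviour described in this comment can be checked without pyexiv2. The sketch below uses a hypothetical helper, `codecs_that_can_encode`, to test which of a few candidate codecs can represent a piece of text. Note that the example path at the top of the thread mixes Cyrillic with "á", so no single-byte code page in this list covers all of it.

```python
def codecs_that_can_encode(text, codecs=("utf-8", "cp1251", "cp1252", "iso8859-5")):
    """Return the subset of `codecs` able to encode every character of `text`."""
    usable = []
    for name in codecs:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no representation in this codec.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    for part in ("тест призначення", "electromecánico",
                 "тест призначення/electromecánico"):
        print(part, "->", codecs_that_can_encode(part))
```

Being able to encode the text is necessary but not sufficient: the resulting bytes also have to be what the underlying library and the operating system expect for that path.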

Fedik commented 4 years ago

@LeoHsiao1 one more question

Would it be possible to implement this method, static Image::AutoPtr open (const byte *data, long size), to load the image from binary data?

So that I can do:

with open("in-file", "rb") as the_file:
    data = the_file.read()

img = pyexiv2.Image(binary_data=data)  # proposed keyword, not an existing API

I think that would be a nice feature.

LeoHsiao1 commented 4 years ago

It looks like it can be done, and I'll try to do it within the next week.

LeoHsiao1 commented 4 years ago

I've implemented this method and will release a new version in a few days.

LeoHsiao1 commented 4 years ago

Finally, I released v2.3.0. It adds the class ImageData, which is used to open an image from bytes data. I ran into some bugs when transferring bytes data, which caused me to take a week longer than I expected.

Fedik commented 4 years ago

@LeoHsiao1 thanks! works very well, I just tested

kolt54321 commented 4 years ago

Hi @LeoHsiao1 ! I'm having a bit of trouble following how to use the new class to solve cyrillic/other non-ascii names. The code right now looks like this:

from pyexiv2 import Image as TaggedImage
xmp_file_obj = TaggedImage(final_image_filename)  # image exists, but the name has non-ASCII letters
xmp_file_obj.modify_xmp(info.metadata)  # XMP data to embed

Is it as simple as changing the first line to:

from pyexiv2 import ImageData as TaggedImage

Unfortunately it still doesn't work for me...
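
Swapping the import alone would not be expected to work, because, as stated above, ImageData opens an image from bytes data rather than from a file path. A minimal sketch of the bytes-based approach follows. The helper names `update_file_bytes` and `with_xmp` are made up for illustration, and the sketch assumes that pyexiv2 >= 2.3.0 provides `ImageData.modify_xmp()`, `ImageData.get_bytes()` and `ImageData.close()`; please check these against the documentation of your installed version. Python opens the file itself, so the non-ASCII path never reaches exiv2.

```python
def update_file_bytes(path, transform):
    """Read `path` with Python's own I/O, apply `transform`, write the result back."""
    with open(path, "rb+") as f:
        new_data = transform(f.read())
        f.seek(0)
        f.truncate()  # drop the old content in case the new data is shorter
        f.write(new_data)


def with_xmp(changes):
    """Build a bytes -> bytes transform that applies XMP `changes` via pyexiv2."""
    def transform(data):
        # Third-party dependency, imported lazily so update_file_bytes()
        # stays usable without it. Assumed API: see the note above.
        import pyexiv2
        img = pyexiv2.ImageData(data)
        try:
            img.modify_xmp(changes)
            return img.get_bytes()
        finally:
            img.close()
    return transform


# Usage with the names from the snippet above:
# update_file_bytes(final_image_filename, with_xmp(info.metadata))
```

Unlike Image, which modifies the file in place, this approach only changes a copy of the bytes in memory, so the modified bytes have to be written back explicitly.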

github-actions[bot] commented 3 years ago

This issue has been automatically closed because there has been no activity for a month.