RalfPeter closed this issue 1 month ago
Thank you for sharing. ISO-8859-1 is a single-byte encoding in which every byte value from 0x00 to 0xFF maps to a character. Bytes in any encoding can therefore be decoded as ISO-8859-1 without raising an error, but the result may be meaningless text rather than the original content. For example:
>>> data = '你好'.encode('gbk')
>>> data.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 0: invalid continuation byte
>>> data.decode('gbk')
'你好'
>>> data.decode('ISO-8859-1')
'ÄãºÃ'
In conclusion, ISO-8859-1 is not a one-size-fits-all encoding format. Users are advised to find the original encoding format of the data.
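One way to act on this advice is to try the stricter candidate encodings first and fall back to ISO-8859-1 only as a last resort. The following is a minimal sketch that works on raw bytes, not part of pyexiv2; the function names `decode_with_fallback` and `repair_latin1_mojibake` and the default candidate list are assumptions made for illustration:

```python
from typing import Iterable, Tuple


def decode_with_fallback(
    data: bytes,
    candidates: Iterable[str] = ("utf-8", "gbk"),  # hypothetical default order
) -> Tuple[str, str]:
    """Return (text, encoding_used), trying each strict candidate in order."""
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes; try the next one
    # ISO-8859-1 accepts every byte value, so this never raises,
    # but the text may be mojibake if the real encoding was something else.
    return data.decode("iso-8859-1"), "iso-8859-1"


def repair_latin1_mojibake(text: str, real_encoding: str) -> str:
    """Undo a wrong ISO-8859-1 decode once the real encoding is known.

    Encoding back to ISO-8859-1 restores the original bytes exactly,
    which can then be decoded with the correct codec.
    """
    return text.encode("iso-8859-1").decode(real_encoding)


if __name__ == "__main__":
    data = "你好".encode("gbk")
    print(decode_with_fallback(data))
    print(repair_latin1_mojibake(data.decode("iso-8859-1"), "gbk"))
```

The order of the candidates matters: a permissive multi-byte codec such as GBK can also decode bytes that were never GBK without raising, so a successful decode is not proof that the encoding is right. The second helper shows that a wrong ISO-8859-1 decode is reversible, because the original bytes can be recovered from the string.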
I am closing this issue now. Hopefully others with the same problem will find it when they search.
Thank you, Leo. My only intention was that anybody with the same problem would have some advice on how to handle encodings with your pyexiv2. Thank you for your additional comment.
Good morning. This is for all those who have encountered difficulties using the functions read_exif, read_xmp, and read_iptc. From time to time, my script crashed with a runtime error without any further error message. I suspected the cause was in exiv2's handling of encodings and experimented with different images. I discovered that some images contained metadata that was not UTF-8 encoded, but rather ISO-8859-1. So I wrote the following routines (analogously, of course, for XMP, IPTC, and comments as well). Perhaps someone will find them helpful: