pywikibot-catfiles / file-metadata

A python package to analyze files and provide useful metadata
MIT License

Possible UnicodeDecodeError on repeated to_cstr use or use on already cstr string #39

Closed: zhuyifei1999 closed this issue 8 years ago

zhuyifei1999 commented 8 years ago

https://github.com/AbdealiJK/file-metadata/blob/1c0de640a142a50b819bff12345f3b9ee548be63/file_metadata/utilities.py#L50:

value.encode('utf-8')

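But in Python 2, calling encode() on a value that is already a byte string first decodes it implicitly as ASCII, which fails for non-ASCII bytes: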

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u"啊"
u'\u554a'
>>> u"啊".encode('utf-8')
'\xe5\x95\x8a'
>>> u"啊".encode('utf-8').encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)

AbdealiLoKo commented 8 years ago

Fixed in https://github.com/AbdealiJK/file-metadata/commit/c755e751fb2259aa5cafdd5d1f6fc097f5698aa7
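
For readers who cannot open the commit, here is a minimal sketch of one way to make such a conversion safe to call repeatedly. It is not taken from the linked commit: the function name `to_cstr` comes from the issue title, but the signature, the `encoding` parameter and the body are assumptions made for illustration. The idea is to return byte strings unchanged and to encode only text values.

```python
def to_cstr(value, encoding='utf-8'):
    """Return `value` as a byte string (hypothetical sketch, not the project's code).

    Byte strings are returned unchanged, so calling this twice is harmless.
    On Python 2, `bytes` is an alias for `str`, so the same check covers both
    Python 2 and Python 3.
    """
    if isinstance(value, bytes):
        # Already encoded: encoding again on Python 2 would trigger an
        # implicit ASCII decode and raise UnicodeDecodeError.
        return value
    return value.encode(encoding)


once = to_cstr(u"\u554a")   # the character from the report, written as an escape
twice = to_cstr(once)       # second call is a no-op
```

With this guard, `twice` is the same byte string as `once`, where the unguarded `value.encode('utf-8')` would raise on the second call under Python 2.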