paolobenve / myphotoshare

MOVED TO GITLAB! --- A Web 2.0 Photo Gallery Done Right via Static JSON, Dynamic Javascript and a bit of php for sharing

album.ini metadata has trouble with Python 2 when filenames have accented characters #79

Closed pmetras closed 6 years ago

pmetras commented 6 years ago

With Python 2, media files or directories whose names contain accented characters (more generally, non-ASCII characters) are not processed correctly when the scanner looks for user-defined metadata in the album.ini file. This does not happen with Python 3, where the documented behavior applies.

Here is an example of the problem. A directory contains a photo named La Côte-d'Azur.jpg. The user has created a section named [La Côte-d'Azur.jpg] in the album.ini file to define its metadata. The scanner does not match the file name with the section name, so it does not override the metadata with the user-defined values.
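To make the expected behavior concrete, here is a minimal Python 3 sketch of a section lookup by file name. It is not the scanner's actual code: the `user_metadata` helper, the `title` key and its values are hypothetical, chosen only for illustration.

```python
import configparser

# Hypothetical album.ini content; the "title" key is an assumption.
ALBUM_INI = """
[La Côte-d'Azur.jpg]
title = Sunset over the bay
"""


def user_metadata(ini_text, file_name):
    """Return the options of the section named after file_name, or {}."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if parser.has_section(file_name):
        return dict(parser.items(file_name))
    return {}


# Metadata found by scanning, then overridden by the user-defined values.
scanned = {"title": "La Côte-d'Azur"}
scanned.update(user_metadata(ALBUM_INI, "La Côte-d'Azur.jpg"))
```

In Python 3 both the file name and the section name are text strings, so the lookup succeeds and the user-defined title replaces the scanned one.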

I'll look at it this week-end.

paolobenve commented 6 years ago

Isn't it a problem with the album.ini encoding, i.e. the file not being UTF-8?

Anyway, is there any reason to maintain Python 2 compatibility?

pmetras commented 6 years ago

No, it is not the encoding: the scanner behaves differently on the same files depending on which version of Python is used.

Well, at least on my computer, OpenCV cv2 module is not yet available for Python 3. In a few months, it should be fine to forget about Python 2.

pmetras commented 6 years ago

The cause of the problem was that os.listdir() returns a unicode object while ConfigParser returns str even if the file contains UTF-8 content.

Example:

The file name is: Bas-relief du caducée d'Hermès.jpg
Name of file as obtained from os.listdir(): u"Bas-relief du caduc\xe9e d'Herm\xe8s.jpg"
Name of section in ConfigParser: "Bas-relief du caduc\xc3\xa9e d'Herm\xc3\xa8s.jpg"

The fix is to convert the file name to str (the UTF-8 encoded form that ConfigParser returns) before looking up section names.
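The mismatch and the conversion can be reproduced with the two values quoted above. This is only a sketch, not the actual patch: `to_utf8_bytes` is a hypothetical helper name, and it assumes album.ini is UTF-8 encoded. The `u""` and `b""` literals let it run on both Python 2 and Python 3.

```python
# Name as returned by os.listdir() in the report: a unicode object.
name_from_listdir = u"Bas-relief du caduc\xe9e d'Herm\xe8s.jpg"

# Section name as returned by ConfigParser under Python 2: UTF-8 bytes.
section_from_ini = b"Bas-relief du caduc\xc3\xa9e d'Herm\xc3\xa8s.jpg"


def to_utf8_bytes(name):
    """Encode text as UTF-8 bytes; return byte strings unchanged."""
    if isinstance(name, bytes):
        return name
    return name.encode("utf-8")


# Compared directly, the two values are never equal.
mismatch = name_from_listdir != section_from_ini

# After encoding the file name, the comparison succeeds.
match_after = to_utf8_bytes(name_from_listdir) == section_from_ini
```

Encoding the file name rather than decoding the section names keeps the change local to the lookup, at the cost of assuming UTF-8 for album.ini.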

paolobenve commented 6 years ago

good!