aristippe / pathagar

Pathagar is a simple bookserver serving OPDS feeds
GNU General Public License v2.0

Fix for unicode filenames on commands #34

Closed (sinergatis closed 8 years ago)

sinergatis commented 8 years ago

Improve handling of files with Unicode filenames in the addepub and resync commands by decoding the arguments according to the file system encoding and enforcing the use of unicode literals.
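For readers following along, here is a minimal sketch of the kind of argument decoding described above. It is not the code from this pull request: `decode_arg` is a hypothetical helper name, and the fallback to UTF-8 is an assumption. On Python 2 the module would also start with `from __future__ import unicode_literals`, so that string literals in the command are text rather than bytes.

```python
import sys


def decode_arg(arg, encoding=None):
    """Return a command-line argument as text.

    Hypothetical helper, not taken from the Pathagar code base.
    Byte strings (what Python 2 puts in sys.argv) are decoded with the
    file system encoding; text values are returned unchanged.
    """
    if isinstance(arg, bytes):
        # Assumption: fall back to UTF-8 if the platform reports nothing.
        encoding = encoding or sys.getfilesystemencoding() or "utf-8"
        return arg.decode(encoding)
    return arg


if __name__ == "__main__":
    # Decode every argument once, at the edge of the command.
    for name in (decode_arg(a) for a in sys.argv[1:]):
        print(repr(name))
```

Decoding once at the command's entry point means the rest of the code only ever handles text, which avoids implicit ASCII conversions when paths are later joined or formatted.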

Please test it on the files that you encountered problems with, and merge if successful! Only Unicode in the source filename has been tested; I didn't explicitly test edge cases such as the cover file having a Unicode filename.

aristippe commented 8 years ago

Tried addepub and resync with a variety of characters (é, ç, à, ā, …) and it works great. Thank you!

aristippe commented 8 years ago

Maybe I spoke too soon. :) I had tried a combination of characters, and also a combination of addepub and resync. Addepub and resync themselves seem to be OK, but there is an issue that might be related to unzipping. Of the five or so common accents in European languages that I tried (I will test more), the acute accent, as in "café", has problems. It might be related to some of my recent changes to file name manipulation; I'll try to look at it sometime. It freezes the import, though the failure seems to happen somewhere else, perhaps in the Epub class. That class does various joins and other file name manipulations: for the temp dir, for the epub base_path (OEBPS/, OPS/, …), for extracting the cover, and, if the cover is an HTML file, for retrieving the image within it when the image has a relative path, such as combining OEBPS/Text/cover.html and ../Images/cover.jpg to get OEBPS/Images/cover.jpg.
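The relative-path step mentioned in the comment above can be sketched as follows. This is only an illustration under stated assumptions, not the Epub class's actual code: `resolve_href` is a hypothetical name, and the sketch assumes archive member names use forward slashes, which is why it uses `posixpath` rather than `os.path`.

```python
import posixpath


def resolve_href(document_path, href):
    """Resolve href relative to the archive member that references it.

    Hypothetical helper: document_path is the referencing file inside
    the epub (for example the cover HTML page), href is the relative
    link found in it.
    """
    base_dir = posixpath.dirname(document_path)
    # Join onto the referencing file's directory, then collapse "..".
    return posixpath.normpath(posixpath.join(base_dir, href))


print(resolve_href("OEBPS/Text/cover.html", "../Images/cover.jpg"))
```

Keeping both arguments as text (rather than mixing byte strings and text) matters here, since on Python 2 a join of the two kinds triggers an implicit ASCII decode that can fail on names such as "café".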

sinergatis commented 8 years ago

Hmm, can't say I tested that case, but it does sound like a problem. If you could try to isolate it and provide a sample epub somewhere that I can take a look at, that would help. It might be worth opening a unicode-problems issue, as we should check for problems with Unicode in other places, and the parent repo seems to mention it as well.

aristippe commented 8 years ago

I tried again and I couldn't reproduce it. Same files imported fine. I'll keep an eye out.