dicer / auto-tatort

Small script to automatically download the current Tatort episodes from the ARD Mediathek (via cron)
GNU General Public License v3.0

Error when downloading episodes with UTF-8 characters in their names #12

Closed (psy-q closed 7 years ago)

psy-q commented 8 years ago

At least I think that's the problem:

DiskStation$ python autoTatort.py 
Traceback (most recent call last):
  File "autoTatort.py", line 218, in <module>
    if (os.path.isfile(fullFileName)) == True:
  File "/usr/local/lib/python2.7/genericpath.py", line 29, in isfile
    st = os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc4' in position 45: ordinal not in range(128)

dicer commented 8 years ago

I really hate Python for string handling...

If this happens again, please try changing line 217 to this: fullFileName = u"" + targetDir + fileName + ".mp4"

Or if you know your way around Python, try adding a u in front of the string literals. Kind of like this old commit, which never made it in because the problem doesn't happen for me: https://github.com/dicer/auto-tatort/commit/d47be7c7a88eb106cf89965a20e9712649b500f2
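
A minimal sketch of a different workaround, in case the u prefix does not help. The traceback is a UnicodeEncodeError, which suggests the path is already a unicode object and that Python fails when it encodes it with the locale's ASCII codec. Encoding the path explicitly sidesteps that. This assumes the filesystem stores names as UTF-8; the helper name to_fs_bytes and the sample directory and title are hypothetical and not taken from autoTatort.py.

```python
import os


def to_fs_bytes(path, encoding="utf-8"):
    """Return path as bytes, encoding text explicitly instead of via the locale."""
    if isinstance(path, bytes):
        # Already a byte string, pass it through unchanged.
        return path
    return path.encode(encoding)


# Hypothetical values, standing in for targetDir and fileName in the script.
target_dir = u"/tmp/auto-tatort-example/"
file_name = u"Tatort - \u00c4rger"
full_file_name = target_dir + file_name + u".mp4"

# os.path.isfile accepts a byte path, so no implicit ASCII encoding takes place.
print(os.path.isfile(to_fs_bytes(full_file_name)))
```

The same byte path would then have to be used wherever the file is opened or written, otherwise the error may simply move to a later line.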

psy-q commented 8 years ago

Hmm. This was using a Python version compiled for PPC on a Synology NAS. I've now tried autoTatort on a Raspberry Pi (mounting that NAS) and that seems to work fine. So it may be something about the Python version on the NAS, and autoTatort is innocent.

psy-q commented 7 years ago

I'm now having the issue again since I tried moving my autoTatort.py back to the NAS so it doesn't require an additional external device. I can reproduce it using a freshly installed Synology NAS with the latest DiskStation Manager (that's what Synology calls its OS) and the official Python package by the Python Foundation from the package manager.

The line numbers have shifted a little in the meantime:

Traceback (most recent call last):
  File "autoTatort.py", line 217, in <module>
    if (os.path.isfile(fullFileName)) == True:
  File "/usr/lib/python2.7/genericpath.py", line 37, in isfile
    st = os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 59: ordinal not in range(128)

Python is at 2.7.12. Would it be possible to perhaps specify an explicit codec so that the filename can be written, or to convert filenames with non-ASCII characters to ASCII? The filesystem itself can take UTF-8 fine, it's ext4.
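
For the second option, converting names to ASCII, a rough sketch could look like the following. It is only an illustration: the function name to_ascii_name is hypothetical and nothing like it exists in autoTatort.py. It decomposes accented characters with unicodedata and then drops whatever still is not ASCII.

```python
import unicodedata


def to_ascii_name(name):
    """Strip accents and drop any character that still is not ASCII."""
    # NFKD splits e.g. U+00FC into "u" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", name)
    # The combining marks are not ASCII, so "ignore" removes them.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_name(u"\u00c4rger in M\u00fcnchen"))  # prints "Arger in Munchen"
```

Characters without a decomposition, such as ß, are dropped entirely by this approach, so titles can end up looking odd. A lookup table for German umlauts (ä to ae and so on) would probably give nicer names.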

psy-q commented 7 years ago

Solved it, I think. The "Task Scheduler" component of a Synology NAS does not inherit the locale that is set system-wide in the settings. I have to set the "User-defined script" part of the scheduled task configuration as follows:

LC_ALL=en_US.UTF-8 /bin/python autoTatort.py

This makes it work. Various documentation says that you should never set LC_ALL unless you're debugging something, but I don't know which of the locale env vars is responsible for the filesystem encoding or the codec that Python chooses, so I simply set LC_ALL.
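
To narrow down which variable matters, a small diagnostic like the one below could be run once from a shell and once from the Task Scheduler. It is a sketch, and the function name locale_report is made up for this example. As far as I know, on POSIX systems Python derives the filesystem encoding from the character set of the LC_CTYPE locale category, with LC_ALL overriding it and LANG acting as the fallback, so setting only LC_CTYPE=en_US.UTF-8 may already be enough. I have not verified that on a Synology NAS.

```python
import locale
import os
import sys


def locale_report():
    """Collect the settings that influence how Python encodes filenames."""
    return {
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(),
    }


for key, value in sorted(locale_report().items()):
    print("%s = %s" % (key, value))
```

If the scheduled run reports an ASCII-like filesystem encoding while the shell run reports UTF-8, that would support the explanation that the scheduler does not inherit the system-wide locale.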