dbr / tvnamer

Automatic TV episode file renamer, uses data from thetvdb.com via tvdb_api
https://pypi.python.org/pypi/tvnamer/
The Unlicense

Episodes with special characters #141

Closed phillipssc closed 3 years ago

phillipssc commented 6 years ago

This is very similar to the closed issue https://github.com/dbr/tvnamer/issues/16, but the fix for that issue was upgrading to the latest version, and I am already on a recent version (2.3). In my case, a returned episode title contains the "é" character, which causes the script to fail.

The command is run through a script and tvnamer is launched as follows:

/usr/local/bin/tvnamer -b -m --force-move -d "/home/sean/video/TV/%(seriesname)s/Season %(seasonnumber)d" "Another Period S03E02.mp4" 2>&1 | tee -a "/tmp/Another Period S03E02.log"

####################
# Starting tvnamer
# Found 1 episode
####################
# Processing file: Another Period S03E02.mp4
# Detected series: Another Period (season: 3, episode: 2)
####################
Old filename: Another Period S03E02.mp4
New filename: Another Period - [03x02] - Séance.mp4
New path: /home/sean/temp/daaef588/Another Period - [03x02] - Séance.mp4
Traceback (most recent call last):
  File "/usr/local/bin/tvnamer", line 9, in <module>
    load_entry_point('tvnamer==2.3', 'console_scripts', 'tvnamer')()
  File "build/bdist.linux-x86_64/egg/tvnamer/main.py", line 458, in main
  File "build/bdist.linux-x86_64/egg/tvnamer/main.py", line 364, in tvnamer
  File "build/bdist.linux-x86_64/egg/tvnamer/main.py", line 227, in processFile
  File "build/bdist.linux-x86_64/egg/tvnamer/main.py", line 92, in doRenameFile
  File "build/bdist.linux-x86_64/egg/tvnamer/utils.py", line 1088, in newPath
  File "/usr/lib/python2.7/genericpath.py", line 37, in isfile
    st = os.stat(path)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 53: ordinal not in range(128)

The tvnamer command line in the script has worked flawlessly so far, except for this one episode. Please let me know if you need any more info.
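The error at the bottom of the traceback can be reproduced in isolation. The following is a minimal sketch, assuming the failure comes from the path being encoded with the ASCII codec (which is what Python 2.7 does for unicode paths when the locale's filesystem encoding is ASCII, for example when the script runs without a UTF-8 locale set). That cause is an assumption drawn from the error message, not something confirmed in this thread. The sketch is written for Python 3 and only demonstrates the encoding step, not tvnamer itself.

```python
# The "New path" from the log above, with the e-acute written as an
# explicit escape so the example does not depend on the file's encoding.
path = "/home/sean/temp/daaef588/Another Period - [03x02] - S\u00e9ance.mp4"

failing_index = None
failing_char = None
try:
    # Assumption: this mirrors what happens when an ASCII filesystem
    # encoding is applied to the path before the os.stat() call.
    path.encode("ascii")
except UnicodeEncodeError as exc:
    failing_index = exc.start
    failing_char = exc.object[exc.start]

print(failing_index, repr(failing_char))
```

The index reported here lines up with "position 53" in the traceback, which points at the "é" in "Séance" rather than at anything in the directory part of the path.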

bhorstmann commented 6 years ago

I ran into this same issue. I managed to work around it by editing the following line in my config file:

    "normalize_unicode_filenames": true,

This does a simple conversion from é to e.

While I would prefer to have the unicode characters in my filenames, this was an acceptable workaround for me.
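To illustrate the kind of conversion described above, here is a rough sketch using the standard library's `unicodedata` module. The function name `strip_accents` is hypothetical, and this is only one common way to turn "é" into "e"; it is not claimed to be how tvnamer implements `normalize_unicode_filenames`.

```python
import unicodedata


def strip_accents(name):
    """Return name with accented letters reduced to their ASCII base letters.

    Hypothetical helper for illustration only.
    """
    # NFKD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Dropping everything outside ASCII removes the combining accent
    # (and any other character that has no ASCII form).
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("Another Period - [03x02] - S\u00e9ance.mp4"))
```

Note that this approach silently drops characters with no ASCII decomposition, which is part of why it is a workaround rather than a fix.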

dbr commented 6 years ago

Hmm, could you test with the latest release, version 2.5? I don't recall whether anything related to non-ASCII support changed since 2.3, but the latest version appears to work for me under both Python 2.7 and 3.7:

$ touch 'Another Period S03E02.mp4'
$ tvnamer Another\ Period\ S03E02.mp4
####################
# Starting tvnamer
# Found 1 episode
####################
# Processing file: Another Period S03E02.mp4
# Detected series: Another Period (season: 3, episode: 2)
TVDB Search Results:
1 -> Another Period [en] # http://thetvdb.com/?tab=series&id=284219&lid=7 (default)
Automatically selecting only result
####################
Old filename: Another Period S03E02.mp4
New filename: Another Period - [03x02] - Séance.mp4
Rename?
([y]/n/a/q)
Renaming
New path: /Users/dbr/code/tvnamer/Another Period - [03x02] - Séance.mp4
rename /Users/dbr/code/tvnamer/Another Period S03E02.mp4 to /Users/dbr/code/tvnamer/Another Period - [03x02] - Séance.mp4

lamahmud commented 6 years ago

I've also noticed this issue with names containing an apostrophe ('). An example is Tom Clancy's Jack Ryan, which tvnamer fails to find on tvdb, and when you force it with the --series-id= flag, it names it "Tom Clancy& # 39 ;s Jack Ryan" <-- I had to add spaces between the &, the # and the 39 because apparently this form properly translates that to an apostrophe.
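The garbled name above looks like an HTML-escaped apostrophe (the numeric character reference for code point 39) left in the series name. As a small sketch of what decoding it would look like, Python's standard `html.unescape` turns such references back into characters. Whether tvnamer or tvdb_api should apply this step, and where, is an assumption and not something settled in this thread.

```python
import html

# Series name as reported, without the spaces that were added
# to stop the comment form from rendering the entity.
escaped_name = "Tom Clancy&#39;s Jack Ryan"

# html.unescape converts numeric and named character references
# (such as &#39; or &amp;) back into the characters they stand for.
clean_name = html.unescape(escaped_name)

print(clean_name)
```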

DJ73 commented 4 years ago

I noticed a similar issue with '.'

Example: Magnum P. I. failed to be identified. It searched for Magnum P I and didn't find any results for some reason. It works fine if I force the series name.

dbr commented 3 years ago

Closing old ticket. I don't think the encoding issue should still be a problem, especially in the latest v4.0-dev under Python 3, as none of the ASCII string code will exist any more.

> Example: Magnum P. I. failed to be identified. It searched for Magnum P I and didn't find any results for some reason. It works fine if I force the series name.

This is a different issue. I'm not sure if it has improved in the last year or so, but the TVDB API search is fairly basic and can be confused by differing punctuation (or even things like "&" versus "and"). There is not much tvnamer can do to work around this, as a fix which might work for one show may break another.
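To make the punctuation problem concrete, here is a rough sketch of the sort of client-side normalization one might try before comparing titles. The function `normalize_title` and its rules are hypothetical; they are not part of tvnamer or the TVDB API, and as noted above, rules like these can help one show while hurting another.

```python
import re


def normalize_title(title):
    """Build a loose comparison key for a series title (hypothetical helper)."""
    # Treat "&" as the word "and" so both spellings compare equal.
    key = title.lower().replace("&", " and ")
    # Replace every run of other punctuation with a single space.
    key = re.sub(r"[^a-z0-9 ]+", " ", key)
    # Collapse repeated whitespace left behind by the substitutions.
    return " ".join(key.split())


print(normalize_title("Magnum P. I."))
print(normalize_title("Magnum P.I."))
print(normalize_title("Law & Order"))
```

The trade-off is that the key throws information away: any two distinct titles that differ only in punctuation would collapse to the same key, which is one way a rule that rescues "Magnum P. I." could mismatch a different show.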