JohnDoee / autotorrent

Matches torrents with files and gets them seeded
MIT License

Looks like utf-8 is having issues with some torrents from WCD #7

Closed by b0xy 9 years ago

b0xy commented 9 years ago

INFO:autotorrent:Handling file /home/btn/wcd/Bill Evans Trio - Explorations - 2000 (CD - MP3 - 320)-30036626.torrent
INFO:autotorrent:Found name u'Bill Evans - Explorations [320]' for torrent
Traceback (most recent call last):
  File "./autotorrent", line 9, in <module>
    load_entry_point('autotorrent==1.6.0', 'console_scripts', 'autotorrent')()
  File "/home/mgmt/autotorrent-env/local/lib/python2.7/site-packages/autotorrent/cmd.py", line 214, in commandline_handler
    result = at.handle_torrentfile(os.path.join(current_path, torrent), dry_run)
  File "/home/mgmt/autotorrent-env/local/lib/python2.7/site-packages/autotorrent/at.py", line 417, in handle_torrentfile
    found_size, missing_size, files = self.parse_torrent(torrent)
  File "/home/mgmt/autotorrent-env/local/lib/python2.7/site-packages/autotorrent/at.py", line 308, in parse_torrent
    files = self.index_torrent(torrent)
  File "/home/mgmt/autotorrent-env/local/lib/python2.7/site-packages/autotorrent/at.py", line 232, in index_torrent
    orig_path = [x.decode('utf-8') for x in f[b'path'] if x] # remove empty fragments
  File "/home/mgmt/autotorrent-env/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 27: invalid start byte

JohnDoee commented 9 years ago

I've been unable to figure out exactly what is wrong. It looks like the encoding is wrong.

In theory, all torrents should be proper utf-8, but reality probably doesn't align :)
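For illustration, here is a minimal sketch of this failure mode. The byte string is a made-up path fragment (the real torrent's bytes are not shown in this thread), and the snippet is written for Python 3 rather than the Python 2.7 in the traceback. It assumes the fragment was written in a single-byte encoding such as Latin-1, which is only a guess.

```python
# Hypothetical path fragment containing a byte that is not valid UTF-8 here.
fragment = b"testfile\xbf testtest"

error = None
try:
    fragment.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xbf is a UTF-8 continuation byte, so it cannot start a character.
    error = exc
    print(exc.reason, "at position", exc.start)

# Latin-1 maps every byte to a code point, so this decode cannot fail,
# although the result is only correct if Latin-1 was the real encoding.
print(fragment.decode("latin-1"))
```

That Latin-1 never raises is also why it cannot be used to detect the encoding: it will happily produce wrong text for bytes that were written in some other codec.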

Anyway, I added a bit of debug logging to the develop branch. Could you please update to that and try adding the torrent again with the --verbose argument?

Just before the exception, it should print a line like:

DEBUG:autotorrent:Handling torrent file {'path': ['testfile\xbf testtest'], 'length': 100}

That's the line I want to see. I'll work towards handling awkward encodings gracefully.

JohnDoee commented 9 years ago

Made an attempt at a fix based on what I could Google about encodings in torrents.
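As a rough sketch of what graceful handling could look like (not necessarily what the actual fix does): try UTF-8 first, then an encoding declared by the torrent if there is one, and finally fall back to replacing undecodable bytes. The function names and the `declared_encoding` parameter are hypothetical. Some clients write an `encoding` key into the metainfo, but whether a given torrent has one is an assumption. Written for Python 3.

```python
def decode_fragment(raw, declared_encoding=None):
    """Decode one path fragment, trying stricter options before lenient ones."""
    candidates = ["utf-8"]
    if declared_encoding:
        candidates.append(declared_encoding)
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong encoding, or a codec name Python does not know.
            continue
    # Last resort: keep going and mark undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def decode_path(file_entry, declared_encoding=None):
    """Decode the b'path' list of a file entry, dropping empty fragments."""
    return [decode_fragment(x, declared_encoding) for x in file_entry[b"path"] if x]


entry = {b"path": [b"dir", b"", b"testfile\xbf testtest"], b"length": 100}
print(decode_path(entry))
print(decode_path(entry, declared_encoding="latin-1"))
```

One caveat with the last-resort branch: a name containing replacement characters will probably not match the file name on disk, so it avoids the crash but may not find the file.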

JohnDoee commented 9 years ago

Reopen or create a new ticket if the issue persists.