JohnDoee / autotorrent

Matches torrents with files and gets them seeded
MIT License

UnicodeEncodeError: 'utf-8' codec can't encode characters in position 17-22: surrogates not allowed #55

Open datawhores opened 2 years ago

datawhores commented 2 years ago

Can we use one of these error handlers: surrogatepass, replace, or surrogateescape?

https://docs.python.org/3/library/codecs.html

I ended up using surrogatepass, and it seems to have fixed the issue. At the very least, the program doesn't crash.
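For anyone comparing the handlers mentioned above, here is a minimal sketch of how each one treats a string containing a lone surrogate. The file name and the helper try_encode are made up for illustration and are not part of autotorrent.

```python
# Build a name with a lone surrogate, the way undecodable filename bytes
# can end up in a str: 0xE9 is not valid UTF-8 here, so it becomes U+DCE9.
name = b"caf\xe9.mkv".decode("utf-8", errors="surrogateescape")


def try_encode(text, handler):
    """Encode text as UTF-8 with the given error handler; return None on failure."""
    try:
        return text.encode("utf-8", errors=handler)
    except UnicodeEncodeError:
        return None


handlers = ("strict", "surrogatepass", "surrogateescape", "replace")
results = {handler: try_encode(name, handler) for handler in handlers}

for handler, encoded in results.items():
    print(handler, encoded)
```

The default (strict) fails with the error from the issue title. surrogatepass writes the surrogate out as bytes, surrogateescape restores the original byte, and replace substitutes a question mark, which loses information.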

samad909 commented 2 years ago

@excludedBittern8 Could you let us know how you fixed it?

For those who are stuck with this issue, here are some ways to get it working.

Python 2: Solution from issue #35. Add the following line to the normalize_filename() function in db.py, before the return statement: filename = filename.decode('utf-8')

Python 3: A fork of this code patches 3 files. Patch your install accordingly, or just use that fork: https://github.com/shoghicp/autotorrent/commit/fef9f6a82d6f18dbf51d49b55600fdd51fbad520

datawhores commented 2 years ago

def keyify(self, size, *names):
    """
    Turns a name and size into a key that can be stored in the database.
    """
    key = '%s|%s' % (size, '|'.join(names))
    logger.debug('Keyify: %s' % key)

    return hashlib.sha256(key.encode('utf-8', errors='surrogatepass')).hexdigest()
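
To try the change outside the project, here is a standalone sketch of the same idea. It is not the project's actual method: self and the logger call are dropped, and the sizes and names are made up.

```python
import hashlib


def keyify(size, *names):
    """Standalone variant of the patched method (no self, no logging)."""
    key = "%s|%s" % (size, "|".join(names))
    # surrogatepass lets lone surrogates through instead of raising
    # UnicodeEncodeError, so names with undecodable bytes still hash.
    return hashlib.sha256(key.encode("utf-8", errors="surrogatepass")).hexdigest()


# Hypothetical names that differ only in their undecodable byte.
name_a = b"caf\xe9.mkv".decode("utf-8", errors="surrogateescape")
name_b = b"caf\xea.mkv".decode("utf-8", errors="surrogateescape")

digest_a = keyify(1024, "season1", name_a)
digest_b = keyify(1024, "season1", name_b)
print(digest_a)
print(digest_b)
```

One reason to prefer surrogatepass over replace here: replace would turn both bad characters into a question mark, so the two different names would produce the same key. With surrogatepass the keys stay distinct.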