l3uddz / plex_dupefinder

Find and delete duplicate files in Plex
GNU General Public License v3.0

Unicode errors abound #21

Closed Anaerin closed 5 years ago

Anaerin commented 5 years ago

Getting a lot of these errors:

--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python35\lib\logging\__init__.py", line 982, in emit
    stream.write(msg)
  File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2605' in position 80: character maps to <undefined>
Call stack:
  File "plexdupes.py", line 359, in <module>
    log.info("Processing: %r", title)
Message: 'Processing: %r'
Arguments: ("The 100 - 01x01 - The 100 - S01E01 'Pilot' 720p  \u2605L@\u266bBerT\u2605",)
--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python35\lib\logging\__init__.py", line 982, in emit
    stream.write(msg)
  File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' in position 56: character maps to <undefined>
Call stack:
  File "plexdupes.py", line 359, in <module>
    log.info("Processing: %r", title)
Message: 'Processing: %r'
Arguments: ('Hawaii Five-0 - 05x16 - N\u0101nahu',)
--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python35\lib\logging\__init__.py", line 982, in emit
    stream.write(msg)
  File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' in position 61: character maps to <undefined>
Call stack:
  File "plexdupes.py", line 359, in <module>
    log.info("Processing: %r", title)
Message: 'Processing: %r'
Arguments: ("Hawaii Five-0 - 05x20 - 'Ike H\u0101nau",)
--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python35\lib\logging\__init__.py", line 982, in emit
    stream.write(msg)
  File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u016b' in position 61: character maps to <undefined>
Call stack:
  File "plexdupes.py", line 359, in <module>
    log.info("Processing: %r", title)
Message: 'Processing: %r'
Arguments: ('Hawaii Five-0 - 09x08 - Lele p\u016b n\u0101 manu like',)
--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python35\lib\logging\__init__.py", line 982, in emit
    stream.write(msg)
  File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u03c0' in position 61: character maps to <undefined>
Call stack:
  File "plexdupes.py", line 359, in <module>
    log.info("Processing: %r", title)
Message: 'Processing: %r'
Arguments: ('Person of Interest - 02x11 - 2πR',)

Some processing does occur, then it crashes with the error:

Traceback (most recent call last):
  File "plexdupes.py", line 377, in <module>
    print("\nWhich media item do you wish to keep for %r ?\n" % item)
  File "C:\Program Files\Python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2605' in position 93: character maps to <undefined>

Any way to fix this?

Anaerin commented 5 years ago

Note: This only happens on Windows. Python is probably being too cautious with its print() function: the Windows 10 console now supports Unicode, so output should not need to be converted to CP1252, which is where the failure seems to occur.
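
For reference, the failure can be reproduced without Plex. The sketch below encodes a shortened version of one title from the traceback with cp1252, then shows that an error handler such as `backslashreplace` avoids the exception. The helper name `safe_encode` is hypothetical and is not part of plex_dupefinder.

```python
# Shortened version of a title from the traceback; U+2605 and U+266B
# have no mapping in cp1252.
title = "Pilot 720p \u2605L@\u266bBerT\u2605"


def safe_encode(text, encoding="cp1252"):
    """Hypothetical helper: encode text, escaping unmappable characters."""
    return text.encode(encoding, errors="backslashreplace")


# Strict encoding (the default) raises, as in the log output above.
try:
    title.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With an error handler the characters are written as escape sequences instead.
escaped = safe_encode(title)
print(strict_failed)
print(escaped)
```

This only illustrates the mechanism; it does not change how the script itself prints or logs.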

desimaniac commented 5 years ago

Do you have any solutions?

Anaerin commented 5 years ago

Apparently, the answer is "Update to Python 3.6 or higher": https://www.python.org/dev/peps/pep-0528/
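
PEP 528 covers the interactive Windows console. When output goes through a pipe or an IDE run window, Python may still fall back to the locale code page. On Python 3.7 or higher, a text stream can be switched to UTF-8 explicitly. A minimal sketch, assuming a hypothetical helper named `make_unicode_safe` and using an in-memory stream in place of the real console:

```python
import io


def make_unicode_safe(stream, encoding="utf-8"):
    """Hypothetical helper: switch a text stream to an encoding that cannot fail."""
    # TextIOWrapper.reconfigure() is available on Python 3.7+.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding=encoding, errors="backslashreplace")
    return stream


# Stand-in for a console stream that starts out as cp1252.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="cp1252")

make_unicode_safe(stream)
stream.write("2\u03c0R")  # the "Person of Interest" title from the log
stream.flush()
print(buffer.getvalue())
```

In a script this would be called as `make_unicode_safe(sys.stdout)` before the first print(). Setting the environment variable `PYTHONIOENCODING=utf-8` before starting Python is another option that needs no code change.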

desimaniac commented 5 years ago

Thanks

bbakermmc commented 4 years ago

Using Python 3.8 and PyCharm, it doesn't work:

Found 950 dupes for section 'Movies'
--- Logging error ---
Traceback (most recent call last):
  File "C:\Users\bbaker\AppData\Local\Programs\Python\Python38-32\lib\logging\__init__.py", line 1084, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\bbaker\AppData\Local\Programs\Python\Python38-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 352-355: character maps to <undefined>
Call stack:
  File "C:/Users/bbaker/Desktop/plex_dupefinder-master/plex_dupefinder.py", line 364, in <module>
    log.info("ID: %r - Score: %s - Meta:\n%r", part.id, part_info.get('score', 'N/A'),
Message: 'ID: %r - Score: %s - Meta:\n%r'
Arguments: (869094, 64750, {'id': 869094, 'video_bitrate': 0, 'audio_codec': 'Unknown', 'audio_channels': 0, 'video_codec': 'Unknown', 'video_resolution': 'Unknown', 'video_width': 0, 'video_height': 0, 'video_duration': 0, 'file': ['/movies/Armour of God 1986 Bluray-1080p.AAC.x264.龙兄虎弟.mp4'], 'multipart': False, 'file_size': 6475069301, 'score': 64750, 'show_key': '/library/metadata/398766'})

bbakermmc commented 4 years ago

This looks like it fixes the error:


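import logging
import os
import sys

# Setup logger
log_filename = os.path.join(os.path.dirname(os.path.realpath(sys.argv[0])), 'activity.log')
logging.basicConfig(
    # filename=log_filename is replaced by an explicit handler so the encoding can be set;
    # basicConfig() does not accept both filename and handlers.
    handlers=[logging.FileHandler(log_filename, 'w', 'utf-8')],
    level=logging.DEBUG,
    format='[%(asctime)s] %(levelname)s - %(message)s',
    datefmt='%H:%M:%S'
)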
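
The snippet above fixes the encoding of the log file. A variation that also keeps console output might look like the sketch below. It is not the project's actual setup: `build_logger` and the logger name are hypothetical, a temporary directory stands in for the script directory, and an in-memory cp1252 stream stands in for the Windows console.

```python
import io
import logging
import os
import tempfile


def build_logger(log_path, console_stream):
    """Hypothetical helper: log to a UTF-8 file and to a console stream."""
    logger = logging.getLogger("dupefinder_demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    formatter = logging.Formatter(
        "[%(asctime)s] %(levelname)s - %(message)s", "%H:%M:%S"
    )
    for handler in (
        # The file is always written as UTF-8, whatever the console code page is.
        logging.FileHandler(log_path, mode="w", encoding="utf-8"),
        logging.StreamHandler(console_stream),
    ):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


# Stand-in for a cp1252 console that escapes characters it cannot encode.
raw_console = io.BytesIO()
console = io.TextIOWrapper(raw_console, encoding="cp1252", errors="backslashreplace")

log_path = os.path.join(tempfile.mkdtemp(), "activity.log")
log = build_logger(log_path, console)
log.info("Processing: %s", "Armour of God \u9f99\u5144\u864e\u5f1f")
console.flush()

with open(log_path, encoding="utf-8") as fh:
    file_text = fh.read()
console_bytes = raw_console.getvalue()
```

With this arrangement the file keeps the original characters, while the console shows escape sequences such as `\u9f99` instead of raising a logging error.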