OmegaPhil / animecheck

Commandline CRC32, MD5 and eD2k-based hashing script also capable of reading and creating SFV and MD5 checksum files and generating eD2k links. This is an initial foray into python and github
GNU General Public License v3.0

UnboundLocalError on creating an SFV file for a large directory structure #13

Closed OmegaPhil closed 11 years ago

OmegaPhil commented 11 years ago
Traceback (most recent call last):
  File "/mnt/Storage_1/Desktop Files/Linux Programming/Python/animecheck/animecheck.py", line 1435, in <module>
    sfv_create_mode(args)
  File "/mnt/Storage_1/Desktop Files/Linux Programming/Python/animecheck/animecheck.py", line 1317, in sfv_create_mode
    % (checksumFileOutput, e, traceback.format_exc()))
UnboundLocalError: local variable 'checksumFileOutput' referenced before assignment

OmegaPhil commented 11 years ago

This was hiding the following error:

Traceback (most recent call last):
  File "/mnt/Storage_1/Desktop Files/Linux Programming/Python/animecheck/animecheck.py", line 1233, in sfv_create_mode
    files = recursive_file_search(files)
  File "/mnt/Storage_1/Desktop Files/Linux Programming/Python/animecheck/animecheck.py", line 445, in recursive_file_search
    for directory_path, _, directory_files in os.walk(path):
  File "/usr/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
  File "/usr/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
  File "/usr/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
  File "/usr/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
  File "/usr/lib/python2.7/os.py", line 284, in walk
    if isdir(join(top, name)):
  File "/usr/lib/python2.7/posixpath.py", line 71, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 1: ordinal not in range(128)
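
The two tracebacks above show a common masking pattern: an except handler formats a message with a local variable that was never assigned because the failure happened earlier in the try block. The following is a minimal sketch of that pattern and one way to avoid it, written for Python 3. The names `find_files`, `create_sfv_fragile` and `create_sfv_robust` are hypothetical stand-ins and are not taken from animecheck.py.

```python
import traceback


def find_files(paths):
    # Hypothetical stand-in for the file search step: it fails before any
    # output path has been chosen.
    raise ValueError("simulated failure while walking %r" % (paths,))


def create_sfv_fragile(paths):
    try:
        files = find_files(paths)
        checksum_file_output = "out.sfv"  # never reached when find_files raises
        return files, checksum_file_output
    except Exception as e:
        # checksum_file_output is unbound here, so building the message raises
        # UnboundLocalError and hides the original exception.
        return "Failed to write %s: %s" % (checksum_file_output, e)


def create_sfv_robust(paths):
    checksum_file_output = None  # bound before the try block
    try:
        files = find_files(paths)
        checksum_file_output = "out.sfv"
        return files, checksum_file_output
    except Exception as e:
        # The handler can now always build its message and report the real error.
        return "Failed to write %s: %s\n%s" % (
            checksum_file_output, e, traceback.format_exc())
```

Binding the variable before the try block is only one option; narrowing the try block so that it covers just the file writing would also keep the handler from referencing names that may not exist yet.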

OmegaPhil commented 11 years ago

This is happening because some directory names contain invalid bytes, which came about from extracting Japanese zips (presumably the names stored inside the zip are not kept in a sane encoding such as UTF-8, and are instead encoded in a Japanese locale encoding).

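Even though os.walk has an onerror parameter, it does not fully protect itself with try/except: onerror is only called when listing a directory fails. This error happens because a unicode path is passed. Names that cannot be decoded come back from the directory listing as byte strings, and when join combines the unicode path with one of them, Python 2 implicitly decodes the byte string as ASCII (see the 'ascii' codec in the traceback) and fails. When a byte string is passed instead (a plain Python 2 str, which is essentially a byte array), walk works, but the script dies later when attempting to write the checksum file.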

I probably need to make my own hardened walk function to detect and report on these invalid directories/files - I don't think they are in a state where I could actually deal with them properly, given how much I rely on standard path manipulation stuff in the code.
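
A hardened walk along these lines could look roughly like the sketch below. It is written for Python 3, assumes names are expected to be UTF-8, and is not the code that ended up in animecheck.py; `hardened_walk` and its `encoding` parameter are hypothetical. It walks the tree as bytes so that os.walk never has to decode a name, then separates paths that decode cleanly from those that do not.

```python
import os


def hardened_walk(top, encoding="utf-8"):
    """Walk top as bytes and return (valid_paths, invalid_paths).

    valid_paths are str file paths; invalid_paths are raw bytes paths of
    files and directories whose names do not decode with the given encoding.
    """
    valid, invalid = [], []
    for dir_path, dir_names, file_names in os.walk(os.fsencode(top)):
        # Prune undecodable directories so os.walk does not descend into them.
        kept = []
        for name in dir_names:
            full = os.path.join(dir_path, name)
            try:
                full.decode(encoding)
                kept.append(name)
            except UnicodeDecodeError:
                invalid.append(full)
        dir_names[:] = kept

        for name in file_names:
            full = os.path.join(dir_path, name)
            try:
                valid.append(full.decode(encoding))
            except UnicodeDecodeError:
                invalid.append(full)
    return valid, invalid
```

The invalid list can then be reported to the user, while the rest of the script keeps working with ordinary str paths. Pruning invalid directories means files beneath them are not listed individually, which matches the idea of reporting them rather than trying to process them.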

OmegaPhil commented 11 years ago

I now pass a byte string to walk and sanity check the output afterwards. This works in Python 2, but Python 3 needs a different approach.
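
For Python 3, one possible approach (an assumption on my part, not something the script is confirmed to do) relies on the fact that str paths returned by os.walk carry undecodable bytes as lone surrogates via the surrogateescape error handler. A strict encode then reveals the affected names. The helper names `has_undecodable_bytes` and `printable` are hypothetical.

```python
def has_undecodable_bytes(path):
    """Return True if a str path carries surrogate-escaped bytes."""
    try:
        path.encode("utf-8")  # strict mode: lone surrogates cannot be encoded
        return False
    except UnicodeEncodeError:
        return True


def printable(path):
    # Recover the original bytes, then replace the invalid ones so the path
    # can be reported or logged without raising.
    return path.encode("utf-8", "surrogateescape").decode("utf-8", "replace")
```

Paths flagged this way can be reported using printable() and skipped before any checksum file is written.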