bkettle / code-listing-generator

quickly generate PDFs of code across multiple files
MIT License
3 stars 0 forks

UnicodeDecodeError #6

Open lclarkg18 opened 1 year ago

lclarkg18 commented 1 year ago

I'm getting the error below when running the program. Ideally, when it hits a file it can't decode, it would skip that file and print a warning. Alternatively, it would be great if the error stated the file name and the line of the offending character. The cause is often something small, but it's very hard to find by checking file by file.

Traceback (most recent call last):
  File "/usr/local/bin/generate-code-listing", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/code_listing_generator/__init__.py", line 108, in main
    generate(name, title, recursive=args.recursive, prompt=args.prompt, make=args.make, copy=args.copy, print_tex=print_tex)
  File "/usr/local/lib/python3.11/site-packages/code_listing_generator/__init__.py", line 69, in generate
    file_contents = f.read()
                    ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 548: invalid start byte

Really appreciate the library, it's great!
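For anyone hitting the same error, the offending file and line can be found without going through files by hand. The following is a minimal sketch, not part of code-listing-generator: `find_decode_error` is a hypothetical helper name, and it assumes the files are expected to be UTF-8, as the traceback suggests. It reads the raw bytes, attempts to decode them, and uses the `start` offset carried by `UnicodeDecodeError` to work out the line and column.

```python
from pathlib import Path


def find_decode_error(path, encoding="utf-8"):
    """Return (line, column, byte) for the first undecodable byte, or None.

    Hypothetical helper for illustration; not part of the library.
    Line and column are 1-based, and the column is counted in bytes.
    """
    data = Path(path).read_bytes()
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first offending byte.
        line = data.count(b"\n", 0, err.start) + 1
        line_start = data.rfind(b"\n", 0, err.start) + 1
        column = err.start - line_start + 1
        return line, column, data[err.start]
    return None
```

Calling it on each candidate file and printing the result for any that return a tuple gives the file name, line, and byte value in one pass.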

lclarkg18 commented 1 year ago

ChatGPT generated this code for me to skip the files that can't be decoded, and it seems to work well: as far as I can tell, the remaining files have been added as listings correctly. I'm not sure how it works, though.

import os


def gen_filenames(prompt=False, path=".", depth=0, recursive=False):
    # iterate through all files and directories in the selected directory
    # save files and dirs
    this_dir_files = []
    this_dir_subdirs = []
    for filename in os.listdir(path=path):
        filename = os.path.join(path, filename)

        if os.path.isfile(filename):
            this_dir_files.append(filename)
        if os.path.isdir(filename) and recursive:
            this_dir_subdirs.append(filename)

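    # skip files that cannot be decoded, then ask user to confirm each remaining file
    for filename in this_dir_files:
        try:
            # read the file only to check that it decodes as UTF-8;
            # the contents themselves are not used here
            with open(filename, "r", encoding="utf-8") as f:
                f.read()
        except UnicodeDecodeError:
            print(f"Cannot decode file {filename}. Skipping.")
            continue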

        if prompt:
            if input("| "*depth + f"include file {filename}? (y) ") != 'y':
                continue
        yield filename

    # ask user if we should go into the next dir
    for dirname in this_dir_subdirs:
        if prompt:
            if input("| "*depth + f"enter directory {dirname}? (y) ") != 'y':
                continue
        # yield each file that the user chooses to accept from a lower level
        yield from gen_filenames(prompt=prompt, path=dirname, depth=depth+1, recursive=recursive)
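On how this works: opening a file in text mode decodes its bytes as it is read, so `f.read()` raises `UnicodeDecodeError` as soon as it meets a byte sequence that is not valid in the expected encoding. The `try`/`except` above does that read up front for each file and, on failure, prints a message and moves on, so the file name is never yielded and the later read in `generate` (the one shown in the traceback) never sees it.

If dropping the whole file is too coarse, an alternative is to keep the file and substitute the undecodable bytes. This is only a sketch of that option using the standard `errors="replace"` argument to `open`; `read_text_lenient` is a hypothetical name, and the library itself would need to call something like it where it currently reads the file.

```python
def read_text_lenient(path, encoding="utf-8"):
    """Read a file as text, replacing undecodable bytes with U+FFFD.

    Hypothetical helper for illustration; not part of the library.
    """
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return f.read()
```

The trade-off is that a genuinely binary file would then end up in the listing as mostly replacement characters, so skipping may still be the better default for those.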