VoxelCubes / PanelCleaner

An AI-powered tool to clean manga panels.
GNU General Public License v3.0

[Windows] Unicode error with non-ascii letters in a file path #27

Closed NinjaScoutZ closed 8 months ago

NinjaScoutZ commented 9 months ago

Encountered an error while processing files.

<class 'UnicodeDecodeError'>: Traceback (most recent call last):
  File "pcleaner\gui\worker_thread.py", line 141, in run
  File "pcleaner\gui\mainwindow_driver.py", line 1064, in generate_output
  File "pcleaner\gui\processing.py", line 228, in generate_output
  File "pcleaner\preprocessor.py", line 74, in prep_json_file
  File "pathlib.py", line 1059, in read_text
  File "encodings\cp1252.py", line 23, in decode
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 352: character maps to <undefined>

'charmap' codec can't decode byte 0x9d in position 352: character maps to <undefined>

NinjaScoutZ commented 9 months ago

This problem occurs when I try to clean color manga (e.g. manhwa, manhua).

VoxelCubes commented 9 months ago

Hm, since the traceback comes from pathlib, I figure the issue lies with a file name of some kind. Also, I see that you're on Windows, which might play a role: Windows handles character encoding differently and doesn't use UTF-8 by default.

To solve this issue, I'd need a file that causes the problem. If it's really just the file path, you can even try renaming a blank white page to that name and see if it produces the same error. Or, if you don't wish to share the image publicly, feel free to send the image to my email, voxel.aur@gmail.com

VoxelCubes commented 9 months ago

All right, I've taken a look at your samples, and it is as I suspected. The "encodings\cp1252.py" entry in the traceback refers to cp1252, the legacy Windows character encoding, which is not compatible with UTF-8. So this problem only happens on Windows, not on other platforms, and is triggered by non-ASCII text (anything beyond plain English letters and symbols) being interpreted wrongly.
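
For illustration, the failure can be reproduced on any platform by forcing the decoder. This is only a minimal sketch under assumptions, not PanelCleaner's actual code: the file name "page.json", the folder name, and the "image_path" key are made up, and it assumes the file was written as UTF-8 and then read back as cp1252, which is what Path.read_text() typically falls back to on a Western-locale Windows install when no encoding is given.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical path containing curly quotes. The closing quote (U+201D) is
# encoded in UTF-8 as E2 80 9D, and byte 0x9d has no mapping in cp1252.
image_path = "C:/manga/Chapter \u201cOne\u201d/page01.png"

with tempfile.TemporaryDirectory() as tmp:
    json_file = Path(tmp) / "page.json"
    # Write the JSON as UTF-8, keeping non-ASCII characters unescaped.
    json_file.write_text(
        json.dumps({"image_path": image_path}, ensure_ascii=False),
        encoding="utf-8",
    )

    # Force cp1252 so the sketch behaves the same on every OS.
    try:
        json_file.read_text(encoding="cp1252")
        failed = False
        error_message = ""
    except UnicodeDecodeError as e:
        failed = True
        error_message = str(e)

    # Passing the encoding explicitly round-trips the path.
    restored = json.loads(json_file.read_text(encoding="utf-8"))["image_path"]

print(failed, error_message)
```

Passing encoding="utf-8" on both the write and the read side is one common way to avoid this class of error; whether that matches the eventual fix is not something this sketch claims.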

I'll have to spend some time debugging where exactly that goes wrong, but it'll be a few weeks. In the meantime, as a temporary workaround, I'd advise renaming your files and folders so that the entire file path, including all parent folders, contains only ASCII characters. Or use Linux, if you wish.
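
To find which parts of a path would need renaming for this workaround, something like the following might help. It is a small sketch: the helper name non_ascii_parts and the example paths are hypothetical, and it only inspects the path string, not the files themselves.

```python
from pathlib import PureWindowsPath
from typing import List


def non_ascii_parts(path: str) -> List[str]:
    """Return the components of a Windows path that contain non-ASCII characters."""
    # PureWindowsPath splits on backslashes regardless of the OS running this.
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]


# Hypothetical example: only the folder with non-ASCII characters is reported.
needs_renaming = non_ascii_parts(r"C:\manga\漫画\page01.png")
print(needs_renaming)
```

An empty list means every folder and the file name are already ASCII-only.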

VoxelCubes commented 8 months ago

Issue fixed in version 2.1.3. Let me know if it happens again.