abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

wrong encoding crashes the app when trying to print the __header__ #106

Closed FernandoCamporredondo closed 5 months ago

FernandoCamporredondo commented 5 months ago

I found out that if you run subsai from a Node.js app while your Windows machine's encoding is set to Japanese (cp932), printing the header fails with the error message "'cp932' codec can't encode character '\u2588'". This happens when the library tries to print the header at line 118 of cli.py.

I don't know Python, but from what I have found on Stack Overflow, adding `.encode('utf-8')` to the header seems to fix it on my machine.

__header__ = f"""
███████╗██╗   ██╗██████╗ ███████╗     █████╗ ██╗
██╔════╝██║   ██║██╔══██╗██╔════╝    ██╔══██╗██║
███████╗██║   ██║██████╔╝███████╗    ███████║██║
╚════██║██║   ██║██╔══██╗╚════██║    ██╔══██║██║
███████║╚██████╔╝██████╔╝███████║    ██║  ██║██║
╚══════╝ ╚═════╝ ╚═════╝ ╚══════╝    ╚═╝  ╚═╝╚═╝

Subs AI: Subtitles generation tool powered by OpenAI's Whisper and its variants.
Version: {__version__}               
===================================
""".encode('utf-8')
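
For readers who want to reproduce the failure without a Japanese-locale console, here is a minimal sketch. The helper names `can_encode` and `print_to_stream` are hypothetical and not part of subsai; the in-memory stream only stands in for a console whose encoding is cp932. Note also that calling `.encode('utf-8')` turns the header into a `bytes` object, so `print` shows its ASCII-only `b'...'` representation rather than the banner itself, which may be why the error goes away.

```python
import io

BLOCK = "\u2588"  # FULL BLOCK, the character named in the error message


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def print_to_stream(text: str, encoding: str, errors: str = "strict") -> bytes:
    """Print `text` to an in-memory text stream that mimics a console
    using `encoding`, and return the raw bytes that were written."""
    raw = io.BytesIO()
    # newline="\n" disables platform-specific newline translation
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


print(can_encode(BLOCK, "utf-8"))   # the character exists in UTF-8
print(can_encode(BLOCK, "cp932"))   # but not in cp932
# With errors="replace" the stream substitutes "?" instead of raising
print(print_to_stream(BLOCK, "cp932", errors="replace"))
```

On Python 3.7 and later, one possible alternative that keeps the header a `str` is to call `sys.stdout.reconfigure(encoding="utf-8", errors="replace")` before printing, or to set the `PYTHONIOENCODING=utf-8` environment variable in the process that launches subsai. Whether either fits this project is an assumption, not something confirmed in this thread.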
abdeladim-s commented 5 months ago

Yes, I forgot to take encoding into consideration. I have applied the changes accordingly. Thanks a lot @FernandoCamporredondo for pointing that out.

FernandoCamporredondo commented 4 months ago

@abdeladim-s

I recently found out that if a file name contains certain characters, such as "–" (U+2013), it will also crash.

I made some changes that worked for me. Here is the pull request with them: https://github.com/abdeladim-s/subsai/pull/112

Without those changes, I was getting this error message:

Traceback (most recent call last):
  File "C:\Users\Usuario\.pyenv\pyenv-win\versions\3.10.0\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Usuario\.pyenv\pyenv-win\versions\3.10.0\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Usuario\.pyenv\pyenv-win\versions\3.10.0\Scripts\subsai.exe\__main__.py", line 7, in <module>
  File "C:\Users\Usuario\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\subsai\cli.py", line 149, in main
    run(media_file_arg=args.media_file,
  File "C:\Users\Usuario\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\subsai\cli.py", line 84, in run
    print(f"[+] Processing file: {file}")
UnicodeEncodeError: 'cp932' codec can't encode character '\u2013' in position 172: illegal multibyte sequence
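
One way to keep a status line like the one in the traceback from crashing is to escape any character the output stream cannot represent. The sketch below is only an illustration: `safe_print` is a hypothetical helper, the file name is made up, and none of this is taken from the pull request linked above.

```python
import io
import sys


def safe_print(text: str, stream=None) -> None:
    """Print `text`, writing characters the stream's encoding cannot
    represent as backslash escapes (e.g. \\u2013) instead of raising."""
    stream = stream if stream is not None else sys.stdout
    # Fall back to UTF-8 if the stream does not report an encoding
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding; unencodable characters
    # are replaced by their escape sequences, which are plain ASCII
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    print(printable, file=stream)


# Demo: an in-memory stream standing in for a cp932 console.
# The file name is hypothetical and contains an en dash (U+2013).
raw = io.BytesIO()
console = io.TextIOWrapper(raw, encoding="cp932", newline="\n")
safe_print("[+] Processing file: a\u2013b.mp4", stream=console)
console.flush()
```

The trade-off of this design is that the en dash appears as the literal text `\u2013` in the log line, but the program continues and the original file name is left untouched.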

abdeladim-s commented 4 months ago

Yeah, another encoding issue. The PR looks perfect and has been merged. Thanks @FernandoCamporredondo for the contribution.