zackees / transcribe-anything

Input a local file or url and this service will transcribe it using Whisper AI. Completely private and Free 🤯🤯🤯
MIT License

insane not available #22

Closed Haxim closed 2 weeks ago

Haxim commented 2 weeks ago

I'm trying a fresh install in a virtualenv on Windows.

Running transcribe-anything sourcefile.mp3 --device insane fails with:

transcribe-anything: error: argument --device: invalid choice: 'insane' (choose from None, 'cpu', 'cuda')

Manually installing the following packages does let insane be selected as an option:

pip install --upgrade torch torchvision torchaudio openai-whisper insanely-fast-whisper datasets pytorch-lightning torchmetrics srtranslator numpy==1.26.4 --index-url https://download.pytorch.org/whl/cu121

However, it then crashes with an encoding error:

(venv) PS C:\transcribe> transcribe-anything .\sourcefile.mp3 --device insane
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\transcribe\venv\Scripts\whisper.exe\__main__.py", line 7, in <module>
  File "C:\transcribe\venv\lib\site-packages\whisper\transcribe.py", line 432, in cli
    args = parser.parse_args().__dict__
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 1825, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 1858, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 2067, in _parse_known_args
    start_index = consume_optional(start_index)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 2007, in consume_optional
    take_action(action, args, option_string)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 1099, in __call__
    parser.print_help()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 2555, in print_help
    self._print_message(self.format_help(), file)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\argparse.py", line 2561, in _print_message
    file.write(message)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u3002' in position 8556: character maps to <undefined>
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\transcribe\venv\Scripts\transcribe-anything.exe\__main__.py", line 7, in <module>
  File "C:\transcribe\venv\lib\site-packages\transcribe_anything\cmd.py", line 15, in main
    whisper_options = parse_whisper_options()
  File "C:\transcribe\venv\lib\site-packages\transcribe_anything\parse_whisper_options.py", line 27, in parse_whisper_options
    stdout = subprocess.check_output(
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'whisper --help' returned non-zero exit status 1.

As far as I can tell, this is some kind of Windows terminal encoding error. I've tried set PYTHONIOENCODING=utf8, but that seems to have no effect. I'm not sure where to go from here.
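
For anyone hitting the same traceback, here is a minimal sketch of the failure mode it appears to show: a child Python process whose stdout is encoded as cp1252 tries to print U+3002 and exits with a non-zero status, which is what makes subprocess.check_output raise CalledProcessError. The helper name run_child and the one-line child script are made up for illustration; they are not part of transcribe-anything or whisper. Forcing cp1252 through PYTHONIOENCODING is only a stand-in for the Windows default seen in the traceback.

```python
import os
import subprocess
import sys

# Hypothetical child script: prints the ideographic full stop (U+3002),
# the character named in the UnicodeEncodeError above.
CHILD = "print('\\u3002')"


def run_child(code: str, extra_env: dict) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter with extra environment variables."""
    env = dict(os.environ)
    env.update(extra_env)
    return subprocess.run([sys.executable, "-c", code], env=env, capture_output=True)


# Assumption: simulate the cp1252 stdout from the traceback by forcing it.
failing = run_child(CHILD, {"PYTHONIOENCODING": "cp1252"})

# Same child, but with UTF-8 stdout inherited through the environment.
working = run_child(CHILD, {"PYTHONIOENCODING": "utf-8"})

print("cp1252 exit status:", failing.returncode)
print("utf-8 exit status:", working.returncode)
```

The variable only helps if it is actually present in the environment the child inherits. The prompt in the log is PowerShell (PS C:\transcribe>), where set NAME=value is cmd.exe syntax and does not set an environment variable; the PowerShell form is $env:PYTHONIOENCODING = "utf8". That may be why the setting appeared to have no effect, though I can't confirm it from the log alone.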

Haxim commented 2 weeks ago

Upgrading Python to 3.11 seems to have done the trick.