CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6
GNU Affero General Public License v3.0

Transcription error when using the GPU (CUDA) #222

Open aagaguai opened 2 months ago

aagaguai commented 2 months ago

I am using version 0.8.1. During transcription the real-time log box never updates.

Below is the faster-whisper error log:

```
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
torchvision is not available - cannot save figures
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.

faster_whisper_GUI: 0.8.1
==========2024-09-17_17:23:29==========
==========Start==========

current computer language region-format: zh_CN language: zh

==========2024-09-17_17:24:54==========
==========LoadModel==========

-model_size_or_path: I:/CheshireCC-faster-whisper-large-v3-float32
-device: cuda
-device_index: 0
-compute_type: float32
-cpu_threads: 8
-num_workers: 1
-download_root: C:/Users/Administrator/.cache/huggingface/hub
-local_files_only: False
-use_v3_model: True

Load over I:/CheshireCC-faster-whisper-large-v3-float32
max_length: 448
num_samples_per_token: 320
time_precision: 0.02
tokens_per_second: 50
input_stride: 2

[Using V3 model, modify number of mel-filters to 128]

==========2024-09-17_17:25:30==========
==========Process==========

redirect std output
vad_filter : True
-threshold : 0.2
-min_speech_duration_ms : 250
-max_speech_duration_s : inf
-min_silence_duration_ms : 2000
-speech_pad_ms : 800
Transcribes options:
-audio : ['J:/vocals.wav']
-language : ja
-task : False
-beam_size : 1
-best_of : 5
-patience : 1.0
-length_penalty : 1.0
-temperature : [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
-compression_ratio_threshold : 1.4
-log_prob_threshold : -10.0
-no_speech_threshold : 0.9
-condition_on_previous_text : False
-initial_prompt : None
-prefix : None
-suppress_blank : True
-suppress_tokens : [-1]
-without_timestamps : False
-max_initial_timestamp : 1.0
-word_timestamps : True
-prepend_punctuations : "'“¿([{-
-append_punctuations : "'.。,,!!??::”)]}、
-repetition_penalty : 1.0
-no_repeat_ngram_size : 0
-prompt_reset_on_temperature : 0.5
-max_new_tokens : None
-chunk_length : 30.0
-clip_mode : 0
-clip_timestamps : 0
-hallucination_silence_threshold : 0.5
-hotwords :
-language_detection_threshold : None
-language_detection_segments : 1
create transcribe process with 1 workers
start transcribe process
Traceback (most recent call last):
  File "C:\RJ\FASTER~1\faster_whisper_GUI\transcribe.py", line 369, in run
  File "C:\RJ\FASTER~1\concurrent\futures\_base.py", line 621, in result_iterator
  File "C:\RJ\FASTER~1\concurrent\futures\_base.py", line 319, in _result_or_cancel
  File "C:\RJ\FASTER~1\concurrent\futures\_base.py", line 458, in result
  File "C:\RJ\FASTER~1\concurrent\futures\_base.py", line 403, in __get_result
  File "C:\RJ\FASTER~1\concurrent\futures\thread.py", line 58, in run
  File "C:\RJ\FASTER~1\faster_whisper_GUI\transcribe.py", line 279, in transcribe_file
  File "C:\RJ\FASTER~1\faster_whisper\transcribe.py", line 1189, in restore_speech_timestamps
  File "C:\RJ\FASTER~1\faster_whisper\transcribe.py", line 587, in generate_segments
  File "C:\RJ\FASTER~1\faster_whisper\transcribe.py", line 838, in encode
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
```
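The final `RuntimeError: ... cudaErrorNoKernelImageForDevice` typically means the CUDA kernels shipped with the bundled CTranslate2 build were not compiled for this GPU's compute capability (or the CUDA runtime and GPU do not match). The following is a minimal reproduction sketch run outside the GUI, in the same Python environment, to check whether the error also occurs with faster-whisper directly; the model path, audio path, and device/compute settings are taken from the log above, and everything else in the snippet is an assumption, not part of the original report.

```python
# Minimal reproduction sketch (assumption: run with the same Python
# environment / CTranslate2 build that faster-whisper-GUI uses).
import ctranslate2
from faster_whisper import WhisperModel

# Report what the bundled CTranslate2 build sees on this machine.
print("CUDA devices:", ctranslate2.get_cuda_device_count())
print("Supported compute types:", ctranslate2.get_supported_compute_types("cuda", 0))

# Same model and settings as in the LoadModel section of the log above.
model = WhisperModel(
    "I:/CheshireCC-faster-whisper-large-v3-float32",
    device="cuda",
    device_index=0,
    compute_type="float32",
    cpu_threads=8,
    num_workers=1,
)

# Same audio file and key transcribe options as in the Process section.
segments, info = model.transcribe(
    "J:/vocals.wav",
    language="ja",
    beam_size=1,
    vad_filter=True,
    word_timestamps=True,
)

# transcribe() returns a lazy generator; iterating it is what actually runs
# the CUDA encoder, so the cudaErrorNoKernelImageForDevice (if any) appears here.
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```

If this standalone script fails with the same error, the problem lies in the CTranslate2/CUDA build rather than in the GUI; trying a different `compute_type` (for example `float16` or `int8_float16`) is a quick way to see whether only the float32 kernels are affected.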

CheshireCC commented 2 months ago

Got it, I will test this right away.