CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6
GNU Affero General Public License v3.0
1.66k stars 103 forks

Processing is very slow and GPU usage is very low #87

Closed AiDreamerOoO closed 9 months ago

AiDreamerOoO commented 9 months ago

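I have a 4080. Processing a single video of about 2 minutes takes a very long time, twenty-odd minutes? GPU usage is very low; it feels like the GPU is not being used at all. In the model parameters I tried setting 4 or 128 and there seems to be no difference. The processing device is set to cuda.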
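For reference, the settings named in this report (device, compute type, and the thread/worker counts) have the same names as constructor arguments of `faster_whisper.WhisperModel`. The sketch below assumes the GUI passes them through unchanged, which this thread does not confirm. `build_model_kwargs` and `load_model` are hypothetical helpers, and `float16` is only a commonly used GPU setting, not a value taken from this thread. In faster-whisper, `cpu_threads` applies to CPU execution and `num_workers` matters when `transcribe()` is called from several Python threads, so changing them between 4 and 128 would not be expected to speed up a single file on the GPU.

```python
def build_model_kwargs(model_path: str, use_gpu: bool = True) -> dict:
    """Collect keyword arguments for faster_whisper.WhisperModel (hypothetical helper)."""
    return {
        "model_size_or_path": model_path,
        "device": "cuda" if use_gpu else "cpu",
        "device_index": 0,
        # Assumption: float16 on GPU, int8 on CPU; adjust to what the hardware supports.
        "compute_type": "float16" if use_gpu else "int8",
        # Only used when the model runs on the CPU.
        "cpu_threads": 4,
        # Only helps when transcribe() is called from multiple threads at once.
        "num_workers": 1,
    }


def load_model(model_path: str, use_gpu: bool = True):
    """Load the model; requires the faster-whisper package to be installed."""
    from faster_whisper import WhisperModel  # imported lazily on purpose

    return WhisperModel(**build_model_kwargs(model_path, use_gpu))
```

Calling `load_model` with the local model directory from the log and then watching GPU utilisation during `transcribe()` is one way to check whether the slowdown comes from the model settings or from something else.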

AiDreamerOoO commented 9 months ago

Contents of fasterwhispergui.log:

The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
torchvision is not available - cannot save figures
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.

faster_whisper_GUI: 0.5.7

==========2024-02-04_18:24:39========== ==========Start==========

language: zh

==========2024-02-04_18:24:49========== ==========LoadModel==========

-model_size_or_path: E:/fasterwhisper-gui/FasterWhisperGUI/model
-device: cuda
-device_index: 0
-compute_type: float32
-cpu_threads: 96
-num_workers: 96
-download_root: E:/fasterwhisper-gui/FasterWhisperGUI/model
-local_files_only: False
-use_v3_model: False

Load over E:/fasterwhisper-gui/FasterWhisperGUI/model
max_length: 448
num_samples_per_token: 320
time_precision: 0.02
tokens_per_second: 50
input_stride: 2

==========2024-02-04_18:25:01========== ==========LoadModel==========

-model_size_or_path: E:/fasterwhisper-gui/FasterWhisperGUI/model
-device: cuda
-device_index: 0
-compute_type: float32
-cpu_threads: 128
-num_workers: 128
-download_root: E:/fasterwhisper-gui/FasterWhisperGUI/model
-local_files_only: False
-use_v3_model: False

Load over E:/fasterwhisper-gui/FasterWhisperGUI/model
max_length: 448
num_samples_per_token: 320
time_precision: 0.02
tokens_per_second: 50
input_stride: 2

==========2024-02-04_18:26:25========== ==========Process==========

redirect std output

vad_filter : True
-threshold : 0.5
-min_speech_duration_ms : 50
-max_speech_duration_s : inf
-min_silence_duration_ms : 2000
-window_size_samples : 1024
-speech_pad_ms : 400

Transcribes options:
-audio : ['E:/英文短剧/测试2/001.mp4']
-language : zh
-task : False
-beam_size : 5
-best_of : 1
-patience : 1.0
-length_penalty : 1.0
-temperature : [0.0]
-compression_ratio_threshold : 2.4
-log_prob_threshold : -1.0
-no_speech_threshold : 0.5
-condition_on_previous_text : False
-initial_prompt : None
-prefix : None
-suppress_blank : True
-suppress_tokens : [-1]
-without_timestamps : False
-max_initial_timestamp : 10.0
-word_timestamps : False
-prepend_punctuations : "'“¿([{-
-append_punctuations : "'.。,,!!??::”)]}、
-repetition_penalty : 1.0
-no_repeat_ngram_size : 0
-prompt_reset_on_temperature : 0.5

create transcribe process with 128 workers
start transcribe process

Traceback (most recent call last):
  File "E:\fasterwhisper-gui\FasterWhisperGUI\faster_whisper_GUI\transcribe.py", line 354, in run
    for path, results in zip(files, results):
  File "E:\fasterwhisper-gui\FasterWhisperGUI\concurrent\futures_base.py", line 621, in result_iterator
  File "E:\fasterwhisper-gui\FasterWhisperGUI\concurrent\futures_base.py", line 319, in _result_or_cancel
  File "E:\fasterwhisper-gui\FasterWhisperGUI\concurrent\futures_base.py", line 458, in result
  File "E:\fasterwhisper-gui\FasterWhisperGUI\concurrent\futures_base.py", line 403, in __get_result
  File "E:\fasterwhisper-gui\FasterWhisperGUI\concurrent\futures\thread.py", line 58, in run
  File "E:\fasterwhisper-gui\FasterWhisperGUI\faster_whisper_GUI\transcribe.py", line 267, in transcribe_file
    for segment in segments:
  File "E:\fasterwhisper-gui\FasterWhisperGUI\faster_whisper\transcribe.py", line 941, in restore_speech_timestamps
  File "E:\fasterwhisper-gui\FasterWhisperGUI\faster_whisper\transcribe.py", line 445, in generate_segments
  File "E:\fasterwhisper-gui\FasterWhisperGUI\faster_whisper\transcribe.py", line 629, in encode
ValueError: Invalid input features shape: expected an input with shape (1, 128, 3000), but got an input with shape (1, 80, 3000) instead

AiDreamerOoO commented 9 months ago

Problem solved. I had not checked the "v3 model" option, and the model I am using is a v3 model.
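The ValueError in the log is consistent with this fix: Whisper large-v3 takes log-mel features with 128 mel bins, while earlier Whisper models take 80, so features prepared for a non-v3 model do not fit a v3 model. The following is a minimal sketch of that mismatch, not the GUI's or faster-whisper's actual code. `feature_shape` and `check_features` are hypothetical names, and the error text is modelled on the message in the log.

```python
N_FRAMES = 3000  # one 30-second window at 100 feature frames per second


def feature_shape(use_v3_model: bool) -> tuple:
    """Shape of the features produced for one window (hypothetical helper)."""
    n_mels = 128 if use_v3_model else 80
    return (1, n_mels, N_FRAMES)


def check_features(model_n_mels: int, shape: tuple) -> None:
    """Raise ValueError if the feature shape does not match what the model expects."""
    expected = (1, model_n_mels, N_FRAMES)
    if shape != expected:
        raise ValueError(
            f"Invalid input features shape: expected an input with shape "
            f"{expected}, but got an input with shape {shape} instead"
        )


# A v3 model (128 mel bins) accepts features prepared with the v3 option enabled.
check_features(128, feature_shape(use_v3_model=True))

# With the option unchecked, the same model receives 80-bin features and fails.
try:
    check_features(128, feature_shape(use_v3_model=False))
except ValueError as err:
    print(err)
```

With the option unchecked, the sketch prints an error of the same form as the one at the end of the log.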

CheshireCC commented 9 months ago

ok

AiDreamerOoO commented 9 months ago

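OK. A new problem has come up, though: a subtitle that should stay on screen until the 5-second mark disappears at around 4 seconds. This happens quite often. Is it something I have configured incorrectly, or is it simply a bug?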