CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6
GNU Affero General Public License v3.0

Transcription is very slow in 0.8.1 #227

Open syazyz opened 2 days ago

syazyz commented 2 days ago



None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
torchvision is not available - cannot save figures
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.

faster_whisper_GUI: 0.8.1
==========2024-09-19_18:15:41==========
==========Start==========

current computer language region-format: zh_CN language: zh

==========2024-09-19_18:16:57==========
==========LoadModel==========

-model_size_or_path: E:/Utilities/Huggingface_Model/models--Systran--faster-whisper-large-v3/snapshots/edaa852ec7e145841d8ffdb056a99866b5f0a478
-device: cuda
-device_index: 0
-compute_type: float32
-cpu_threads: 4
-num_workers: 1
-download_root: C:/Users/syazyz/.cache/huggingface/hub
-local_files_only: False
-use_v3_model: True

Load over E:/Utilities/Huggingface_Model/models--Systran--faster-whisper-large-v3/snapshots/edaa852ec7e145841d8ffdb056a99866b5f0a478
max_length: 448
num_samples_per_token: 320
time_precision: 0.02
tokens_per_second: 50
input_stride: 2

[Using V3 model, modify number of mel-filters to 128]

==========2024-09-19_18:18:05==========
==========Process==========

redirect std output
vad_filter : True
-threshold : 0.2
-min_speech_duration_ms : 250
-max_speech_duration_s : inf
-min_silence_duration_ms : 2000
-speech_pad_ms : 800
Transcribes options:
-audio : ['E:/VideoDownload/you-get/겜스트GAMEST/20240912 - 감스트 즉흥으로 열린 노래자랑, 과연 참가자 실력은? [24.9.11].mkv']
-language : None
-task : True
-beam_size : 1
-best_of : 5
-patience : 1.0
-length_penalty : 1.0
-temperature : [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
-compression_ratio_threshold : 1.4
-log_prob_threshold : -10.0
-no_speech_threshold : 0.9
-condition_on_previous_text : False
-initial_prompt : None
-prefix : None
-suppress_blank : True
-suppress_tokens : [-1]
-without_timestamps : False
-max_initial_timestamp : 1.0
-word_timestamps : True
-prepend_punctuations : "'“¿([{-
-append_punctuations : "'.。,,!!??::”)]}、
-repetition_penalty : 1.0
-no_repeat_ngram_size : 0
-prompt_reset_on_temperature : 0.5
-max_new_tokens : None
-chunk_length : 30.0
-clip_mode : 0
-clip_timestamps : 0
-hallucination_silence_threshold : 0.5
-hotwords :
-language_detection_threshold : None
-language_detection_segments : 1
create transcribe process with 1 workers
start transcribe process

CheshireCC commented 1 day ago

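Transcription speed depends partly on VRAM. When there is not enough VRAM, system memory is used as shared GPU memory for buffering, and that slows things down because system memory is much slower than VRAM. The fix is to switch the compute precision to 16-bit or 8-bit: the model then uses less VRAM, so the data can go straight into VRAM.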
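To make the precision trade-off concrete, here is a rough sketch in Python. It is not part of faster-whisper-GUI. The parameter count of about 1.55 billion for the large model, the `headroom` factor, and the helper names `weight_gib`, `pick_compute_type`, and `load_model` are all assumptions for illustration. Real VRAM use also includes activations and decoding buffers, so treat the numbers as lower bounds. The only library call used is `faster_whisper.WhisperModel` with its `device` and `compute_type` arguments.

```python
# Rough estimate of model weight size per compute type.
# ASSUMPTION: Whisper large has roughly 1.55 billion parameters.
LARGE_PARAMS = 1_550_000_000

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_gib(compute_type, n_params=LARGE_PARAMS):
    """Approximate size of the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[compute_type] / 1024**3


def pick_compute_type(free_vram_gib, headroom=1.5):
    """Pick the highest precision whose weights (times a hypothetical
    headroom factor for activations and buffers) fit in free VRAM.
    Falls back to int8 when nothing fits."""
    for compute_type in ("float32", "float16", "int8"):
        if weight_gib(compute_type) * headroom <= free_vram_gib:
            return compute_type
    return "int8"


def load_model(model_path, free_vram_gib):
    """Load a model on the GPU with the chosen precision.
    The import is deferred so the helpers above work without faster-whisper."""
    from faster_whisper import WhisperModel

    return WhisperModel(
        model_path,
        device="cuda",
        compute_type=pick_compute_type(free_vram_gib),
    )


if __name__ == "__main__":
    for ct in BYTES_PER_PARAM:
        print(f"{ct}: about {weight_gib(ct):.1f} GiB of weights")
```

With the log above, this corresponds to changing `-compute_type: float32` to `float16` or `int8` in the model load settings, so that the weights stay in dedicated VRAM instead of spilling into shared system memory.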

lowy-git commented 1 day ago
