m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset} #433

Closed. soberupkg closed this issue 1 year ago

soberupkg commented 1 year ago

Hello, the vad_options={"vad_onset": vad_onset, "vad_offset": vad_offset} parameters I am testing do not take effect. Please help check. This is how I load the model:

model = whisperx.load_model("/data/model/" + model_path, device, compute_type=compute_type, asr_options={"condition_on_previous_text": True, "initial_prompt": "hi"}, vad_options={"vad_onset": 0.001, "vad_offset": 0.001})

[image] The VAD is the cause here: no voice is detected. Although I set {"vad_onset": 0.001, "vad_offset": 0.001}, it does not take effect. You could move {"vad_onset": 0.001, "vad_offset": 0.001} into the transcribe call instead: model.transcribe(audio, batch_size=batch_size, language='es', vad_options={"vad_onset": 0.001, "vad_offset": 0.001}). Of course, this is just a suggestion; you can modify it according to your project plan.

crisprin17 commented 1 year ago

@soberupkg Not sure I understand your message, but I am also not seeing a difference when I change these parameters. Did you figure out why?
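
For reference, here is a minimal sketch of what an onset/offset pair normally controls, assuming these options act as two-threshold (hysteresis) binarization over per-frame speech probabilities. That is an assumption about the VAD step, not code taken from whisperX: the function name `binarize`, the sample scores, and the "strict" threshold values are all hypothetical and only illustrate the difference one would expect to see if the options were applied.

```python
from typing import List, Tuple


def binarize(scores: List[float], onset: float, offset: float) -> List[Tuple[int, int]]:
    """Hypothetical helper: turn per-frame speech probabilities into
    (start_frame, end_frame) segments using two thresholds.

    A segment opens when a score rises above `onset` and closes when a
    score falls below `offset`.
    """
    segments: List[Tuple[int, int]] = []
    active = False
    start = 0
    for i, p in enumerate(scores):
        if not active and p > onset:
            # Score crossed the onset threshold: speech starts here.
            active, start = True, i
        elif active and p < offset:
            # Score dropped below the offset threshold: speech ends here.
            segments.append((start, i))
            active = False
    if active:
        # Audio ended while speech was still active.
        segments.append((start, len(scores)))
    return segments


# Made-up scores for quiet audio: nothing comes close to 0.5.
scores = [0.01, 0.05, 0.2, 0.3, 0.1, 0.02, 0.0005, 0.0]

strict = binarize(scores, onset=0.5, offset=0.363)        # illustrative strict thresholds
permissive = binarize(scores, onset=0.001, offset=0.001)  # values from this issue

print("strict:", strict)
print("permissive:", permissive)
```

With the strict thresholds no frame opens a segment, while the values from this issue mark almost all of the audio as speech. If both settings give the same "no voice detected" result on the same file, that suggests the options are not reaching the VAD step at all.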