Closed by dancinkid6 2 weeks ago
@dancinkid6, hello. You should compare the audio value in here, when writing data to a file, with the audio_data
in your example above. BTW, could you show the full code to reproduce this problem?
I fixed it by going in and reading the VAD code. It ended up that I just needed to flatten the audio_data.
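For anyone hitting the same error, here is a minimal sketch of what the flattening step could look like. It is not the exact code from this thread. It assumes the recording comes back from sounddevice shaped (frames, channels), which is what `sd.rec` returns, and that faster-whisper wants a 1-D float32 array sampled at 16 kHz. The helper name `to_whisper_input` is made up for illustration, and the zero-filled array only stands in for a real recording. A likely reason for the error is that the VAD reads the last axis as the number of samples, so a (frames, 1) array looks like chunks of length 1.

```python
import numpy as np


def to_whisper_input(audio_data: np.ndarray) -> np.ndarray:
    """Hypothetical helper: turn a (frames, channels) recording into 1-D float32."""
    audio = np.asarray(audio_data)
    if audio.ndim == 2:
        # Average the channels. For a mono (frames, 1) recording this gives
        # the same values as audio_data.flatten().
        audio = audio.mean(axis=1)
    # Assumption: the model expects float32 samples, so cast fp16 input up.
    return np.ascontiguousarray(audio, dtype=np.float32)


# Stand-in for one second of mono audio at 16 kHz, shaped like sd.rec output.
recorded = np.zeros((16000, 1), dtype=np.float32)
samples = to_whisper_input(recorded)
print(recorded.shape, "->", samples.shape)

# Usage with faster-whisper (not run here), assuming `model` is a WhisperModel:
# segments, info = model.transcribe(samples, vad_filter=True)
```

For a mono stream, `audio_data.flatten()` alone is enough. The channel average is only there so the same helper does not interleave samples when more than one channel is recorded.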
I am trying to transcribe real-time audio with sounddevice and faster-whisper.
Instead of saving it to a temp file, I want to pass the recorded numpy array directly to the model, but somehow I just cannot get it to work.
Passing the array directly just raises "ValueError: Input audio chunk is too short."
The audio chunks themselves are fine, and it even works very well if I first write them to a file using soundfile and then transcribe that file with faster-whisper. But when I pass the array directly to the model, it just breaks. I have confirmed the audio data is fp16. I read that it might be a VAD filter problem, so I played around with the VAD parameters, but it never worked.
It has been bothering me for the past 2 days. Can anyone help? Thanks.