Closed furqan4545 closed 1 year ago
ts_num was used in version 1.0 of Stable-ts to improve the timing, but it is experimental in the current version 2.0. The higher the value, the more memory it uses. For best results, generally avoid that argument.
Thank you so much for the response.
result2 = model.transcribe('tate_pier.mp3', mel_first=True, demucs=True)
result2 = model.transcribe('tate_pier.mp3', mel_first=True)
I tried the different settings shown above, both with and without VAD. The accuracy is not as good as the original Whisper. It sometimes misses a lot of words, even though I am using the large-v2 model. Could you please tell me if there is a specific parameter I can use so that it doesn't miss words? Your help will be highly appreciated.
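To make "missing words" easier to report, here is a minimal sketch that lists the words found in one transcript but not in another. It uses only the Python standard library. The helper name missing_words and the two sample strings are hypothetical; they stand in for the transcript text produced by the original Whisper and by Stable-ts on the same audio.

```python
import difflib


def missing_words(reference: str, candidate: str) -> list[str]:
    """Return words that appear in `reference` but are dropped or changed in `candidate`."""
    ref_words = reference.lower().split()
    cand_words = candidate.lower().split()
    # autojunk=False keeps frequent words (e.g. "the") from being treated as noise.
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # 'delete': words only in the reference; 'replace': words that differ.
        if tag in ("delete", "replace"):
            missing.extend(ref_words[i1:i2])
    return missing


# Hypothetical transcripts, for illustration only.
whisper_text = "the quick brown fox jumps over the lazy dog"
stable_ts_text = "the quick fox jumps over lazy dog"
print(missing_words(whisper_text, stable_ts_text))
```

Posting a list like this together with the exact transcribe() arguments makes it easier to see whether the words are dropped at the start, at the end, or around silences.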
I am using the large-v2 model, and when I set the parameters shown in the picture, a CUDA out-of-memory error is thrown. Is there a memory leak, or is something else wrong?
Also, the transcription is not as accurate as the original Whisper model.