Open thatnerdyaigirl opened 2 months ago
It uses DeepSpeech to convert the audio into mapped features and their probabilities on the CPU. You can get a 40% decrease in audio conversion time by using the GPU, but I didn't figure out a way to use the GPU without slowing down frame inference, since the two perform better on different CUDA or torch versions (I can't remember which). When I did get the audio running on the GPU, frame inference was super slow.
Hi, I am running Gradio on an RTX 4090. For a 30-second audio clip with the same video, it waits a long time on every run, even though only the audio is different. How can I get faster results for the same video?
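One possible direction, sketched under assumptions: if some of the work depends only on the video (and not on the audio), its result could be cached by file content so that a repeated video skips that step. Nothing in this thread confirms that the project has such a video-only stage, so `cached_by_content` and the `compute` callback below are hypothetical names for illustration, not part of the project's API.

```python
import hashlib
from typing import Any, Callable, Dict

# In-memory cache: content hash of the input file -> computed result.
_cache: Dict[str, Any] = {}


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_by_content(path: str, compute: Callable[[str], Any]) -> Any:
    """Run compute(path) once per distinct file content and reuse the result.

    `compute` is a hypothetical stand-in for whatever video-only
    preprocessing the pipeline performs (an assumption, not a known API).
    """
    key = file_digest(path)
    if key not in _cache:
        _cache[key] = compute(path)
    return _cache[key]
```

With this in place, calling `cached_by_content(video_path, preprocess_video)` twice on the same file would run the hypothetical `preprocess_video` only once. The audio features still have to be recomputed for each new audio file, so this would not remove the DeepSpeech cost described above.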