Closed awsomecod closed 2 years ago
Transcribing a single audio file will probably be done in a second or so. Models don't take much memory during inference.
This audio file is 50 MB and 21 minutes long.
I suspect the inference is running on the CPU instead of the GPU even though I call cuda() in my code. The inference takes about 70 seconds.
I am running on Colab and I have installed Apex, but I receive a warning saying that the Kaldi root is not found. Is this the source of the issue?
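One way to rule out silent CPU execution is to check where the model's parameters actually live after calling cuda(). A minimal sketch, using a tiny stand-in model rather than the actual ASR network:

```python
import torch

# Hypothetical tiny model standing in for the ASR network; the point is
# the device check, not the architecture.
model = torch.nn.Linear(4, 2)

# Common pitfall with tensors: .cuda() returns a new tensor, so the result
# must be reassigned:
#   x.cuda()        # result is discarded, x stays on CPU
#   x = x.cuda()    # correct
if torch.cuda.is_available():
    model = model.cuda()

# Verify where the parameters actually live before blaming the GPU.
device = next(model.parameters()).device
print(f"model is on: {device}")
```

If this prints `cpu` on a GPU runtime, the model never made it onto the device in the first place.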
That's not used. As far as I know, Colab can't monitor nvidia-smi in parallel while code is running. Even then, you would need watch -n 1 nvidia-smi to make it work.
I used to keep typing nvidia-smi over and over, but using watch -n 1 nvidia-smi is definitely more convenient.
I want to know how much GPU RAM I need to run inference on the GPU for a single 50 MB audio file. Can you let me know? I don't know any other command besides nvidia-smi to run on Colab.
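You don't strictly need nvidia-smi: PyTorch can report its own GPU memory use from inside the notebook. A small sketch (the matrix multiply is just a stand-in for the transcription call):

```python
import torch

def peak_gpu_memory_mb():
    """Peak GPU memory allocated by this process, in MB (0 on CPU-only)."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.max_memory_allocated() / 1024**2

# Reset the stats, run the workload, then read the peak.
if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()

x = torch.randn(1000, 1000)
if torch.cuda.is_available():
    x = x.cuda()
y = x @ x  # stand-in for the actual transcription call

print(f"peak GPU memory: {peak_gpu_memory_mb():.1f} MB")
```

Note this only counts memory allocated through PyTorch's caching allocator, not the CUDA context overhead that nvidia-smi also shows.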
It depends on too many factors, like the length of the audio in seconds and fp16 vs. fp32. You'll need to find out some other way. QuartzNet with AMP and 32 GB can easily handle 2-hour-long audio clips.
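To illustrate why fp16 vs. fp32 matters for memory: casting a model to half precision halves the bytes needed for its weights. A toy demonstration with a stand-in layer (not the actual QuartzNet):

```python
import torch

model = torch.nn.Linear(1024, 1024)  # stand-in for the acoustic model

def param_bytes(m):
    """Total bytes occupied by a module's parameters."""
    return sum(p.numel() * p.element_size() for p in m.parameters())

fp32_bytes = param_bytes(model)
model = model.half()  # cast weights to fp16
fp16_bytes = param_bytes(model)

print(f"fp32: {fp32_bytes} bytes, fp16: {fp16_bytes} bytes")
```

Activation memory during inference scales the same way, which is why AMP/fp16 roughly doubles the audio length you can fit on a given GPU.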
I run the following commands:

But GPU utilization is zero, as can be seen from the output of nvidia-smi:
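One common cause of zero utilization is that the model or the input tensor stays on the CPU. A minimal sketch, assuming a toy model in place of the real one, that keeps both on the same device and times a forward pass (note that CUDA kernels launch asynchronously, so a synchronize is needed before reading the clock):

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Both the model and the input must be moved; a model on GPU fed a CPU
# tensor raises a device-mismatch error rather than silently using the CPU.
model = torch.nn.Linear(512, 512).to(device)
x = torch.randn(64, 512).to(device)

with torch.no_grad():
    start = time.perf_counter()
    out = model(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the async CUDA kernels to finish
    elapsed = time.perf_counter() - start

print(f"forward pass on {device}: {elapsed * 1000:.2f} ms")
```

If the timing and device both check out here but nvidia-smi still shows 0%, the sampling moment may simply have missed the short burst of GPU activity.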