Closed rohithkodali closed 4 years ago
Yes, that's why it runs on CPU by default. There needs to be a function for batching the data for inference, and it isn't there yet.
Okay, thanks for the info. When you mentioned in the README that you had tried it on a 1080 GPU, I thought it was supported.
It is for the other demos, but not this one, because there is a lot of data to process at once. It is technically possible to run it on a GPU, but I didn't write the code for it.
I was trying to run the diarization script on 50 minutes of audio and it consumes ~15 GB of RAM on my laptop. Is there any way to reduce the RAM usage, or any alternative to avoid OOM errors for long audio files?
Yes, you should batch your data. Proceed in chunks of, say, 30 seconds: compute the speaker embeddings for each chunk and only retain an array indicating which speaker is currently talking. Discard the speaker embeddings and repeat for the next batch. You can do this on a GPU for a great speed gain.
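A minimal sketch of that chunked approach (the names `diarize_in_chunks` and `embed_fn` are illustrative, not part of Resemblyzer; with Resemblyzer, `embed_fn` could wrap `encoder.embed_utterance(chunk, return_partials=True, rate=16)` and return the partial embeddings):

```python
import numpy as np

def diarize_in_chunks(wav, embed_fn, speaker_embeds, sr=16000, chunk_s=30):
    """Process a long waveform chunk by chunk to bound memory use.

    embed_fn(chunk)  -> (n_partials, d) array of partial embeddings for one chunk
    speaker_embeds   -> (n_speakers, d) reference embeddings (L2-normalized)
    Returns a 1-D array of speaker indices, one per partial embedding.
    """
    labels = []
    step = chunk_s * sr
    for start in range(0, len(wav), step):
        chunk = wav[start:start + step]
        partial_embeds = embed_fn(chunk)          # embed only this chunk
        sims = partial_embeds @ speaker_embeds.T  # cosine similarity (unit vectors)
        labels.append(sims.argmax(axis=1))        # keep only "who is talking" labels
        del partial_embeds                        # embeddings can now be discarded
    return np.concatenate(labels)
```

Only the small label array grows with audio length; the embeddings for each chunk are freed before the next one is processed, so peak memory stays roughly constant regardless of the recording's duration.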
Can you let me know how to batch the data?
I have tried to load the model on a GTX 1080 GPU and run it, but it asks for a whole lot of memory. This is the error it throws:
Traceback (most recent call last):
  File "/home/server/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in <module>
    runfile('/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer/demo02_diarization.py', wdir='/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer')
  File "/home/server/pycharm-community-2019.2.4/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/server/pycharm-community-2019.2.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer/demo02_diarization.py", line 64, in <module>
    run()
  File "/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer/demo02_diarization.py", line 46, in run
    _, cont_embeds, wav_splits = encoder.embed_utterance(wav, return_partials=True, rate=16)
  File "/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer/resemblyzer/voice_encoder.py", line 152, in embed_utterance
    partial_embeds = self(mels).cpu().numpy()
  File "/home/server/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/server/b92be869-bd56-4ed4-9306-12a754f7065f/diarization-package/Resemblyzer/resemblyzer/voice_encoder.py", line 57, in forward
    _, (hidden, _) = self.lstm(mels)
  File "/home/server/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/server/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 564, in forward
    return self.forward_tensor(input, hx)
  File "/home/server/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 543, in forward_tensor
    output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
  File "/home/server/anaconda3/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 526, in forward_impl
    self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: CUDA out of memory. Tried to allocate 27.50 GiB (GPU 0; 7.93 GiB total capacity; 4.17 GiB already allocated; 3.24 GiB free; 22.08 MiB cached)