aryamansriram closed this issue 3 years ago
This is discussed in #1486. There is no built-in way to do it; instead, you need to write your own code to convert the mic input into a tensor (numpy.frombuffer might be helpful). You then pass that tensor directly as input to your saved NeMo model. You won't be able to use the transcribe function and will have to decode manually, as is done in validate in the asr_tutorial. Don't forget to put your model into eval mode (just quartznet.eval() if using QuartzNet): the asr_tutorial has a bug and doesn't do this, which causes training augmentations to be applied during inference.
I'm happy to try to help and answer any more questions if you have them. Good luck.
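For anyone landing here later, the two manual steps described above (raw mic bytes to a float array, then decoding the model output by hand) could look roughly like the sketch below. This is not NeMo code and not the tutorial's implementation. It assumes the mic delivers 16-bit little-endian mono PCM, and that the CTC blank symbol is the index right after the last label. The helper names pcm16_to_float32 and greedy_ctc_decode are made up for illustration. The model call in the trailing comments is an assumption about how a NeMo CTC model is invoked, so check it against your NeMo version.

```python
import numpy as np


def pcm16_to_float32(raw: bytes) -> np.ndarray:
    """Convert raw 16-bit little-endian PCM bytes to float32 samples in [-1, 1).

    Assumption: the mic stream is mono, 16-bit, little-endian.
    """
    samples = np.frombuffer(raw, dtype="<i2")
    # astype() copies, so the result is writable even though frombuffer's view is not.
    return samples.astype(np.float32) / 32768.0


def greedy_ctc_decode(ids, labels, blank_id=None) -> str:
    """Greedy CTC decoding: collapse repeated ids, then drop blanks.

    Assumption: if blank_id is not given, the blank is the index after the last label.
    """
    if blank_id is None:
        blank_id = len(labels)
    chars = []
    prev = blank_id
    for i in ids:
        i = int(i)
        if i != prev and i != blank_id:
            chars.append(labels[i])
        prev = i
    return "".join(chars)


# Hypothetical wiring, not executed here. It assumes torch is installed, `quartznet`
# is a loaded NeMo CTC model, `chunk` holds raw mic bytes, and `labels` is the
# model's vocabulary list. The forward signature and return values are assumptions.
#
#   quartznet.eval()  # avoid training-time augmentation at inference
#   signal = torch.from_numpy(pcm16_to_float32(chunk)).unsqueeze(0)  # shape (1, T)
#   length = torch.tensor([signal.shape[1]])
#   log_probs, _, _ = quartznet(input_signal=signal, input_signal_length=length)
#   text = greedy_ctc_decode(log_probs[0].argmax(dim=-1).tolist(), labels)
```

The audio also has to match the sample rate the model was trained on, so resample first if the mic records at a different rate.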
Thanks @rbracco for the reply. Closing this issue now; I'll try to implement it myself.
Could you elaborate a little more, and share your notebook if you have already implemented this?
Hi. Have you implemented the conversion of mic input audio into a tensor? I need it very much. Can we discuss it?
Please do not reply to long-closed threads; open a new one.
Also, we have not yet implemented audio from mic.
Describe your question
I want to transcribe audio input from a microphone without saving it as an audio file
Environment overview (please complete the following information)
Environment details
OS Version: Ubuntu 18.04
Python version: 3.7
Pytorch Version: 1.7.1