Open Mayonezyck opened 4 weeks ago
I'm not entirely sure. Maybe you can check out this issue for potential solutions. https://github.com/microsoft/onnxruntime/issues/18973
The problem happens in the inference stage of the speech recognition module after voice activation detection. Which voice recognition are you using? Are you using faster-whisper?
If you are using faster-whisper, maybe check their documentation to see if anything is missing. I should probably add this to my documentation...
I'm working on dockerizing this program with Nvidia GPU passthrough, which may potentially solve your problem.
Cool, great to hear! Thanks for your reply! I have been stepping through the code to see where it gets stuck. Yes, I'm using faster-whisper, the default. Apparently it gets stuck in this function:

    def transcribe_np(self, audio: np.ndarray) -> str:
        segments, info = self.model.transcribe(
            audio,
            beam_size=5 if self.BEAM_SEARCH else 1,
            language=self.LANG,
            condition_on_previous_text=False,
        )
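One way to narrow this down is to run the same kind of call on the CPU: if that finishes, the hang is more likely in the CUDA/cuDNN setup than in the audio or the model. Below is a minimal sketch, not the project's code. `collect_text` and `transcribe_on_cpu` are hypothetical helpers, and the `"tiny"` model size and `int8` compute type are assumptions chosen to keep the test fast. Note that faster-whisper returns the segments lazily, so the transcription work only runs while they are being iterated.

```python
def collect_text(segments):
    """Consume the lazy segment iterator and join the recognized text."""
    return "".join(seg.text for seg in segments).strip()


def transcribe_on_cpu(audio, model_size="tiny", language=None):
    """Transcribe 16 kHz mono float32 audio on the CPU only.

    faster_whisper is imported lazily so this file can be loaded
    even on a machine where the package is not installed.
    """
    from faster_whisper import WhisperModel

    # device="cpu" avoids loading any CUDA/cuDNN library at all.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(
        audio,
        beam_size=1,
        language=language,
        condition_on_previous_text=False,
    )
    # The actual decoding happens here, while the iterator is consumed.
    return collect_text(segments)
```

If the CPU run returns text and the GPU run still hangs or crashes, that points at the GPU libraries rather than at `transcribe_np` itself.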
It's probably because I need to install the specific CUDA and cuDNN versions for my 2060... Could you please share your CUDA and cuDNN versions?
Well... I'm using an Apple Silicon Mac, so I don't use CUDA. I haven't actually tried running this project on an Nvidia machine yet.
I created the Dockerfile and added some docs to the README for the Nvidia GPU passthrough container. It uses cuda:11.2.2-cudnn8. However, I haven't had the chance to test it. If you feel stuck fixing CUDA issues, maybe you can take some inspiration from it, or just help me test the Nvidia container, which still has a lot of issues, though a different set of issues, I guess... By the way, let me know when your issue is resolved.
I will let you know how that goes! Meanwhile, I'm going to test it on my M1 laptop. I will try out docker. Will let you know!
Update! This may not be helpful. I haven't had a chance to try the Docker image yet, but your repo works on my 4090 setup on an Ubuntu 20.04 system where CUDA and cuDNN are correctly set up. So it's a user error on my Windows computer!
Note that there are both cuDNN 8 and cuDNN 9. The commands to install onnxruntime for CUDA 11 and CUDA 12 are different. See the following for details: https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements
onnxruntime-gpu for CUDA 11 needs cuDNN 8; you will need pip install nvidia-cudnn-cu11==8.9.7.29
onnxruntime-gpu 1.18.1 for CUDA 12 needs cuDNN 9; older versions use cuDNN 8. For cuDNN 9, you can install it with pip install nvidia-cudnn-cu12==9.2.1.18
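To keep the two pins from getting mixed up, the pairing described above can be written down as a small lookup. This is only a sketch: the two pins are copied from this comment, `cudnn_requirement` and `pip_command` are hypothetical helper names, and the table deliberately does not cover the older onnxruntime-gpu builds for CUDA 12 that still use cuDNN 8.

```python
# Pins copied from the comment above; nothing else is covered here.
CUDNN_PINS = {
    11: "nvidia-cudnn-cu11==8.9.7.29",  # CUDA 11 pairs with cuDNN 8
    12: "nvidia-cudnn-cu12==9.2.1.18",  # CUDA 12 with a cuDNN 9 build
}


def cudnn_requirement(cuda_major: int) -> str:
    """Return the pinned cuDNN wheel for a CUDA major version."""
    try:
        return CUDNN_PINS[cuda_major]
    except KeyError:
        raise ValueError(f"no cuDNN pin recorded for CUDA {cuda_major}") from None


def pip_command(cuda_major: int) -> str:
    """Build the pip command line for the pinned cuDNN wheel."""
    return f"pip install {cudnn_requirement(cuda_major)}"


if __name__ == "__main__":
    for major in sorted(CUDNN_PINS):
        print(f"CUDA {major}: {pip_command(major)}")
```

Pinning the version matters because an unpinned install picks up whatever cuDNN release is current for that wheel, which may not be the major version your runtime was built against.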
2024-07-06 20:14:52.028 | INFO | asr.asr_with_vad:_process_detected_audio:222 - Detected pause after speech. Processing...
2024-07-06 20:14:52.028 | INFO | asr.asr_with_vad:_process_detected_audio:224 - Stopping listening...
Could not locate cudnn_cnn_infer64_8.dll. Please make sure it is in your library path!
Above is the error output.
Platform: Windows 10
Graphics card: NVIDIA 2060
cuDNN installed using:

    py -m pip install nvidia-cudnn-cu12
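The error names a specific file, cudnn_cnn_infer64_8.dll, and the `_8` suffix suggests it belongs to cuDNN 8. A quick check is to scan the directories on PATH for that file. The sketch below uses only the standard library. `find_library` is a hypothetical helper, not part of this project, and treating PATH as the only search location is a simplification of how Windows resolves DLLs.

```python
import os
from pathlib import Path


def find_library(filename, search_dirs=None):
    """Return every path where `filename` exists in the given directories.

    When search_dirs is None, the directories listed in PATH are used.
    """
    if search_dirs is None:
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    hits = []
    for directory in search_dirs:
        if not directory:
            continue  # skip empty PATH entries
        candidate = Path(directory) / filename
        if candidate.is_file():
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    name = "cudnn_cnn_infer64_8.dll"  # file name taken from the error above
    found = find_library(name)
    if found:
        for path in found:
            print(f"found: {path}")
    else:
        print(f"{name} is not in any PATH directory")
```

If nothing is found, my guess is that the pip-installed cuDNN is either a different major version or sits in a directory (possibly under site-packages) that is not on PATH. I have not verified this on a Windows machine.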