Open lightsailpro opened 6 years ago
No, I don't believe Gentle supports GPU-based decoding. Check Kaldi docs for more information.
On Fri, Feb 2, 2018 at 11:01 AM, lightsailpro notifications@github.com wrote:
I have a K40c GPU configured. In the ext/install_kaldi.sh file, I changed use_cuda=yes, ran install_kaldi.sh, then restarted the Gentle server. But the GPU is still not being used during the transcription step. Does Gentle support the GPU in the speech recognition step? Thanks.
./configure --static --static-math=yes --static-fst=yes --use-cuda=yes
Forgot to run make after the change in install_kaldi.sh. ./configure --static --static-math=yes --static-fst=yes --use-cuda=yes --cudatk-dir=/usr/local/cuda
but "make" generated the following error:
/home/ml/gentle/ext/kaldi/src/cudamatrix/../cudamatrix/cu-array-inl.h:141: undefined reference to `cudaMemcpy'
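An "undefined reference to `cudaMemcpy'" at link time usually means the CUDA runtime library (libcudart) is not on the link line. A hedged troubleshooting sketch (the paths are assumptions; substitute whatever --cudatk-dir points at):

```shell
# Check that the CUDA runtime library exists where --cudatk-dir points:
ls /usr/local/cuda/lib64/libcudart* 2>/dev/null || echo "libcudart not found"

# If it exists, the static k3 link step needs something along the lines of
# (illustrative, not Gentle's actual Makefile variables):
#   LDLIBS="-L/usr/local/cuda/lib64 -lcudart"
```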
I started to make a new issue, but thought I'd just revive the conversation here.
@strob not sure if you're aware of it, but the Kaldi team is currently working on a GPU decoder. There's a paper and a WIP PR with lots of ongoing conversation. Not entirely sure what the timetable is for this to be merged, but I've been watching pretty closely as I have a need for ultra-fast alignment. We're currently using gentle in our project, and I'm open to working on a PR here once this stuff is merged on the Kaldi side. Just wanted to get your thoughts, first, and see if this was on your radar.
Does anyone know the performance improvement with this? https://medium.com/voicetube/build-gentle-w-cuda-enabled-kaldi-cb9eac86afc3
I don’t believe there are significant performance improvements decode-side but would love to see a comparison!
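One way to get that comparison empirically is to time the same audio/transcript pair through a CPU-only and a CUDA-enabled build. A minimal sketch; the binary paths and arguments below are placeholders, not Gentle's real CLI:

```python
import subprocess
import time

def time_command(cmd):
    """Run a command to completion and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# Hypothetical comparison (substitute your own builds and inputs):
# cpu_s = time_command(["./ext/k3-cpu", "model/", "audio.wav"])
# gpu_s = time_command(["./ext/k3-cuda", "model/", "audio.wav"])
# print(f"speedup: {cpu_s / gpu_s:.2f}x")
```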
On Sep 3, 2019, at 12:17 AM, Steve Rogers notifications@github.com wrote:
Does anyone know performance improvement with this? https://medium.com/voicetube/build-gentle-w-cuda-enabled-kaldi-cb9eac86afc3
I have done all of this, but it seems the GPU is still not being utilized.
I have successfully compiled the Gentle k3 binary against the Kaldi master branch with CUDA. Unfortunately, I agree with oerdem19 that the GPU is not actually being utilised by k3. The GPU VRAM is being used, so some kind of payload is going to the GPU, but no actual calculations are happening on the GPU.
Looking at the k3.cc source, it makes sense that nothing is happening because the CuDevice object is never used:
CuDevice &cu_device = CuDevice::Instantiate();
cu_device.SetVerbose(true);
cu_device.SelectGpuId("yes");
However, if you look at the rest of the k3 code, that cu_device object is never used anywhere, which I think is the problem. It should be passed to some decoding function, but it isn't.
Has anyone discovered any way of potentially getting the decoding to happen on GPU?
May I ask how you compiled k3.cc with CUDA enabled? I'm trying to use Kaldi 5.5 with CUDA 11; the process seems fine, but at the end the alignment doesn't work. I'm following all the Gentle installation instructions, basically the install.sh script.
In order to enable CUDA, I'm using this command to configure kaldi.mk in install_kaldi.sh: ./configure --static --static-math=yes --static-fst=yes --cudatk-dir=/usr/local/cuda-11.0 --openblas-root=../tools/OpenBLAS/install