lowerquality / gentle

gentle forced aligner
https://lowerquality.com/gentle/
MIT License

How to enable CUDA on Kaldi in transcribing step #148

Open lightsailpro opened 6 years ago

lightsailpro commented 6 years ago

I have a K40c GPU configured. In the ext/install_kaldi.sh file I set use_cuda=yes, ran install_kaldi.sh, then restarted the Gentle server, but the GPU is still not being used during the transcription step. Does Gentle support the GPU in the speech recognition step? Thanks.

./configure --static --static-math=yes --static-fst=yes --use-cuda=yes
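
A quick way to confirm that the configure step picked up CUDA is to inspect the generated kaldi.mk before rebuilding. The sketch below assumes that Kaldi's configure writes a `CUDA = true` line and a `CUDATKDIR = ...` line into kaldi.mk when CUDA is enabled; the helper names and the example path are hypothetical. It only checks the build configuration and says nothing about whether Gentle actually decodes on the GPU.

```python
import re
from pathlib import Path

# Assumption: a CUDA-enabled Kaldi configure run emits "CUDA = true" and
# "CUDATKDIR = <path>" into kaldi.mk. Only plain "=" and ":=" are matched,
# so "+=" lines such as CXXFLAGS additions are ignored.
_ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:?=\s*(.*?)\s*$")


def parse_make_vars(text):
    """Return the simple NAME = value assignments found in makefile text."""
    variables = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out assignments
        match = _ASSIGN.match(line)
        if match:
            variables[match.group(1)] = match.group(2)
    return variables


def cuda_enabled(text):
    """True if the makefile text sets CUDA to true."""
    return parse_make_vars(text).get("CUDA", "").lower() == "true"


def check_kaldi_mk(path):
    """Read a kaldi.mk file and report (cuda_enabled, CUDATKDIR or None)."""
    text = Path(path).read_text()
    return cuda_enabled(text), parse_make_vars(text).get("CUDATKDIR")
```

For example, `check_kaldi_mk("ext/kaldi/src/kaldi.mk")` (the path is an assumption based on the error message later in this thread) would return whether CUDA is enabled and which toolkit directory was recorded.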

strob commented 6 years ago

No, I don't believe Gentle supports GPU-based decoding. Check Kaldi docs for more information.

lightsailpro commented 6 years ago

I forgot to run make after the change in install_kaldi.sh.

./configure --static --static-math=yes --static-fst=yes --use-cuda=yes --cudatk-dir=/usr/local/cuda

But "make" generated the following error:

/home/ml/gentle/ext/kaldi/src/cudamatrix/../cudamatrix/cu-array-inl.h:141: undefined reference to `cudaMemcpy'

adamdottv commented 6 years ago

I started to make a new issue, but thought I'd just revive the conversation here.

@strob not sure if you're aware of it, but the Kaldi team is currently working on a GPU decoder. There's a paper and a WIP PR with lots of ongoing conversation. Not entirely sure what the timetable is for this to be merged, but I've been watching pretty closely as I have a need for ultra-fast alignment. We're currently using gentle in our project, and I'm open to working on a PR here once this stuff is merged on the Kaldi side. Just wanted to get your thoughts, first, and see if this was on your radar.

dpny518 commented 5 years ago

Does anyone know how much of a performance improvement this gives? https://medium.com/voicetube/build-gentle-w-cuda-enabled-kaldi-cb9eac86afc3

strob commented 5 years ago

I don’t believe there are significant performance improvements decode-side but would love to see a comparison!

oerdem19 commented 4 years ago

I have done all of this, but it seems that the GPU is still not being utilized.

trias702 commented 3 years ago

I have successfully compiled the Gentle k3 binary against Kaldi master branch with CUDA. Unfortunately, I agree with oerdem19 that the GPU is not actually being utilised by k3. The GPU VRAM is being used, so some kind of payload is going to the GPU, but no actual calculations are happening on the GPU.
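
One way to observe this "memory allocated but no compute" pattern is to sample `nvidia-smi` while an alignment is running. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and reports numeric values for both fields; the function names and the 5% threshold are illustrative choices, not anything taken from Gentle or Kaldi.

```python
import subprocess

# nvidia-smi query that prints one "utilization, memory" row per GPU,
# without a header and without units (percent and MiB respectively).
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse 'util, mem' rows into (utilization_percent, memory_used_mib)."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        stats.append((int(util), int(mem)))
    return stats


def memory_only(stats, util_threshold=5):
    """Flag each GPU that holds memory but shows almost no compute activity."""
    return [mem > 0 and util < util_threshold for util, mem in stats]


def query_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)
```

Calling `memory_only(query_gpu_stats())` a few times during transcription gives a rough indication: a GPU that is consistently flagged has memory in use but is doing little or no work. A single sample can miss short bursts of activity, so repeated sampling is more informative.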

Looking at the k3.cc source, it makes sense that nothing is happening because the CuDevice object is never used:

CuDevice &cu_device = CuDevice::Instantiate();
cu_device.SetVerbose(true);
cu_device.SelectGpuId("yes");

However, if you look at the rest of the k3 code, that cu_device object is never used anywhere, which I think is the problem. It should be passed to some decoding function but it isn't.

Has anyone discovered any way of potentially getting the decoding to happen on GPU?

Lorenzoncina commented 2 years ago

@trias702, may I ask how you compiled k3.cc with CUDA enabled? I'm trying to use Kaldi 5.5 with CUDA 11. The process seems fine, but at the end the alignment doesn't work. I'm following all the Gentle installation instructions, so basically the install.sh script.

To enable CUDA, I'm using this command to configure kaldi.mk in install_kaldi.sh:

./configure --static --static-math=yes --static-fst=yes --cudatk-dir=/usr/local/cuda-11.0 --openblas-root=../tools/OpenBLAS/install