kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

Kaldi Memory Leak #4723

Open TrungLM13 opened 2 years ago

TrungLM13 commented 2 years ago

Hi all,

I have a memory-leak issue when running Kaldi code on an RTX 3090: memory is not freed after all decoding processes finish. I made no code modifications in my experiment. Demo source: https://github.com/kaldi-asr/kaldi/blob/master/src/cudadecoderbin/batched-wav-nnet3-cuda-online.cc

Any idea about this problem?


danpovey commented 2 years ago

That's not really how CUDA allocation works. You don't need to free things manually; when the process dies, the memory is automatically freed. If it were not all freed, that would be a bug in the CUDA drivers.
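To illustrate the point, a minimal host-side sketch (not from Kaldi; assumes the CUDA runtime is installed, builds with `nvcc` or `g++ -lcudart`): the program allocates device memory and deliberately never frees it, yet the driver reclaims everything when the process exits.

```cpp
// Sketch: device memory is reclaimed by the CUDA driver at process exit,
// even without an explicit cudaFree.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  void *dev_ptr = nullptr;
  // Allocate 1 GiB on the GPU and deliberately never free it.
  cudaError_t err = cudaMalloc(&dev_ptr, 1ULL << 30);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
    return 1;
  }
  printf("Allocated 1 GiB; exiting without cudaFree.\n");
  // On exit the driver destroys the CUDA context and reclaims the memory;
  // nvidia-smi shows usage drop back once the process is gone.
  return 0;
}
```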

TrungLM13 commented 2 years ago

I don't think that's the issue here. My problem is a memory leak in RAM. I understand that memory is freed when the process dies, but in my case we run Kaldi as a server: we need a long-lived process that serves many requests from users. My expectation is that Kaldi frees the RAM when each request is done.

danpovey commented 2 years ago

If memory usage is constantly growing, that is a problem and should be addressed. If not everything is freed at program exit but usage does not grow without bound, that doesn't really affect users, so it wouldn't be a priority to fix. You'd need to be more specific about the problem.
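A minimal way to make the report specific, assuming Linux (`CurrentRssKb` below is a hypothetical helper written for this sketch, not part of Kaldi): sample the process's resident set size between request batches and see whether it grows monotonically (a leak) or levels off (one-time pools and caches).

```cpp
// Sketch: read VmRSS from /proc/self/status (Linux only) to distinguish
// unbounded growth (a leak) from a one-time plateau (cached allocations).
#include <fstream>
#include <iostream>
#include <string>

long CurrentRssKb() {
  std::ifstream status("/proc/self/status");
  std::string line;
  while (std::getline(status, line)) {
    if (line.rfind("VmRSS:", 0) == 0) {
      // The line looks like "VmRSS:   123456 kB"; parse the number.
      return std::stol(line.substr(6));
    }
  }
  return -1;  // Not found (non-Linux or unexpected format).
}

int main() {
  // Usage idea: call this after each batch of requests and log the value.
  // A leak shows monotonic growth; pooled memory levels off.
  std::cout << "RSS now: " << CurrentRssKb() << " kB\n";
  return 0;
}
```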

TrungLM13 commented 2 years ago

Here are all of my steps:

  1. Run the demo (with a small code modification to keep the demo alive).
  2. Push 100 concurrent requests to the demo.
  3. Check RAM with `htop` on Linux: RAM usage increases significantly.
     3.1. Go back to step 2 and loop.
  4. After four repetitions of step 2, an out-of-memory error occurs, so I think it is a memory leak.

I tried to limit the leak by resetting corr_id (corr_id = corr_id % MAX_CORR_ID). RAM usage still increases, but now in a bounded way.
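A sketch of that workaround in a serving loop (`DecodeRequest` and the value of `MAX_CORR_ID` are placeholders for illustration, not actual Kaldi API):

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for submitting one utterance to the batched CUDA
// pipeline; the real Kaldi call takes a correlation id plus audio/callbacks.
void DecodeRequest(uint64_t corr_id) {
  (void)corr_id;  // ... submit work for this corr_id ...
}

// The cap described above; its value is a tuning choice, not a Kaldi constant.
constexpr uint64_t MAX_CORR_ID = 4096;

int main() {
  uint64_t corr_id = 0;
  for (int request = 0; request < 100000; ++request) {  // long-lived serving loop
    DecodeRequest(corr_id);
    // Wrap the id so per-corr_id bookkeeping is reused rather than
    // accumulating a new entry for every request ever served.
    corr_id = (corr_id + 1) % MAX_CORR_ID;
  }
  printf("done\n");
  return 0;
}
```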

What I want to ask: could a fix like that (wrapping corr_id) cause other errors? Or is there another suggestion for fixing this problem?

Many thanks,

nshmyrev commented 2 years ago

We use the online code in a server; you can try it with a Docker image:

https://hub.docker.com/repository/docker/alphacep/kaldi-vosk-server-gpu

It is stable, no leaks, so the leak must be somewhere else. It is not easy to use the code properly, though.

TrungLM13 commented 2 years ago

Could you share the Vosk source? We need to build and deploy natively because of our features.

Many thanks.


nshmyrev commented 2 years ago

It's all on GitHub:

https://github.com/alphacep/vosk-api/blob/master/src/batch_recognizer.cc
https://github.com/alphacep/vosk-api/blob/master/src/batch_model.cc

stale[bot] commented 2 years ago

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.