pykaldi / pykaldi

A Python wrapper for Kaldi
https://pykaldi.github.io
Apache License 2.0

kaldi.fstext.read_fst_kaldi memory leak? #201

Open heibaidaolx123 opened 4 years ago

heibaidaolx123 commented 4 years ago

My pykaldi is installed according to the steps in the Dockerfile. The code to reproduce the leak:

    import kaldi.fstext as fst
    file_pos = 'train/lats/lat.1.ark:37402'
    fst.read_fst_kaldi(file_pos)

Part of the output of valgrind --leak-check=full:

    ==45916== 70,171 (24 direct, 70,147 indirect) bytes in 1 blocks are definitely lost in loss record 1,122 of 1,128
    ==45916==    at 0x4C2A4E3: operator new(unsigned long) (vg_replace_malloc.c:344)
    ==45916==    by 0x2567DD03: fst::VectorFst<fst::ArcTpl<fst::LatticeWeightTpl >, fst::VectorState<fst::ArcTpl<fst::LatticeWeightTpl >, std::allocator<fst::ArcTpl<fst::LatticeWeightTpl > > > >::Read(std::istream&, fst::FstReadOptions const&) (vector-fst.h:475)
    ==45916==    by 0x2565C618: kaldi_fstext_vectorfst_clifwrap::pyLatticeVectorFst::wrapRead_as__read_from_stream(_object, _object, _object*) (vector-fst-clifwrap.cc:3195)
    ==45916==    by 0x219C53: _PyCFunction_FastCallDict (methodobject.c:231)
    ==45916==    by 0x2A1ABB: call_function (ceval.c:4851)
    ==45916==    by 0x2C4759: _PyEval_EvalFrameDefault (ceval.c:3335)
    ==45916==    by 0x29BC5A: _PyFunction_FastCall (ceval.c:4933)
    ==45916==    by 0x29BC5A: fast_function (ceval.c:4968)
    ==45916==    by 0x2A1B94: call_function (ceval.c:4872)
    ==45916==    by 0x2C4759: _PyEval_EvalFrameDefault (ceval.c:3335)
    ==45916==    by 0x29BC5A: _PyFunction_FastCall (ceval.c:4933)
    ==45916==    by 0x29BC5A: fast_function (ceval.c:4968)
    ==45916==    by 0x2A1B94: call_function (ceval.c:4872)
    ==45916==    by 0x2C4759: _PyEval_EvalFrameDefault (ceval.c:3335)

When I use pykaldi to read lattices during discriminative training, the process is eventually killed due to OOM. Any idea why this happens or how to fix it?
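For anyone who wants to confirm the growth without valgrind, here is a minimal sketch that samples the process's peak RSS while calling a function repeatedly and discarding the result. The helper names `peak_rss_kb` and `peak_rss_growth` are made up for illustration and are not part of pykaldi. The demo uses a deliberately leaky stand-in so the script runs without Kaldi installed. It relies on the Unix-only `resource` module, where `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource


def peak_rss_kb():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def peak_rss_growth(fn, repeats=1000, report_every=100):
    """Call fn() repeatedly, dropping its result, and sample peak RSS.

    Returns a list of samples: one taken before the loop, then one
    after every `report_every` calls.
    """
    samples = [peak_rss_kb()]
    for i in range(1, repeats + 1):
        fn()  # the return value is discarded, so it should become freeable
        if i % report_every == 0:
            samples.append(peak_rss_kb())
    return samples


if __name__ == "__main__":
    # Stand-in workload that leaks on purpose by keeping every buffer alive.
    # To test the call from this issue instead, one would pass something like
    #   lambda: fst.read_fst_kaldi(file_pos)
    # with `fst` and `file_pos` defined as in the snippet above.
    held = []

    def leaky():
        held.append(b"x" * (1 << 20))  # 1 MiB of written (resident) memory

    print(peak_rss_growth(leaky, repeats=64, report_every=16))
```

If the samples keep climbing roughly linearly with the number of calls, that points to a leak in the call itself. If they flatten after the first few samples, the growth is more likely allocator caching than lost memory.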

arkadyark commented 3 years ago

I am also struggling with this memory leak. I am trying to clean up an NnetLatticeFasterOnlineRecognizer instance, but something seems to be left behind, and that memory seems to be allocated when read_fst_kaldi is called. Can anybody help with this?

@dogancan