jtkim-kaist / VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

Memory leak for larger audio files #7

Open vladfulgeanu opened 6 years ago

vladfulgeanu commented 6 years ago

This is on the py branch.

I tried to use the park.wav audio file from the recorded dataset:

audio_dir = './data/recorded_data/park.wav'

but when I run the test:

python3 main.py

during the MRCG extraction step, I quickly run out of all available RAM, even though I have 32GB.

After this the script exits with:

fftfilter 73.884435
Traceback (most recent call last):
  File "main.py", line 20, in <module>
    result = utils.vad_func(audio_dir, mode, th, output_type, is_default)
  File "/home/VAD/lib/matlab_py/utils.py", line 170, in vad_func
    data_len, winlen, winstep = mrcg_extract(audio_dir)
  File "/home/VAD/lib/matlab_py/utils.py", line 142, in mrcg_extract
    mrcg_mat = np.transpose(mrcg.mrcg_features(noisy_speech, audio_sr))
  File "/home/VAD/lib/matlab_py/mrcg.py", line 24, in mrcg_features
    cochlea1 = np.log10(cochleagram(g, int(sampFreq * 0.025), int(sampFreq * 0.010)))
  File "/home/VAD/lib/matlab_py/mrcg.py", line 198, in cochleagram
    rs = np.square(r)
MemoryError

jtkim-kaist commented 6 years ago

Unfortunately, this project has no automatic memory management. You will need to manually split the long wave file into several shorter files, each small enough to be processed within your machine's memory.

Thx!

Bonnerz commented 6 years ago

Do you have a method for splitting a long wave file into short files, for example 10 seconds each, so that the short files stay contiguous (no samples lost or repeated between them)?
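One possible way to do the split, as a minimal sketch rather than anything shipped with this project: the hypothetical helper `split_wav` below uses only Python's standard `wave` module and assumes the input is an uncompressed PCM WAV file. It reads the file in fixed-size blocks and writes them back to back, so joining the chunks in order gives back the original samples. The function name, output file naming, and the 10-second default are assumptions for illustration.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, chunk_seconds=10.0):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds.

    Chunks are contiguous: no frames are dropped or repeated, so
    concatenating them in order reproduces the original audio.
    Returns the list of written chunk paths, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(src_path).stem
    written = []

    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(params.framerate * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            # Only one chunk is held in memory at a time.
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = out_dir / f"{stem}_{index:04d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy the source format so every chunk is a valid WAV file.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1

    return written
```

Usage would look like `split_wav('./data/recorded_data/park.wav', './data/park_chunks')`, after which `audio_dir` can point at each chunk in turn. The last chunk may be shorter than 10 seconds, and VAD results for frames near a chunk boundary may differ slightly from a single-pass run, since those frames no longer see context from the neighboring chunk.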