hirofumi0810 / asr_preprocessing

Python implementation of pre-processing for End-to-End speech recognition
MIT License

Running out of RAM #1

Open jesuistay opened 6 years ago

jesuistay commented 6 years ago

I couldn't get HTK to work properly, possibly due to a bad installation, but it seemed to work fine with librosa.

However, when it gets to '===> Reading audio files...', it seems like the for loop going over the audio paths just fills up my 8 GB of RAM plus swap, and that's only on the 28,539 files from train-clean-100. It doesn't produce any files at this stage. Is there a trick I'm missing to get the preprocessor going without reading everything into RAM all at once? The ETA was over 1 hour and it broke down after 54% of the train-clean-100 dataset.

hirofumi0810 commented 6 years ago

Hi, @jesuistay

Only one file is loaded in each loop for memory efficiency, so I don't know why this happens. Which loop do you mean? There are three loops in librispeech/inputs/input_data.py.

jesuistay commented 6 years ago

The first one: `for i, audio_path in enumerate(tqdm(audio_paths)):`. To me it looks like it traverses the entire dataset and builds up the dict in order to calculate the mean and std, I assume.
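
(Editor's note: if that dict exists only to compute the global mean/std, a single streaming pass avoids keeping any utterance's features around. A minimal sketch, not the repo's code; `streaming_mean_std` and the MFCC settings are illustrative assumptions:)

```python
import numpy as np
import librosa

def streaming_mean_std(audio_paths, n_mfcc=40):
    """Hypothetical helper: per-dimension mean/std over all frames,
    accumulating running sums so only one file is in memory at a time."""
    total = None      # running sum per feature dimension
    total_sq = None   # running sum of squares per feature dimension
    n_frames = 0

    for path in audio_paths:
        y, sr = librosa.load(path, sr=None)
        # (n_frames, n_mfcc) after transpose
        feat = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T
        if total is None:
            total = np.zeros(feat.shape[1])
            total_sq = np.zeros(feat.shape[1])
        total += feat.sum(axis=0)
        total_sq += (feat ** 2).sum(axis=0)
        n_frames += feat.shape[0]
        # feat is dropped here; nothing per-utterance is retained

    mean = total / n_frames
    std = np.sqrt(total_sq / n_frames - mean ** 2)
    return mean, std
```

Normalization would then happen in a second pass over the files at write time, using the returned mean/std, so no global dict is ever built.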

For now I've just skipped all the normalization and write the .npy files right after I get `input_data_utt` from librosa.
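
(Editor's note: that workaround might look roughly like the sketch below; a hedged illustration only, where `save_features` and the log-mel settings are assumptions rather than the repo's actual pipeline:)

```python
import os
import numpy as np
import librosa

def save_features(audio_path, out_dir, n_mels=40):
    """Hypothetical per-utterance dump: extract features and write one
    .npy file immediately, skipping any global normalization pass."""
    y, sr = librosa.load(audio_path, sr=None)
    fbank = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    # log-mel filterbank features, shape (n_frames, n_mels)
    input_data_utt = np.log(fbank + 1e-8).T
    base = os.path.splitext(os.path.basename(audio_path))[0]
    np.save(os.path.join(out_dir, base + '.npy'), input_data_utt)
```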

jesuistay commented 6 years ago

I managed to get HTK working, but the RAM problem still confuses me. I had to increase my swap partition to 16 GB (on top of 8 GB of RAM) just to manage to preprocess train-clean-100.