facebookresearch / WavAugment

A library for speech data augmentation in the time domain
MIT License

Leaking memory during training #8

Closed AdelNabli closed 4 years ago

AdelNabli commented 4 years ago

Hi !

I installed WavAugment and want to use it to transform audio files on-the-fly in my dataloader. I noticed no problem when applying augment to single files. However, when I start training a neural network with the WavAugment-based dataloader, my RAM usage keeps increasing with each new batch of data until my job is killed for exceeding its memory limit. I do not observe this behaviour when I use the sox wrapper shipped with torchaudio to augment my data. I use pytorch 1.3.1 and torchaudio 0.3.2.

Do you know what could cause this? Thank you very much, Adel

eugene-kharitonov commented 4 years ago

Hi, thanks for flagging!

Could you please share a snippet of how you call WavAugment? That would be really helpful for reproducing the issue.

AdelNabli commented 4 years ago

Hi !

While trying to reproduce the behaviour I saw before, I realised that the problem wasn't caused by WavAugment but by the careless way I chose to transform my data while prototyping: I had coded a "transform" function that initialized a new MelSpectrogram each time an item from my Dataset was passed through my transform pipeline. This is what caused the memory leak; by initializing the MelSpectrogram only once for all future transforms, everything works perfectly with WavAugment. Sorry for the false alarm, it was completely my fault!
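
For reference, the difference between the two patterns looks roughly like this (a minimal sketch; the class names and file paths are illustrative placeholders, not my actual training code):

```python
import torch
import torchaudio

class LeakyDataset(torch.utils.data.Dataset):
    """Anti-pattern: a new transform object is built on every __getitem__ call."""
    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        waveform, sample_rate = torchaudio.load(self.paths[idx])
        # A fresh MelSpectrogram (with its internal buffers) is allocated for
        # every item, which is the per-batch RAM growth reported above.
        mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate)
        return mel(waveform)

class FixedDataset(torch.utils.data.Dataset):
    """Fix: build the transform once in __init__ and reuse it for every item."""
    def __init__(self, paths, sample_rate=16000):
        self.paths = paths
        self.mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        waveform, _ = torchaudio.load(self.paths[idx])
        return self.mel(waveform)
```

Building the transform once in `__init__` and reusing it keeps the number of allocations constant across the whole epoch.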

eugene-kharitonov commented 4 years ago

Awesome that it got solved, and thanks for giving WavAugment a try!