GuitarML / GuitarLSTM

Deep learning models for guitar amp/pedal emulation using LSTM with Keras.
https://guitarml.com/
GNU General Public License v3.0

Memory leak on predictions #6

Closed mishushakov closed 1 year ago

mishushakov commented 3 years ago

I have noticed that, in order to finish training properly, you need a lot of free memory to run the prediction. If you try to save the tensor to a file, the resulting file takes up gigabytes.

In my case (#3) you basically get 65 MB of data for 242 KB of audio (a 26346% increase).

GuitarML commented 3 years ago

Yes, it's doing the same preprocessing that it does before training. Each predicted sample is determined by the previous input_size number of samples. So for a model with input size 100 and a wav file of 44100 samples (1 second), it creates an array of shape (44100, 100), but 99% of those samples are redundant, so there's definitely a better way of handling that. I started a custom dataLoader class which takes a small batch of data, preprocesses/trains on it, then frees up the memory for the next batch. I'm having trouble getting it to train properly though, so I'll share it in case someone wants to try to fix it. The split_data param is basically a workaround because I couldn't get that class working yet.
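To illustrate the blowup described above (the exact preprocessing code isn't quoted in this thread, so this is a minimal sketch using the numbers from the comment: input_size 100, one second of 44.1 kHz audio): building an explicit window array copies almost every sample ~100 times, while a strided view exposes the same windows without copying.

```python
import numpy as np

input_size = 100                            # samples of context per prediction
audio = np.zeros(44100, dtype=np.float32)   # 1 second at 44.1 kHz

# Naive windowing: copy input_size samples for every output sample.
naive = np.array([audio[i:i + input_size]
                  for i in range(len(audio) - input_size + 1)])

# A strided view exposes the same windows without copying any data.
view = np.lib.stride_tricks.sliding_window_view(audio, input_size)

print(naive.shape, view.shape)            # both (44001, 100)
print(naive.nbytes // audio.nbytes)       # 99 -- the copy is ~100x larger
```

The view trick only helps when the whole signal fits in memory at once; for arbitrarily long files you still want batch-at-a-time loading, which is what the dataLoader class mentioned above is aiming for.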

mishushakov commented 3 years ago

Hey there & happy holidays! Thanks for the explanation.

Maybe using the TensorFlow audio preparation module would resolve the problem? https://www.tensorflow.org/io/tutorials/audio

GuitarML commented 3 years ago

Happy Holidays! That does look helpful, I can probably use something in there. I don't see anything in particular that solves this data preparation problem though, I think the solution is still getting the custom data loader to work. In the meantime, the split data param will allow for training with limited RAM.

I've made some progress on a plugin for the LSTM models, I'm really excited for what can be done with that. Lots of good things coming for 2021!

mishushakov commented 3 years ago

Thanks for sharing! I'd love to help out (where I can) after the holidays.

this particular part caught my attention:

> The content of the audio clip will only be read as needed, either by converting AudioIOTensor to Tensor through to_tensor(), or through slicing. Slicing is especially useful when only a small portion of a large audio clip is needed.

As far as I understand, the data will be lazy-loaded; however, I wasn't entirely sure if this is what we need.
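For what lazy loading buys here (I can't verify the AudioIOTensor internals in this thread, so this is an analogy using numpy's memory-mapped loading; the file name is a placeholder): slicing a lazily loaded array pulls only the requested bytes from disk, rather than materializing the whole clip.

```python
import os
import tempfile
import numpy as np

# Write a "large" audio-like array to disk (stand-in for a wav file).
path = os.path.join(tempfile.mkdtemp(), "audio.npy")
np.save(path, np.arange(44100, dtype=np.float32))

# mmap_mode reads lazily: analogous to slicing an AudioIOTensor
# instead of converting the whole clip with to_tensor().
lazy = np.load(path, mmap_mode="r")
chunk = np.array(lazy[0:1024])   # materializes just this slice

print(chunk.shape)   # (1024,)
```

Lazy loading addresses reading the file, but not the windowed-copy blowup discussed above, so it would complement rather than replace a custom data loader.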

On a side note, Google's Tone Transfer looks very promising: https://sites.research.google/tonetransfer/ I believe they use the technology to create voices for their Google Assistant.

Thanks, and let's hope 2021 will be nothing like 2020!

GuitarML commented 3 years ago

Update: The Colab notebook has been updated to fix the out-of-memory issue by using a Sequence class that loads the data one batch at a time. It also uses MSE for the loss calculation to alleviate issues with the error-to-signal loss with pre-emphasis filter. I'm conducting more tests on the choice of loss function before rolling this change out to the Python scripts.
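The updated notebook itself isn't quoted here, but the batch-at-a-time idea can be sketched as follows. This is a minimal standalone version (numpy only, with made-up data) following the `tf.keras.utils.Sequence` contract of `__len__`/`__getitem__`, so that only one batch of windows exists in memory at a time:

```python
import numpy as np

class WindowSequence:
    """Builds (X, y) sliding-window batches on demand, one batch at a time.

    Mirrors the tf.keras.utils.Sequence contract (__len__/__getitem__);
    in the real notebook this would subclass tf.keras.utils.Sequence.
    """
    def __init__(self, x, y, input_size=100, batch_size=32):
        self.x, self.y = x, y
        self.input_size = input_size
        self.batch_size = batch_size
        self.n_windows = len(x) - input_size + 1

    def __len__(self):
        # Number of batches per epoch.
        return int(np.ceil(self.n_windows / self.batch_size))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = min(start + self.batch_size, self.n_windows)
        # Only this batch's windows are ever materialized.
        X = np.stack([self.x[i:i + self.input_size]
                      for i in range(start, stop)])
        # Target is the output sample aligned with the end of each window.
        Y = self.y[start + self.input_size - 1:stop + self.input_size - 1]
        return X[..., np.newaxis], Y   # LSTM expects (batch, time, features)

# Usage with dummy signals (stand-ins for the input/output wav data):
x = np.random.rand(1000).astype(np.float32)
y = np.random.rand(1000).astype(np.float32)
seq = WindowSequence(x, y, input_size=100, batch_size=32)
X0, Y0 = seq[0]
print(len(seq), X0.shape, Y0.shape)   # 29 (32, 100, 1) (32,)
```

Keras's `model.fit` accepts such a Sequence directly, so peak memory scales with the batch size instead of with the length of the recording.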