Closed: RomanScott closed this issue 5 years ago
dev answer: torchfilter is a work-in-progress branch. Please only report issues on the master branch.
researcher answer: chunking can be implemented, but it would decrease performance for the BLSTM model. Instead, you would be better off retraining a model using --unidirectional, which offers real-time capabilities.
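For anyone who still wants to try chunking, here is a minimal sketch of the idea, not the project's implementation. `separate_in_chunks` and `separate_fn` are hypothetical names: `separate_fn` stands in for whatever call runs the model on one piece of audio. The chunks are processed independently, so a bidirectional model loses context at every boundary, which is the performance cost mentioned above.

```python
from typing import Callable, List, Sequence


def separate_in_chunks(
    samples: Sequence[float],
    sample_rate: int,
    separate_fn: Callable[[Sequence[float]], List[float]],
    chunk_seconds: float = 60.0,
) -> List[float]:
    """Run separate_fn on fixed-length chunks and concatenate the results.

    separate_fn is a placeholder (an assumption) for the real model call.
    Only one chunk is handed to it at a time, so peak memory depends on
    chunk_seconds rather than on the full track length.
    """
    chunk_len = max(1, int(chunk_seconds * sample_rate))
    output: List[float] = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]  # last chunk may be shorter
        output.extend(separate_fn(chunk))
    return output


if __name__ == "__main__":
    # Toy stand-in for the model: halve every sample.
    def halve(chunk):
        return [0.5 * x for x in chunk]

    signal = [float(i) for i in range(10)]
    print(separate_in_chunks(signal, sample_rate=2, separate_fn=halve, chunk_seconds=2.0))
```

Overlapping the chunks and cross-fading the outputs would likely reduce boundary artifacts, at the cost of some extra computation.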
Please note that I don't think there is a way to make it work on full tracks with a 6 GB GPU. I am testing with 12 GB and it's all right (up to roughly 6-7 min). Although we are doing our best to optimize memory usage, these complex double spectrograms do take a lot of RAM and there's nothing to be done about it, unless you train an online model with a unidirectional LSTM as suggested by @faroit.
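To give a rough feel for why long tracks are a problem, the back-of-the-envelope sketch below estimates the size of a single complex double-precision spectrogram. All parameter values (44.1 kHz stereo, an FFT size of 4096, a hop of 1024) are assumptions chosen for illustration, and `stft_bytes` is a hypothetical helper, not part of the project. The result counts one copy only; per-target outputs and intermediate tensors come on top of it.

```python
def stft_bytes(
    duration_s: float,
    sample_rate: int = 44100,   # assumed sample rate
    n_fft: int = 4096,          # assumed FFT size
    hop: int = 1024,            # assumed hop length
    channels: int = 2,          # assumed stereo input
    bytes_per_value: int = 16,  # complex double: two 8-byte floats
) -> int:
    """Approximate size in bytes of one complex spectrogram."""
    frames = 1 + int(duration_s * sample_rate) // hop
    bins = n_fft // 2 + 1  # one-sided spectrum
    return frames * bins * channels * bytes_per_value


if __name__ == "__main__":
    for minutes in (1, 3, 6.5):
        size_gib = stft_bytes(minutes * 60) / 2**30
        print(f"{minutes} min: {size_gib:.2f} GiB per spectrogram copy")
```

Under these assumptions the size grows linearly with track length, so a track of 6-7 minutes already needs on the order of a gigabyte for each copy of the spectrogram.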
Does a unidirectional model reduce performance?
Yes, for vocals the loss might be up to 0.5 dB SDR. For drums or bass it's not that important, though.
I see. I'm mainly working with vocals, and the master branch works even without CUDA, so no problem so far.
🐛 Bug
Hello,
I am trying to test out the torchfilters branch of this project. It works fine on shorter audio clips, but when the audio file is around 4 to 5 minutes in length, the program crashes with a CudaOutOfMemoryError.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The program should finish execution on longer files as well. Is there a way to split the audio every one or two minutes, or to use an audio loader so that the entire song isn't loaded into CUDA memory at once and the program doesn't crash?
Thank you!