Open fantasyRqg opened 2 years ago
Hi fantasyRqg, and thanks for your PR 😃
Just for context, so I understand the problem you're proposing to solve, I want to ask some questions:
Ideally, a good solution would work well for all combinations of answers to these questions.
How large is your background noise dataset?
About 2k records
If you are training a model, how many workers do you use for preparing the audio examples that go into the training batches?
Only one worker. I tried multiple workers, but it was not fast enough.
How much memory (RAM) is there on the computer where you are doing the training?
I cached the samples and the noises. The samples took 7 GB and the noises took 1.5 GB.
What audio file format are your background noise files? And do they have the same sample rate as the "clean" input audios that the noises get added to?
I don't think audio format and sample rate are a problem. The audio: Audio parameter will take care of all of those problems.
Are you using an SSD or a HDD?
HDD
Thanks for the insight :) Indeed, in your case it makes sense to apply caching like this.
My own use case is quite different, and would actually be best without caching:
I don't think audio format and sample rate are a problem. The audio: Audio parameter will take care of all of those problems.
The reason why I asked is that resampling (in case of mismatch) may take a significant amount of CPU time, slowing down the model training.
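To make the trade-off concrete, here is a minimal sketch of the general idea, not the implementation in this PR. It assumes a hypothetical `load_and_resample(path, sample_rate)` function that does the expensive decoding and resampling; `make_cached_loader` and `fake_load_and_resample` are illustrative names only. Each (path, sample rate) pair is processed once and then served from memory:

```python
from functools import lru_cache
from typing import Callable, Sequence, Tuple


def make_cached_loader(
    load_and_resample: Callable[[str, int], Sequence[float]],
    max_items: int = 128,
) -> Callable[[str, int], Tuple[float, ...]]:
    """Wrap a (hypothetical) loader so each (path, sample_rate) pair is processed once."""

    @lru_cache(maxsize=max_items)
    def cached(path: str, sample_rate: int) -> Tuple[float, ...]:
        # Store an immutable tuple so callers cannot mutate the cached copy.
        return tuple(load_and_resample(path, sample_rate))

    return cached


# Stand-in for real decoding + resampling; it only counts how often
# the expensive work would have been done.
calls = {"count": 0}


def fake_load_and_resample(path: str, sample_rate: int):
    calls["count"] += 1
    return [0.0] * 4  # placeholder samples


load_noise = make_cached_loader(fake_load_and_resample, max_items=2048)
load_noise("noise_0001.wav", 16000)
load_noise("noise_0001.wav", 16000)  # served from the cache, no second load
```

With a single worker, as in your setup, the cache is filled once and reused. With several worker processes, each process would typically hold its own copy of the cache, so memory use grows with the number of workers.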
I'm currently wrapping up the 0.11 release, then I'll have some work preparing a few new transforms, and after that I'll hopefully have more time to consider this caching feature. In the meantime, thanks for your patience, and I hope you're okay with using your own fork for now.
Boost background_noise performance.