asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License
928 stars 87 forks

cache background_noise rms data #145

Open fantasyRqg opened 2 years ago

fantasyRqg commented 2 years ago

Boost background_noise performance.

  1. Reduce audio decoding and file I/O
  2. Reduce RMS computation. Note that there may be a difference between rms(partial audio) and rms(full audio)
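
To make the two points above concrete, here is a minimal sketch of the idea, not the code from this PR. It uses only the standard library; `NoiseCache`, `load_audio` and `fake_loader` are hypothetical names introduced for illustration, and plain Python lists stand in for audio tensors. It keeps the decoded samples and the full-file RMS per path, and it also shows how the RMS of a cropped segment can differ from the RMS of the whole file.

```python
import math
from typing import Callable, Dict, List, Tuple


def rms(samples: List[float]) -> float:
    """Root mean square of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class NoiseCache:
    """Hypothetical cache: decoded samples and full-file RMS per path."""

    def __init__(self, load_audio: Callable[[str], List[float]]) -> None:
        # load_audio is an assumed stand-in for whatever decodes a file from disk.
        self._load_audio = load_audio
        self._entries: Dict[str, Tuple[List[float], float]] = {}

    def get(self, path: str) -> Tuple[List[float], float]:
        if path not in self._entries:
            # Decode and measure only on the first request for this path.
            samples = self._load_audio(path)
            self._entries[path] = (samples, rms(samples))
        return self._entries[path]


# Fake in-memory "file": quiet first half, loud second half.
fake_files = {"noise.wav": [0.1] * 4 + [0.5] * 4}
decode_calls: List[str] = []


def fake_loader(path: str) -> List[float]:
    decode_calls.append(path)
    return fake_files[path]


cache = NoiseCache(fake_loader)
samples, full_rms = cache.get("noise.wav")
_, full_rms_again = cache.get("noise.wav")  # cache hit, no second decode
partial_rms = rms(samples[:4])  # RMS of a cropped segment only

print(len(decode_calls), round(full_rms, 4), round(partial_rms, 4))
```

In this toy example the quiet crop has a much lower RMS than the whole file, so scaling the noise by the cached full-file RMS would give a different effective SNR than scaling by the RMS of the segment that is actually mixed in. Keeping decoded samples in memory also trades RAM for speed, which may not suit large noise collections.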

iver56 commented 2 years ago

Hi fantasyRqg, and thanks for your PR 😃

Just for context, so I understand the problem you're proposing to solve, I want to ask some questions:

Ideally, a good solution would work well for any combination of answers to those questions.

fantasyRqg commented 2 years ago

iver56 commented 2 years ago

Thanks for the insight :) Indeed, in your case it makes sense to apply caching like this.

My own use case is quite different, and would actually be best without caching:

> I don't think audio format and sample rate are a problem. The audio: Audio parameter will take care of all of that.

The reason why I asked is that resampling (in case of mismatch) may take a significant amount of CPU time, slowing down the model training.

I'm currently wrapping up the 0.11 release, and then I'll have some work preparing a few new transforms. After that I'll hopefully have more time to consider this caching feature. In the meantime, thanks for your patience, and I hope you're okay with using your own fork for now.