asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License

Microphone style transfer (MicAugment) #88

Open iver56 opened 3 years ago

iver56 commented 3 years ago

À la https://arxiv.org/abs/2010.09658

akashrajkn commented 2 years ago

I am curious - how do you plan on using MicAugment? Is the idea that we can submit a trained MicAugment model as part of the transform?

iver56 commented 2 years ago

I haven't thought it through, but yeah, ideally we should have a pretrained model that is ready to use. The model can be uploaded as a binary in a GitHub release, then downloaded and stored locally on demand (the first time the transform gets used). This approach is inspired by the way Keras handles pretrained ImageNet models.
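
A rough sketch of the download-on-demand idea, just to make it concrete. Everything here is an assumption for illustration: `MODEL_URL` is a placeholder (no such release asset exists yet), and `fetch_model` and the cache directory are hypothetical names, not part of torch-audiomentations.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Placeholder URL (assumption): a real one would point at a GitHub release asset.
MODEL_URL = "https://example.com/micaugment_pretrained.pt"


def fetch_model(url=MODEL_URL, cache_dir=None, download=urlretrieve):
    """Return a local path to the model file, downloading it only if missing."""
    # Hypothetical default cache location; any per-user directory would do.
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "torch_audiomentations"
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Use the last URL path component as the local file name.
    target = cache_dir / url.rsplit("/", 1)[-1]
    if not target.exists():
        # Download under a temporary name first, so an interrupted download
        # does not leave a truncated file that looks like a valid cache hit.
        tmp = target.with_name(target.name + ".part")
        download(url, str(tmp))
        tmp.replace(target)
    return target
```

The transform would call something like `fetch_model()` the first time it is used, so only the first call touches the network. A checksum check after download would probably be a good addition, but is left out here.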

Would https://github.com/akashrajkn/micaugment be suitable?

akashrajkn commented 2 years ago

I think it is suitable - however, I still have to update the repo with a pretrained model.

iver56 commented 2 years ago

It would be awesome if you could make that happen 🤩 But I guess the pretrained model would depend on a specific sample rate, right? Ideally, torch-audiomentations should be compatible with a wide range of sample rates 🤔 Maybe it could resample the input to match the sample rate used by the model.
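
To illustrate the resampling idea, here is a minimal sketch. It assumes the model works at one fixed sample rate and that the output should come back at the caller's rate. `apply_at_model_rate` is a hypothetical helper, and `resample` is any callable taking `(samples, orig_rate, new_rate)`; `torchaudio.functional.resample` has that argument order and might be one candidate.

```python
def apply_at_model_rate(samples, sample_rate, model, model_sample_rate, resample):
    """Run `model` at its native sample rate and return audio at `sample_rate`.

    All names here are illustrative assumptions, not an existing API.
    """
    if sample_rate == model_sample_rate:
        # Rates already match: no resampling needed.
        return model(samples)

    # Convert the input to the rate the pretrained model expects.
    model_input = resample(samples, sample_rate, model_sample_rate)
    model_output = model(model_input)
    # Convert back so the caller gets audio at the original rate.
    return resample(model_output, model_sample_rate, sample_rate)
```

One caveat: a round trip through a lower model rate discards content above that rate's Nyquist frequency, and the output length may differ from the input by a sample or so, so some trimming or padding would likely be needed.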