facebookresearch / WavAugment

A library for speech data augmentation in time-domain
MIT License

Augmenting a stack of waveforms simultaneously and independently? #13

Open mimayn opened 4 years ago

mimayn commented 4 years ago

Hi,

So all of your examples operate on a single waveform, but I'm curious: what if I want to apply a randomized transform (say, pitch) to a stack of waveforms without going through a for loop? For example, instead of having two audio channels for a waveform, I'd like to stack 100 copies of the same waveform into an input tensor (Size[N_ch, L]) and have each row transformed randomly and independently of the others, so I'd get a tensor with the same dimensions as the input in which each row has, for example, a different random pitch shift. From what I've tried so far, certain effects just concatenate all the rows into a single waveform and apply one draw of the randomizer to the entire concatenated signal.

So is there any way to skip the for loop?

eugene-kharitonov commented 4 years ago

Hey, thanks for raising the issue.

Parallel batch processing is something on our minds, but currently there is no clear solution. In our current scenarios, we augment at data-loading time, so parallelisation comes for free from PyTorch data loaders.
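
To make the data-loading idea concrete, here is a minimal sketch of the pattern, not WavAugment code. It assumes a hypothetical map-style dataset, `RandomGainCopies`, whose `__getitem__` makes a fresh random draw for every item. A random gain stands in for the random pitch shift, and plain Python lists stand in for tensors, so that the sketch runs with the standard library alone.

```python
import random


class RandomGainCopies:
    """Hypothetical map-style dataset: each item is the same waveform
    transformed with its own, independent random draw."""

    def __init__(self, waveform, n_copies, seed=None):
        self.waveform = list(waveform)
        self.n_copies = n_copies
        self.rng = random.Random(seed)

    def __len__(self):
        return self.n_copies

    def __getitem__(self, index):
        if not 0 <= index < self.n_copies:
            raise IndexError(index)
        # Placeholder effect: a random gain stands in for a random pitch
        # shift. A real effect chain would be applied here instead.
        gain = self.rng.uniform(0.5, 1.5)
        return [gain * sample for sample in self.waveform]


def augment_stack(waveform, n_copies, seed=None):
    """Return n_copies rows, each augmented with a different random draw."""
    dataset = RandomGainCopies(waveform, n_copies, seed)
    return [dataset[i] for i in range(n_copies)]


stack = augment_stack([0.0, 0.5, -0.5, 1.0], n_copies=100, seed=0)
```

The loop does not disappear; it moves into the loader. With tensors in place of the lists, an object like this can be passed to `torch.utils.data.DataLoader` with `num_workers` greater than zero, so the per-item calls are spread across worker processes while each row still gets its own draw.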