All input tensors are assumed to be three-dimensional at the moment, i.e. (batch_size, num_channels, num_samples). Did this help? Should it be better documented? I'm thinking of removing support for two-dimensional tensors btw. How do you feel about that? In that case, mono audio would be represented as (batch_size, 1, num_samples).
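For reference, a minimal sketch of getting a mono file into that three-dimensional layout; the file path is just a placeholder:

import torchaudio

# torchaudio.load returns (num_channels, num_samples), e.g. (1, num_samples) for mono
samples, sample_rate = torchaudio.load("some_mono_file.wav")  # placeholder path

# Add a leading batch dimension: (batch_size, num_channels, num_samples)
samples = samples.unsqueeze(0)
print(samples.shape)  # e.g. torch.Size([1, 1, num_samples])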
Interesting, this isn't working though? Am I supposed to wrap the augmentations somewhere else?
import torchaudio
from torch_audiomentations.augmentations.gain import Gain
wave, sr = torchaudio.load("./tests/data/dev01.wav")
# Create a batch of 32
wave = wave[None].repeat(32, 1, 1)
print(wave.shape)
# [32, 1, 410001]
res = Gain()(wave, sr)
Interesting. How does it fail? With an error message? Or is the output not as expected?
My mistake, this is only the case for BackgroundNoise and AddImpulseResponse.
Yeah, I haven't really finished and released those two yet. I'll get to it
Fair enough :)
So I have to set support_multichannel = True to support batches?
AddBackgroundNoise and AddImpulseResponse should support batches, but I don't think they support multichannel audio yet. I think they expect 2D tensors like (batch_size, num_samples) for now.
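Until that lands, one possible workaround (just a sketch, not something the library does for you) is to fold the channel dimension into the batch dimension before calling a 2D-only transform and restore it afterwards:

import torch

batch_size, num_channels, num_samples = 32, 2, 16000
samples = torch.randn(batch_size, num_channels, num_samples)

# Fold channels into the batch dimension -> (batch_size * num_channels, num_samples)
flat = samples.reshape(batch_size * num_channels, num_samples)

# ... apply a transform that only accepts (batch_size, num_samples) here ...

# Restore the (batch_size, num_channels, num_samples) layout
restored = flat.reshape(batch_size, num_channels, num_samples)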
If I were creating a new BaseWaveTransform, e.g. Reverb, would I have to set support_multichannel = True to make it work?
Yes. I might rework that later though
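Purely to illustrate the pattern being discussed (this is not the library's actual base class, and the real class and attribute names may differ), a self-contained toy example of a transform that opts in via a support_multichannel flag:

import torch

class ToyWaveTransform(torch.nn.Module):
    # Hypothetical flag mirroring the support_multichannel attribute mentioned above
    support_multichannel = False

    def forward(self, samples: torch.Tensor, sample_rate: int) -> torch.Tensor:
        # samples is expected to be (batch_size, num_channels, num_samples)
        if samples.shape[1] > 1 and not self.support_multichannel:
            raise ValueError("This transform only supports mono audio")
        return self.apply_transform(samples, sample_rate)

    def apply_transform(self, samples, sample_rate):
        raise NotImplementedError

class ToyReverb(ToyWaveTransform):
    # A hypothetical Reverb opting in to multichannel input
    support_multichannel = True

    def apply_transform(self, samples, sample_rate):
        return samples  # no-op placeholder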
The BaseWaveTransformations currently don't accept a batch of audio in forward, because they treat a batch of audio as multichannel audio. I can submit a PR, or is there some design decision here that I'm missing?