asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License

3-dimensional tensor #87

Closed zhaoyun-ai closed 3 years ago

zhaoyun-ai commented 3 years ago

I get a 3-dimensional tensor back from the transform. What does each dimension mean?

iver56 commented 3 years ago

The tensor has a shape like (batch_size, num_channels, num_samples)

Batched computation allows one to apply audio augmentation to multiple audio recordings in one pass, and results in faster execution due to parallelism.

Multiple channels are supported, so you can process stereo (or more channels) instead of just mono.
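If your audio starts out as a plain 1D mono waveform, a minimal sketch of getting it into that layout and back could look like this. It uses only plain PyTorch indexing; the 16000 Hz sample rate and the random waveform are assumptions for illustration, and the augmentation itself is left out (the `augmented = batch` line is just a stand-in for wherever you call your transform).

```python
import torch

sample_rate = 16000  # assumed sample rate, for illustration only

# A 2-second mono waveform as a 1D tensor: shape (num_samples,)
mono = torch.randn(2 * sample_rate)

# Add a batch dimension and a channel dimension:
# (num_samples,) -> (batch_size=1, num_channels=1, num_samples)
batch = mono[None, None, :]
print(batch.shape)

# Stand-in for the augmentation call; the output keeps the same 3D layout.
augmented = batch

# Drop the batch and channel dimensions again to get the 1D waveform back.
restored = augmented[0, 0]
print(restored.shape)
```

So even a single mono clip is passed as a batch of one with one channel.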

If you have a tensor that represents 4 stereo audio snippets of 2 seconds each at 16000 Hz, the shape of the tensor would be (4, 2, 32000), because 2 seconds at 16000 samples per second is 32000 samples.
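The same arithmetic can be written as a small helper. This is only a sketch: `expected_shape` is a hypothetical function made up for this example and is not part of torch-audiomentations.

```python
def expected_shape(batch_size, num_channels, duration_seconds, sample_rate):
    """Return (batch_size, num_channels, num_samples) for the given audio.

    Hypothetical helper for illustration; not a library function.
    """
    # Number of samples per clip = duration in seconds * samples per second
    num_samples = int(round(duration_seconds * sample_rate))
    return (batch_size, num_channels, num_samples)


# 4 stereo snippets, 2 seconds each, at 16000 Hz
shape = expected_shape(batch_size=4, num_channels=2,
                       duration_seconds=2.0, sample_rate=16000)
print(shape)  # (4, 2, 32000)
```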

zhaoyun-ai commented 3 years ago

Thank you for your quick reply! I will close this question.