The axis values for MaskSpectrogramTime (axis=1) and MaskSpectrogramFrequency (axis=0) need to be shifted by +1 because of the channel dimension of the signal.
Why?
audtorch.datasets.utils.load documents its return value as:
**numpy.ndarray**: two-dimensional array with shape `(channels, samples)`
A spectrogram computed from such a signal with
spec = Spectrogram(320, 160)(signal)
therefore has shape (C, F, S): channels, frequency bins, time steps.
Thus, both axis values need to be increased by 1: the frequency axis is 1 and the time axis is 2.
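To illustrate why the shift matters, here is a minimal NumPy-only sketch. It does not use audtorch's own masking code; `mask_along_axis` is a hypothetical helper, and the shape `(1, 161, 50)` is only an assumed example of a (C, F, S) spectrogram.

```python
import numpy as np


def mask_along_axis(spec, start, width, axis):
    """Hypothetical helper: zero out `width` indices along `axis`, starting at `start`."""
    masked = spec.copy()
    index = [slice(None)] * masked.ndim
    index[axis] = slice(start, start + width)
    masked[tuple(index)] = 0
    return masked


# Assumed example spectrogram with shape (C, F, S)
spec = np.ones((1, 161, 50))

# axis=0 addresses the channel dimension, not frequency
wrong = mask_along_axis(spec, start=10, width=5, axis=0)

# Shifted by +1: frequency is axis 1, time is axis 2
freq_masked = mask_along_axis(spec, start=10, width=5, axis=1)
time_masked = mask_along_axis(spec, start=10, width=5, axis=2)

print(int((wrong == 0).sum()))
print(int((freq_masked == 0).sum()))
print(int((time_masked == 0).sum()))
```

In this sketch the unshifted axis silently masks nothing, because the slice falls outside the single channel. The real transforms may fail differently, but the underlying mismatch is the same.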
Steps to reproduce
from audtorch.datasets import LibriSpeech
from audtorch.transforms import Compose, Spectrogram, MaskSpectrogramFrequency
root = '' # TODO
data = LibriSpeech(root=root, sets='dev-clean', transform=Compose([Spectrogram(320, 160), MaskSpectrogramFrequency(0.1)]))
data[0][0]
# error