burchim / EfficientConformer

[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
https://arxiv.org/abs/2109.01163
Apache License 2.0

IndexError: tuple index out of range #20

Closed: Silentssss closed this issue 1 year ago

Silentssss commented 1 year ago

Hi! This is my error and these are my parameters. Can you tell me how to fix it? I am very confused.

python main.py --config_file configs/EfficientConformerCTCSmall.json --prepare_dataset --create_tokenizer

[image: screenshot of the error]

burchim commented 1 year ago

Hi,

TimeMasking's forward function now requires the batch dimension to process spectrograms.

Including the batch dimension should fix the issue:

x[b:b+1, :, :x_len[b]] = torchaudio.transforms.TimeMasking(time_mask_param=T).forward(x[b:b+1, :, :x_len[b]])

instead of:

x[b, :, :x_len[b]] = torchaudio.transforms.TimeMasking(time_mask_param=T).forward(x[b, :, :x_len[b]])

Silentssss commented 1 year ago

Thanks! It works for me!