sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Apache License 2.0

the Usage sample got an invalid size error #40

Open jingzhang0909 opened 2 years ago

jingzhang0909 commented 2 years ago

I installed the package and ran the usage sample, but it fails with the following error: RuntimeError: Gather got an input of invalid size: got [1, 3085, 8, 10], but expected [1, 3085, 9, 10]

Could you tell me where my mistake is?

sooftware commented 2 years ago

Did you use this code?

import torch
import torch.nn as nn
from conformer import Conformer

batch_size, sequence_length, dim = 3, 12345, 80

cuda = torch.cuda.is_available()  
device = torch.device('cuda' if cuda else 'cpu')

inputs = torch.rand(batch_size, sequence_length, dim).to(device)
input_lengths = torch.IntTensor([12345, 12300, 12000])
targets = torch.LongTensor([[1, 3, 3, 3, 3, 3, 4, 5, 6, 2],
                            [1, 3, 3, 3, 3, 3, 4, 5, 2, 0],
                            [1, 3, 3, 3, 3, 3, 4, 2, 0, 0]]).to(device)
target_lengths = torch.LongTensor([9, 8, 7])

model = nn.DataParallel(Conformer(num_classes=10, input_dim=dim, 
                                  encoder_dim=32, num_encoder_layers=3, 
                                  decoder_dim=32)).to(device)

# Forward propagate
outputs = model(inputs, input_lengths, targets, target_lengths)

# Recognize input speech
outputs = model.module.recognize(inputs, input_lengths)

jingzhang0909 commented 2 years ago

Yes.

Traceback (most recent call last):
  File "test_model.py", line 22, in <module>
    outputs = model(inputs, input_lengths, targets, target_lengths)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
    return self.gather(outputs, self.output_device)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, outputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 68, in forward
    return comm.gather(inputs, ctx.dim, ctx.target_device)
  File "/opt/conda/lib/python3.6/site-packages/torch/cuda/comm.py", line 165, in gather
    return torch._C._gather(tensors, dim, destination)
RuntimeError: Gather got an input of invalid size: got [1, 3085, 8, 10], but expected [1, 3085, 9, 10]

sooftware commented 2 years ago

Do you want to try it without nn.DataParallel?

jingzhang0909 commented 2 years ago

Thanks! It works now.

import torch
import torch.nn as nn
from conformer import Conformer

batch_size, sequence_length, dim = 3, 12345, 80

cuda = torch.cuda.is_available()
# torch.cuda.set_device() returns None, so build the device object explicitly
device = torch.device('cuda:0' if cuda else 'cpu')

inputs = torch.rand(batch_size, sequence_length, dim).to(device)
input_lengths = torch.IntTensor([12345, 12300, 12000])
targets = torch.LongTensor([[1, 3, 3, 3, 3, 3, 4, 5, 6, 2],
                            [1, 3, 3, 3, 3, 3, 4, 5, 2, 0],
                            [1, 3, 3, 3, 3, 3, 4, 2, 0, 0]]).to(device)
target_lengths = torch.LongTensor([9, 8, 7])

# Single-device model, no nn.DataParallel wrapper
model = Conformer(num_classes=10, input_dim=dim,
                  encoder_dim=32, num_encoder_layers=3,
                  decoder_dim=32).to(device)

# Forward propagate
outputs = model(inputs, input_lengths, targets, target_lengths)

# Recognize input speech
outputs = model.recognize(inputs, input_lengths)
print(outputs)

sooftware commented 2 years ago

It runs normally on my side, which is odd.
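
One possible explanation for why the sample fails on one machine and not the other is the number of visible GPUs. nn.DataParallel splits the batch along dimension 0, runs one replica per GPU, and then concatenates the replica outputs, which requires every other dimension to match. The plain-Python sketch below only simulates the shapes and does not call PyTorch. It assumes that each replica's output is trimmed along the target axis to the longest target in its own chunk; that assumption is inferred from the error message and is not confirmed against the conformer source. The helper names (split_batch, replica_output_shape, can_gather) are hypothetical, and the values 3085 and 10 are copied from the error message.

```python
# Sketch only: simulates shapes, not the real nn.DataParallel implementation.

def split_batch(items, num_replicas):
    """Split a list into contiguous chunks, similar to chunking along dim 0."""
    chunk_size = -(-len(items) // num_replicas)  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def replica_output_shape(target_lengths_chunk, encoder_frames=3085, num_classes=10):
    """Hypothetical output shape of one replica.

    ASSUMPTION: the target axis is as long as the longest target in this chunk.
    """
    return (len(target_lengths_chunk), encoder_frames,
            max(target_lengths_chunk), num_classes)


def can_gather(shapes, dim=0):
    """Concatenating along `dim` needs all remaining dimensions to be equal."""
    remaining = {shape[:dim] + shape[dim + 1:] for shape in shapes}
    return len(remaining) == 1


target_lengths = [9, 8, 7]  # same values as in the usage sample

for num_replicas in (1, 3):
    chunks = split_batch(target_lengths, num_replicas)
    shapes = [replica_output_shape(chunk) for chunk in chunks]
    print(num_replicas, shapes, can_gather(shapes))
```

Under this assumption, a single replica sees the whole batch and there is nothing to mismatch, while three replicas each receive one sample and produce target axes of length 9, 8 and 7. That would be consistent with the reported "got [1, 3085, 8, 10], but expected [1, 3085, 9, 10]" on a multi-GPU machine and with the sample running normally on a single-GPU one.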