mpc001 / end-to-end-lipreading

Pytorch code for End-to-End Audiovisual Speech Recognition

shape '[-1, 29, 512]' is invalid for input of size 497664 #32

Open jkamb1 opened 3 years ago

jkamb1 commented 3 years ago

Hello there,

After creating the files with convert_video.py, I tried to run the audio-only main.py and got the following error. Something seems to be wrong with the dimensions, but I don't know how to fix it. Any ideas would be highly appreciated.

Statistics: train: 488766, val: 25000, test: 25000

Epoch 0/29
Current Learning rate: [0.0003]
Traceback (most recent call last):
  File "/home/Documents/audiovisual/audio_only/main.py", line 256, in <module>
    main()
  File "/home/Documents/audiovisual/audio_only/main.py", line 252, in main
    test_adam(args, use_gpu)
  File "/home/Documents/audiovisual/audio_only/main.py", line 230, in test_adam
    model = train_test(model, dset_loaders, criterion, epoch, 'train', optimizer, args, logger, use_gpu, save_path)
  File "/home/evialv/Documents/audiovisual/audio_only/main.py", line 146, in train_test
    outputs = model(inputs)
  File "/home/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/Documents/audiovisual/audio_only/model.py", line 156, in forward
    x = x.view(-1, self.frameLen, self.inputDim)
RuntimeError: shape '[-1, 29, 512]' is invalid for input of size 497664
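
For anyone hitting the same error, the arithmetic behind it can be checked without running the model. The sketch below is only a diagnostic aid, not part of the repository: `diagnose_view` is a hypothetical helper, and the values 29 and 512 are taken from the shape in the error message. It checks whether the element count is divisible by 29 × 512 and, if not, lists the (batch size, frames per sample) pairs that would fit the 512-wide feature dimension.

```python
def diagnose_view(numel, frame_len, input_dim):
    """Explain why x.view(-1, frame_len, input_dim) succeeds or fails.

    Hypothetical helper for illustration; not part of the repository.
    """
    report = {
        "numel": numel,
        # view(-1, frame_len, input_dim) needs numel to be a multiple
        # of frame_len * input_dim.
        "view_ok": numel % (frame_len * input_dim) == 0,
    }
    if numel % input_dim == 0:
        # rows = batch_size * frames actually produced per sample
        rows = numel // input_dim
        report["rows"] = rows
        report["factor_pairs"] = [
            (b, rows // b) for b in range(1, rows + 1) if rows % b == 0
        ]
    return report


info = diagnose_view(497664, frame_len=29, input_dim=512)
print(info["view_ok"])  # False
print(info["rows"])     # 972
# Candidate (batch_size, frames) pairs with a frame count close to 29
print([p for p in info["factor_pairs"] if 20 <= p[1] <= 40])
```

Here 497664 = 512 × 972, and 972 is not a multiple of 29, which is why the `view` call fails. One factorization that fits is 36 × 27. If the batch size in this run was 36 (an assumption, since the issue does not state it), each sample produced 27 frames instead of the expected 29, which would suggest the audio inputs are shorter than the model expects. Printing `inputs.shape` right before `model(inputs)` in main.py would confirm or rule this out.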