I really appreciate your excellent work! I tried to train on NExT-QA, but first I found that there is no qns _bert file for the test split.
One more thing: when I switched to NExT-QA, I changed some parameters such as max_qa_length and bbox_num, but then I hit a bug during training:
Traceback (most recent call last):
  File "D:/code/pycharm/Nextqa/HQGA/main_qa.py", line 94, in <module>
    main(args)
  File "D:/code/pycharm/Nextqa/HQGA/main_qa.py", line 59, in main
    vqa.run(f'{model_type}-{model_prefix}-22-39.88.ckpt', pre_trained=False)
  File "D:\code\pycharm\Nextqa\HQGA\videoqa.py", line 97, in run
    train_loss, train_acc = self.train(epoch)
  File "D:\code\pycharm\Nextqa\HQGA\videoqa.py", line 121, in train
    out, prediction, _ = self.model(video_inputs, qas_inputs, qas_lengths, temp_input)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\code\pycharm\Nextqa\HQGA\networks\VQAModel\HQGA.py", line 102, in forward
    vid_feats = self.vid_encoder(vid_feats)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\code\pycharm\Nextqa\HQGA\networks\Encoder\EncoderVid.py", line 77, in forward
    bbox_features = self.tohid(bbox_features)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\container.py", line 117, in forward
    input = module(input)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\modules\linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\Users\17965\anaconda3\envs\hqga\lib\site-packages\torch\nn\functional.py", line 1692, in linear
    output = input.matmul(weight.t())
RuntimeError: mat1 dim 1 must match mat2 dim 0
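For what it's worth, this RuntimeError is a generic matrix-multiply shape mismatch: the last dimension of the input reaching self.tohid (mat1 dim 1) must equal the in_features of that linear layer (mat2 dim 0). A minimal NumPy sketch of the same mechanics (the sizes here are illustrative, not the actual HQGA/NExT-QA dimensions):

```python
import numpy as np

# mat1: a batch of region (bbox) features whose last dimension is 2053,
# e.g. a 2048-d appearance vector with 5 box coordinates concatenated.
mat1 = np.zeros((16, 2053))

# mat2: the weight of a linear layer built for 2048-d input; PyTorch
# stores it as (out_features, in_features) and multiplies by its transpose.
weight = np.zeros((256, 2048))

mismatch = False
try:
    out = mat1 @ weight.T  # (16, 2053) @ (2048, 256): inner dims differ
except ValueError:
    mismatch = True  # NumPy raises ValueError; PyTorch raises RuntimeError
print("mismatch:", mismatch)
```

So the fix is not in the layer itself but in making the configured feature dimension agree with what the NExT-QA feature files actually contain.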
I debugged it, and it seems the shape of the video features is different from MSVD's. Should I change the parameters in videoqa.py, like:
feat_dim = 2048
bbox_dim = 5
num_clip, num_frame, num_bbox = 8, 8*4, 10 # For msvd
feat_hidden, pos_hidden = 256, 128
word_dim = 300
vocab_size = None if self.use_bert else len(self.vocab)
num_class = 1 if self.multi_choice else 1853 #4001 for msrvtt, 1853 for msvd, 1541 for frameQA in TGIF-QA