Hon-Wong / PTSEFormer

[ECCV 2022] PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection
https://arxiv.org/abs/2209.02242
MIT License

Input shape RuntimeError #12

Open iamamiramine opened 9 months ago

iamamiramine commented 9 months ago

I am getting the following error when I try to train:

RuntimeError: shape '[1, 125584, 32]' is invalid for input of size 8037376

It comes from this call in src/models/transformer/deformable_transformer.py:

tgt2 = self.multihead_attn(query=self.with_pos_embed(tgt, query_pos), key=self.with_pos_embed(memory, pos), value=memory, attn_mask=memory_mask, key_padding_mask=memory_key_padding_mask)[0]

Any idea why?
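
A quick check of the numbers in the error message may help narrow this down. The sketch below only does the arithmetic; the helper name `explain_view_mismatch` is made up for illustration and is not part of PTSEFormer or PyTorch. It shows that the tensor being reshaped holds exactly twice as many elements as the target shape allows. That would be consistent with one of the query/key/value inputs carrying a batch or sequence dimension twice as large as the others, though nothing in this thread confirms which one.

```python
from math import prod


def explain_view_mismatch(target_shape, actual_numel):
    """Return (expected element count, actual / expected ratio).

    Hypothetical helper for reading a "shape ... is invalid for input of
    size ..." error; it does not touch any real tensors.
    """
    expected = prod(target_shape)
    ratio = actual_numel / expected
    return expected, ratio


# Numbers copied from the RuntimeError in this issue.
expected, ratio = explain_view_mismatch((1, 125584, 32), 8037376)
print(expected, ratio)  # 4018688 2.0
```

Printing the `.shape` of `tgt`, `memory`, `query_pos` and `pos` right before the `self.multihead_attn` call should show which input has the extra factor of two.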

cxylotus commented 6 months ago

I ran into a similar error. After I modified the get_ref_imgs function in vid_multi.py, the error went away. In my view, the cause is that self.frame_seglen[idx] is indexed out of range. I hope this helps.
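
The exact change to get_ref_imgs is not given in this thread, so the following is only a sketch of one way to guard against the out-of-range index described above. Everything here is an assumption: the helper names `safe_seg_len` and `sample_ref_frame_ids`, their parameters, and the idea that `frame_seglen` is a plain list of per-sample segment lengths. It is not the code from vid_multi.py.

```python
import random


def safe_seg_len(frame_seglen, idx):
    """Look up a segment length and fail loudly on a bad index (hypothetical)."""
    if not 0 <= idx < len(frame_seglen):
        raise IndexError(
            f"idx {idx} is outside frame_seglen (length {len(frame_seglen)})"
        )
    return frame_seglen[idx]


def sample_ref_frame_ids(frame_id, seg_len, num_refs, max_offset):
    """Pick reference frame ids clamped to [0, seg_len - 1] (hypothetical)."""
    if seg_len <= 0:
        raise ValueError("seg_len must be positive")
    lo = max(0, frame_id - max_offset)
    hi = min(seg_len - 1, frame_id + max_offset)
    if lo > hi:
        # frame_id itself lies outside the segment; fall back to the last frame.
        lo = hi = min(max(frame_id, 0), seg_len - 1)
    return [random.randint(lo, hi) for _ in range(num_refs)]


# Assumed toy data: three samples with different segment lengths.
frame_seglen = [10, 4, 25]
seg_len = safe_seg_len(frame_seglen, 1)
ref_ids = sample_ref_frame_ids(frame_id=3, seg_len=seg_len, num_refs=2, max_offset=5)
```

Raising on a bad `idx` instead of silently clamping it makes a mismatch between the dataset index and `frame_seglen` visible at load time, rather than later as a shape error inside the transformer.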