Why is the shape of global output different during training and evaluation for the same input video?
I have passed the same vid input to train.py and test.py, but the global output shape differs for both of these cases.
_vid_id : 01KML
global_output (train): torch.Size([38, 1936])
global_output (test): torch.Size([272, 1936])
I am unable to understand this behavior. Could you please help me with this?
Sorry for the late reply (GitHub did not forward the issue to my email). We do some temporal downsampling during training, while testing is conducted over the full video sequence, so the test output keeps one row per frame.
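For illustration, here is a minimal sketch of what such train-time temporal downsampling could look like. The function name `sample_frames` and the uniform-spacing strategy are assumptions for the example, not the repo's actual implementation:

```python
import torch

def sample_frames(features: torch.Tensor, num_samples: int, training: bool) -> torch.Tensor:
    """Subsample frame features along the time axis during training;
    return the full sequence at evaluation time."""
    num_frames = features.shape[0]
    if not training or num_frames <= num_samples:
        # evaluation: keep every frame of the video
        return features
    # training: pick uniformly spaced frame indices
    idx = torch.linspace(0, num_frames - 1, steps=num_samples).long()
    return features[idx]

feats = torch.randn(272, 1936)                      # full video: 272 frames
train_out = sample_frames(feats, 38, training=True)  # -> torch.Size([38, 1936])
test_out = sample_frames(feats, 38, training=False)  # -> torch.Size([272, 1936])
```

This would explain the shapes you observed: 38 sampled frames at train time versus all 272 frames at test time, with the feature dimension (1936) unchanged.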