arunos728 / MotionSqueeze

Official PyTorch Implementation of MotionSqueeze, ECCV 2020
BSD 2-Clause "Simplified" License

Is the number of frames used in training and testing the same? #9

Closed (liming-ai closed this issue 2 years ago)

liming-ai commented 3 years ago

Thanks for your contribution.

In your paper, 8 or 16 frames are used for training, but it does not say how many frames are used for testing.

So my question is: does the number of frames in the figure refer to training or to testing? In other words, should the number of frames be the same for training and testing?

[image attachment]

arunos728 commented 3 years ago

Yes, we used the same number of frames for testing MSNet on Something V1 & V2 as for training. Except for the last MSNet model (55.1% / 84.0% on Something V1), we infer only a single clip at test time. You can check the number of clips in Table 1.
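
For readers unfamiliar with single-clip testing, here is a minimal sketch of how one clip with the same number of frames as in training might be sampled from a video. It assumes TSN/TSM-style segment sampling (one frame from the center of each equal-length segment); the function name `sample_clip_indices` and its parameters are hypothetical and are not taken from the MotionSqueeze code.

```python
def sample_clip_indices(num_video_frames, num_segments=8):
    """Pick one frame index per segment, at the center of each segment.

    Assumption: the video is split into `num_segments` equal-length
    segments and a single clip is built from their center frames.
    """
    if num_video_frames < 1 or num_segments < 1:
        raise ValueError("num_video_frames and num_segments must be positive")
    tick = num_video_frames / num_segments  # segment length in frames
    # Clamp to the last valid index as a guard against rounding.
    return [
        min(int(tick / 2.0 + tick * i), num_video_frames - 1)
        for i in range(num_segments)
    ]


# One 8-frame clip from a 40-frame video
print(sample_clip_indices(40, 8))  # [2, 7, 12, 17, 22, 27, 32, 37]
```

Using the same `num_segments` at test time as during training (8 or 16) keeps the clip length consistent between the two.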

liming-ai commented 3 years ago

Thanks for your reply. Actually, I cannot reproduce the accuracy reported in your paper. My environment is the same as yours, but my ResNet-18 top-1 accuracy on Something-V1 is 44.5%. I have tried many times but still cannot reach 46%. Could you please give some advice? (I did not change anything.)

arunos728 commented 3 years ago

What is the accuracy of your TSM baseline model? (Something-V1)

liming-ai commented 3 years ago

44.5% on resnet18 and 48.9% on resnet50

arunos728 commented 3 years ago

I mean the accuracy of the TSM baseline without the MS module. You can train the TSM baseline by setting flow_estimation = 0 (models.py, line 64). If there is no accuracy gap between the TSM baseline and MSNet, the problem could be with the 'Spatial Correlation Sampler'.
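
One possible way to check a correlation layer in isolation is to compare its output on a tiny input against a naive reference. The sketch below is a pure-Python local correlation between two feature maps. The function name `local_correlation`, the zero handling of out-of-bounds positions, and the ordering of displacements are assumptions made for illustration; the actual conventions of 'Spatial Correlation Sampler' (normalization, output layout) may differ and should be checked against its documentation.

```python
def local_correlation(feat1, feat2, max_disp=1):
    """Naive local correlation of two feature maps given as [C][H][W] nested lists.

    Returns corr[dy_idx][dx_idx][y][x]: the channel-wise dot product of
    feat1 at (y, x) with feat2 at (y + dy, x + dx), for dy and dx in
    [-max_disp, max_disp]. Out-of-bounds positions yield 0 (assumption).
    """
    channels = len(feat1)
    height = len(feat1[0])
    width = len(feat1[0][0])
    disps = range(-max_disp, max_disp + 1)
    corr = []
    for dy in disps:
        row = []
        for dx in disps:
            plane = [[0.0] * width for _ in range(height)]
            for y in range(height):
                for x in range(width):
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < height and 0 <= x2 < width:
                        plane[y][x] = sum(
                            feat1[c][y][x] * feat2[c][y2][x2]
                            for c in range(channels)
                        )
            row.append(plane)
        corr.append(row)
    return corr


# Tiny example: one channel, 2x2 spatial size, correlating a map with itself
f = [[[1, 2], [3, 4]]]
corr = local_correlation(f, f, max_disp=1)
print(corr[1][1])  # zero displacement: squared values, [[1, 4], [9, 16]]
```

If a correlation layer returns all zeros or clearly different values on such an input, that would support the suspicion that the extension is not built or working correctly in the local environment.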