Closed by RERV 3 years ago
Hi! Sorry for the late reply. I still recommend you finetune on your own version of the dataset/settings, because a tiny difference could reduce the final accuracy and it's hard to debug. But if you really need one, here it is: http://www.robots.ox.ac.uk/~htd/memdpc/ft_ucf101_224_resnet34_memdpc.pth.tar
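For anyone picking up the checkpoint: a `.pth.tar` file like this can be inspected with `torch.load` before restoring it into a model. The dict layout below (a `'state_dict'` key) is an assumption for illustration, so check `ckpt.keys()` on the real file first. A minimal self-contained sketch:

```python
import torch

# Build and save a dummy checkpoint in the same .pth.tar style so the
# loading step below runs without downloading anything. The 'state_dict'
# key is an ASSUMED layout -- inspect the keys of the real file.
dummy = {'state_dict': {'fc.weight': torch.zeros(101, 256)}}
torch.save(dummy, 'dummy_memdpc.pth.tar')

# Load on CPU and look at what is inside before calling load_state_dict.
ckpt = torch.load('dummy_memdpc.pth.tar', map_location='cpu')
print(sorted(ckpt.keys()))                    # ['state_dict']
print(tuple(ckpt['state_dict']['fc.weight'].shape))  # (101, 256)
```

`map_location='cpu'` avoids errors when the checkpoint was saved from a GPU run.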
@TengdaHan did you filter out too-short videos in test mode? What is the accuracy of the checkpoint (ft_ucf101_224_resnet34_memdpc.pth.tar) you provided? I just refactored your code and want to verify that my implementation is correct.
This exact checkpoint gets:
CenterCrop: Acc@1: 0.7673 Acc@5: 0.9280
FiveCrop: Acc@1: 0.7801 Acc@5: 0.9323
TenCrop: Acc@1: 0.7811 Acc@5: 0.9366
It's also the 78.1% reported in Table 2 of the paper. Small variations are possible after you refactor the code.
I didn't filter out too-short videos; I pad short videos with their last frame up to the required length -- but this barely affects the results.
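The padding described above can be sketched as follows. This is a pure-Python illustration, not the repo's actual loader code; the function name and `required_len` parameter are hypothetical:

```python
def pad_with_last_frame(frames, required_len):
    """Repeat the final frame until the clip reaches required_len.

    `frames` is any non-empty list of frames (arrays, file paths, ...).
    Clips that are already long enough are returned unchanged.
    """
    if len(frames) >= required_len:
        return frames
    return frames + [frames[-1]] * (required_len - len(frames))

# A 3-frame clip padded to the 5 frames a model expects:
clip = ['f0', 'f1', 'f2']
print(pad_with_last_frame(clip, 5))  # ['f0', 'f1', 'f2', 'f2', 'f2']
```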
@TengdaHan thank you for your reply. I found a small difference between model_lc.py and model.py -- which one is the correct version? The train and eval implementations are not equivalent.
I modified the 2D3D-ResNet backbone so that the output feature is taken before the final ReLU: https://github.com/TengdaHan/MemDPC/blob/11f03299496c55d3ecae670752e958d8ce0c80fb/backbone/resnet_2d3d.py#L251
This is because during pretraining we contrast predicted features (with a scale of (-inf, inf)) with ground-truth features. Removing the final ReLU keeps the ground-truth features on the same (-inf, inf) scale.
Although other methods (MoCo, SimCLR, etc.) can use a prediction head to avoid this scaling issue, we didn't use a prediction head for the ground-truth features in this paper.
In the evaluation stage for the action-classification task, we add the ReLU back, followed by a linear layer. Nothing special.
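The two stages described above can be sketched like this. It is a toy illustration with made-up layer sizes, not the actual MemDPC code: the backbone output is taken before any final ReLU so pretraining targets stay unbounded, and the classification head re-applies ReLU before the linear classifier.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "backbone" whose output is taken BEFORE any final ReLU, so
# features span (-inf, inf) and match the scale of the predicted
# features they are contrasted with during pretraining.
backbone = nn.Linear(16, 8)          # stand-in for the 2D3D-ResNet

# Evaluation head for action classification: add the ReLU back,
# then a linear classifier (101 classes for UCF101).
head = nn.Sequential(nn.ReLU(), nn.Linear(8, 101))

x = torch.randn(4, 16)
feat = backbone(x)                   # unbounded pretraining feature
logits = head(feat)                  # classification logits

print(feat.min().item() < 0)         # pre-ReLU features can be negative
print(tuple(logits.shape))           # (4, 101)
```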
Let me know if unclear.
Hi, I noticed that you released two pretrained weights. However, it still takes a long time to finetune. Could you release the final trained model weights so we can test on them easily? Thanks!