TengdaHan / MemDPC

[ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
Apache License 2.0

Config for frozen fine-tuning #5

Closed YuqiHUO closed 4 years ago

YuqiHUO commented 4 years ago

Hi, thanks for the great work. I just wonder if you could provide the configuration for the frozen fine-tuning you mentioned in Table 2. How many epochs and what learning rate did you use? Thank you!

chenbiaolong commented 4 years ago

@TengdaHan I would like to know the detailed configuration to reproduce the accuracy on the UCF101 dataset. Any plan to release the training scripts?

TengdaHan commented 4 years ago

Sorry for the delay. I just updated some evaluation code for your reference; see https://github.com/TengdaHan/MemDPC/tree/master/eval

Let me know if there are any problems.
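For anyone else landing here: below is a minimal sketch of what the frozen (linear-probe) setting generally looks like in PyTorch. It is not the repo's exact eval script; the `backbone`, feature dimension, and hyperparameters are placeholders for illustration only.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained video encoder; the real MemDPC backbone differs.
backbone = nn.Sequential(
    nn.Conv3d(3, 64, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
)
classifier = nn.Linear(64, 101)  # e.g. UCF101 has 101 classes

# Frozen fine-tuning: no gradients into the backbone, BN statistics kept fixed.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# Only the linear classifier is optimized.
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

video = torch.randn(2, 3, 16, 112, 112)   # (batch, channels, frames, height, width)
labels = torch.randint(0, 101, (2,))

with torch.no_grad():
    feat = backbone(video)                # features from the frozen encoder
logits = classifier(feat)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```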

YuqiHUO commented 4 years ago

Why do you use dropout = 0.9 in MemDPC but 0.5 in your previous work DPC? I found that most video self-supervised works use dropout = 0.5; did you find something interesting?

TengdaHan commented 4 years ago

I found that dropout = 0.9 when fine-tuning the entire network gives much better classification accuracy (I experimented with dropout = 0.5, 0.7, and 0.9). We re-evaluated the DPC weights under the exact same setting for Table 1 of the MemDPC paper, and they also get better results.
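To make the setting concrete, here is a rough sketch (not the repo's exact code) of where such an aggressive dropout rate would sit in the classification head during full fine-tuning; `feature_dim` and `num_classes` are placeholders:

```python
import torch.nn as nn

feature_dim, num_classes = 256, 101  # hypothetical values for illustration

# High dropout right before the final FC layer, as discussed above.
head = nn.Sequential(
    nn.Dropout(p=0.9),
    nn.Linear(feature_dim, num_classes),
)
```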

Just curious, can you let me know which work(s) use dropout = 0.5? I found that many papers don't mention this detail.

YuqiHUO commented 4 years ago

Thanks. https://arxiv.org/pdf/2003.02692 (PSP), https://arxiv.org/pdf/2008.03800 (CVRL), CMC, and VCOP used dropout = 0.5. I have tried 0.5 vs. 0.9 in my code and found that 0.5 performs much better than 0.9, so I'm very curious about your config. Maybe it's because you use final_bn and other methods (including mine) don't.

TengdaHan commented 4 years ago

On my side, fine-tuning without the final_bn still works better with dropout = 0.9. I think it's because UCF101 and HMDB51 are small and it is very easy to overfit the training set. The effect of large dropout shows up when you train the classifier for longer. Thank you for letting me know.
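For context, my understanding is that final_bn refers to a BatchNorm layer applied to the pooled features just before the classifier. A sketch of the two head variants being compared, again with placeholder dimensions rather than the repo's exact layers:

```python
import torch.nn as nn

feature_dim, num_classes = 256, 101  # hypothetical values for illustration

# Variant with final_bn: normalize pooled features before dropout + linear classifier.
head_with_bn = nn.Sequential(
    nn.BatchNorm1d(feature_dim),
    nn.Dropout(p=0.9),
    nn.Linear(feature_dim, num_classes),
)

# Variant without final_bn, as in the comparison discussed above.
head_without_bn = nn.Sequential(
    nn.Dropout(p=0.9),
    nn.Linear(feature_dim, num_classes),
)
```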