TengdaHan / DPC

Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.
MIT License

Experimental details of Table 1? #11

Closed · jiujing23333 closed this issue 4 years ago

jiujing23333 commented 4 years ago

Hi, Tengda. I'm trying to reproduce your promising result on the smaller UCF101 dataset. Can you provide the hyperparameter settings for Table 1 in your paper, such as input size, training epochs, etc.? Thanks very much.

TengdaHan commented 4 years ago

Hi. A note up front: this experiment is just a proof of concept, and pretraining and finetuning on the same dataset is not the right way to do self-supervised learning. IMO, the right way is always to utilize more unlabelled data to show the quality of the method.

We tried slightly different settings multiple times and got similar performance. Here is just one example that reproduces the Table 1 DPC result.

For DPC training on UCF101 (dpc/):
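A minimal sketch of what such a run looks like, using the CLI flags from the repo's README (`--net`, `--dataset`, `--batch_size`, `--img_dim`, `--epochs`); the specific values below are illustrative assumptions, not the exact settings from this comment:

```bash
# Hypothetical DPC self-supervised training run on UCF101, launched from dpc/.
# Flags follow the repo's main.py CLI; values are placeholders, not the
# confirmed Table 1 hyperparameters.
cd dpc/
python main.py --gpu 0,1 --net resnet18 --dataset ucf101 \
  --batch_size 128 --img_dim 128 --epochs 300
```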

For UCF101 finetuning (eval/):
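Again a sketch under the same assumptions, this time with the finetuning flags from the repo's README (`--train_what ft` and `--pretrain`); the checkpoint path and values are placeholders:

```bash
# Hypothetical supervised finetuning run on UCF101, launched from eval/.
# --pretrain points at the DPC checkpoint produced by the step above;
# the path and values are illustrative, not the confirmed settings.
cd eval/
python test.py --gpu 0,1 --net resnet18 --dataset ucf101 \
  --batch_size 128 --img_dim 128 --epochs 300 \
  --train_what ft --pretrain /path/to/dpc_checkpoint.pth.tar
```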

Good luck & have fun!

jiujing23333 commented 4 years ago

@TengdaHan Thanks for your kind reply. I just want to do a quick, simple experiment, because training on the Kinetics400 dataset is so time-consuming (from 1+ week up to 6 weeks). What do you think is the bottleneck in the training time?

TengdaHan commented 4 years ago

Kinetics400 at 128x128 resolution with a 3D-ResNet18 can give a good feature in less than 1 week (maybe 2-3 days). The bottleneck in training time is always GPU runtime (GPU utilization is 100% in my training), as long as you read the video frames from an SSD.
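One quick way to check this on your own machine, assuming an NVIDIA setup with `nvidia-smi` available: watch GPU utilization while training runs in another shell. If it sits well below 100%, data loading (disk reads or frame decoding) is the likely bottleneck rather than the GPU itself.

```bash
# Poll GPU utilization and memory once per second during training.
# Sustained utilization well below 100% usually points to a data-loading
# bottleneck (slow disk or frame decoding) rather than GPU compute.
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```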

n-behrmann commented 4 years ago

Hi Tengda, thanks for sharing your code and for the detailed reply to this issue, I found it very helpful! However, there is one thing I'm wondering about. I could reproduce the 60.2% of the DPC method using the hyperparameters mentioned here, but when I ran the same finetuning code (same hyperparameters) from random initialization, I got 54.4% instead of the 46.5% reported in the paper. Is there something I might be missing? Thanks in advance!

TengdaHan commented 4 years ago

Hi, thanks for mentioning this. Your experimental setting and results are correct. At the time of submission, we hadn't trained the random-initialization baseline for long enough, partially because the 46.5% we obtained already matched previous random-initialization baselines (e.g. Hara et al. 2018). We have now realized this issue, and we will release a public benchmark very soon. Stay tuned :)

n-behrmann commented 4 years ago

Hi Tengda, thank you for the quick reply! That's very helpful :)