Finetuning UCF101 head on pretrained kinetics400 rgb model

sebastianlutter commented 3 years ago

I fine-tuned the pretrained Kinetics-400 RGB-stream base model you kindly provided, and the results are not what I expected. The project code runs without issues. The only change I made to your code is the fix for the numpy flattening issue from https://github.com/TengdaHan/MemDPC/issues/11.

It would be great if you could give your opinion on what I did and the results I got.

Used python3.7:

torch==1.4.0
tensorboardX==2.2
opencv-python==4.5.1.48
joblib==1.0.1
tqdm==4.59.0
matplotlib==3.3.4
torchvision==0.5.0
pandas==1.1.5
numpy==1.19.5
opencv_contrib_python==4.5.1.48

Downloaded and placed UCF101
Executed process_data/src/extract_ff.py to extract frames (with the flow-feature code commented out, so only RGB frames)
Executed process_data/src/write_csv.py to generate the csv files (clip path and frame count)
Added the file process_data/data/ucf101/ClassInd.txt containing a sorted list of the 101 labels
Fine-tuned (with batch_size = 8 because only 11 GB of GPU memory is available)

python3.7 test.py --gpu 0 --net resnet34 --dataset ucf101 --batch_size 8 --img_dim 224 --epochs 500 --train_what ft --pretrain ../pretrained/k400-rgb-224_resnet34_memdpc.pth.tar

[ . . . ]

Epoch: [499][855/856]   Loss 0.7534 (0.6072)    Acc: 0.7500 (0.8000)    T-data:0.00 T-batch:1.13
Epoch: [499]    T-epoch:973.63
100%|██████████████████████████████████████████████████████████████████████████████████| 99/99 [00:42<00:00,  2.32it/s]
Loss 1.2451     Acc: 0.6982
Training from ep 0 to ep 500 finished

Ran the test set evaluation on the best model from the fine-tuning run

MODEL="log_tmp/ucf101-224-sp1_resnet34_lc_bs8_lr0.001_wd0.001_ds3_seq8_len5_dp0.9_train-ft_pt=..-pretrained-k400-rgb-224_resnet34_memdpc.pth.tar/model/model_best_epoch494.pth.tar"
python3.7 test.py --gpu 0 --net resnet34 --dataset ucf101 --center_crop --img_dim 224 --test "${MODEL}"

[ . . . ]

=> loaded testing checkpoint 'log_tmp/ucf101-224-sp1_resnet34_lc_bs8_lr0.001_wd0.001_ds3_seq8_len5_dp0.9_train-ft_pt=..-pretrained-k400-rgb-224_resnet34_memdpc.pth.tar/model/model_best_epoch494.pth.tar' (epoch 494)

100%|████████████████████████████████████████████████████████████████████████████| 2658/2658 [00:00<00:00, 3867.80it/s]
Mean: Acc@1: 0.4105 Acc@5: 0.7129

From the results in your paper I expected an Acc@1 of around 0.70, but I got 0.41. I'm unsure whether the top-1 accuracy in Table 2 of your paper (MemDPC, K400 (28d), Res. 224, Arch. R-2D3D, depth 33, Modality V --> UCF 78.1%) was obtained with RGB or flow features. Either way, my results should be at least as good as the C2 variant in Table 1 (full training on UCF101 at 128 image dims with RGB input and memory size 1024 --> top-1 of 68.2).

What is the expected accuracy for the training I did?

TengdaHan commented 3 years ago

Hmm, very strange.

I can see in the training output that the validation top-1 accuracy at the 500th epoch is 69.82%, which means the model itself is alright and should not drop to 41% accuracy in testing mode. Sorry, I don't have time to investigate your model in detail, but I suggest checking whether the code in testing mode differs in any way from the validation mode.
We apply test-time augmentation and report ten-crop accuracy; try adding ... --test {your_model} --ten_crop when testing. Performance is usually 10-crop > 5-crop > center-crop, so I would expect your model to reach about 74%-77% accuracy with five-crop or ten-crop.
Did you reduce the lr during fine-tuning? Try adding --schedule 300 400 to the training command. My bad, I didn't set it as the default.
Our Table 2 UCF 78.1% is the RGB-only model (trained on RGB, with inference on RGB only); the UCF 86.1% is the two-stream result.
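
To make two of the suggestions above concrete, here is a minimal sketch in plain Python. It is not the MemDPC code: `lr_at_epoch` assumes that `--schedule 300 400` means a step decay at those epochs, with a decay factor of 0.1 that is my assumption, and `multi_crop_predict` shows one common way to combine crops, by averaging per-crop softmax scores. The base lr of 0.001 is taken from the `lr0.001` in the log directory name; all function names are hypothetical.

```python
import math


def lr_at_epoch(epoch, base_lr=1e-3, milestones=(300, 400), gamma=0.1):
    """Step decay: scale base_lr by gamma once for every milestone reached.

    gamma=0.1 is an assumed decay factor, not a value taken from the repository.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** passed)


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multi_crop_predict(crop_logits):
    """Average class probabilities over all crops of one video.

    crop_logits holds one list of class scores per crop (10 lists for ten-crop).
    Returns (predicted_class_index, mean_probabilities).
    """
    probs = [softmax(logits) for logits in crop_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__), mean


if __name__ == "__main__":
    # Learning rate before, at, and after the two milestones.
    print([lr_at_epoch(e) for e in (0, 299, 300, 400, 499)])

    # Toy example with 3 crops and 3 classes: the first crop alone narrowly
    # favours class 0, but the average over all crops favours class 1.
    crops = [[2.0, 1.9, 0.1], [0.2, 3.0, 0.1], [1.0, 2.5, 0.0]]
    pred, mean = multi_crop_predict(crops)
    print(pred, mean)
```

Averaging over crops smooths out a single unlucky crop, which is why multi-crop numbers are usually somewhat higher than center-crop ones. It would not by itself explain a gap as large as 0.70 versus 0.41.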

sebastianlutter commented 3 years ago

Thank you for the clarifications. I'll check all of that, run it again, and report my findings.

TengdaHan commented 3 years ago

OK. Re-open if you have more questions.

movingsheep commented 2 years ago

> Thank you for the clarifications. I'll check all that, run again and will report what my findings are

Hi Sebastian, I ran into the same issue: I got 0.45 with ten-crop. Did you fix the problem? What did you do?

sebastianlutter commented 2 years ago

No, I gave up at some point

movingsheep commented 2 years ago

Got it. That is strange. I used the same environment to run DPC and it works. I don't know what is wrong.

TengdaHan / MemDPC

Finetuning UCF101 head on pretrained kinetics400 rgb model #13