MCG-NJU / TDN

[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
https://arxiv.org/abs/2012.10071
Apache License 2.0

Can't reproduce kinetic400 result #49

Closed: sean186 closed this issue 2 years ago

sean186 commented 2 years ago

Hello, I've trained on Kinetics-400 with this command:

python -m torch.distributed.launch --master_port 12347 --nproc_per_node=5 \
        main.py  kinetics RGB --arch resnet50 --num_segments 8 --gd 20 --lr 0.02 \
        --lr_scheduler step  --lr_steps 20 45 60 --epochs 70 --batch-size 24 \
        --wd 1e-4 --dropout 0.5 --consensus_type=avg --eval-freq=1 -j 10 --npb 

and got Prec@1 73.77 and Prec@5 91.1. With num_segments=16 I got Prec@1 75.37878, which still falls short of the reported Kinetics-400 results (76.6% and 77.5%). Issue #33 seems to report the same accuracy. Maybe my training data is inconsistent with yours; could you share your training data file or log? Or is there a problem with my training? I hope for your kind reply. Thank you.

yztongzhan commented 2 years ago

Hi @sean186, our command is:

python -m torch.distributed.launch --master_port 12347 --nproc_per_node=8 \
        main.py  kinetics RGB --arch resnet50 --num_segments 8 --gd 20 --lr 0.02 \
        --lr_scheduler step  --lr_steps 50 75 90 --epochs 100 --batch-size 16 \
        --wd 1e-4 --dropout 0.5 --consensus_type=avg --eval-freq=1 -j 4 --npb 

The lr_steps are 50 75 90 and the total number of epochs is 100. After training, we got 74.0% Top-1 accuracy under the validation scheme (one clip, center crop), and with the 30-view (3 crops, 10 clips) testing scheme we finally got 76.6% Top-1 accuracy.
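
To make the difference between the two commands in this thread concrete, here is a minimal sketch that compares them. It assumes that --batch-size is the per-process batch size and that the step scheduler multiplies the learning rate by 0.1 at each epoch listed in --lr_steps; neither is confirmed in this thread, and the helper names are hypothetical.

```python
def step_lr(base_lr, lr_steps, epoch, gamma=0.1):
    """Learning rate at `epoch` under step decay.

    gamma=0.1 is an assumed decay factor, not taken from the thread.
    """
    num_decays = sum(epoch >= s for s in lr_steps)
    return base_lr * gamma ** num_decays


# Values copied from the two commands in this thread.
runs = {
    "issue_author": {"gpus": 5, "batch_per_gpu": 24,
                     "lr_steps": [20, 45, 60], "epochs": 70},
    "maintainers": {"gpus": 8, "batch_per_gpu": 16,
                    "lr_steps": [50, 75, 90], "epochs": 100},
}


def summarize(cfg, base_lr=0.02):
    """Return (global batch size, epochs at base lr, lr in the last epoch)."""
    # Assumption: --batch-size is per process, so the global batch is the product.
    global_batch = cfg["gpus"] * cfg["batch_per_gpu"]
    epochs_at_base_lr = cfg["lr_steps"][0]
    final_lr = step_lr(base_lr, cfg["lr_steps"], cfg["epochs"] - 1)
    return global_batch, epochs_at_base_lr, final_lr


for name, cfg in runs.items():
    print(name, summarize(cfg))
```

Under these assumptions the first run uses a global batch of 120 with 20 epochs at the base learning rate, while the second uses a global batch of 128 with 50 epochs at the base learning rate.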

sean186 commented 2 years ago

Thank you for your reply. I tried the 30-view (3 crops, 10 clips) testing scheme and got 77.09% Top-1 accuracy when using --num_segments 16.
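
For readers unfamiliar with multi-view testing, the sketch below illustrates why a 30-view score can exceed the single-view validation score: the per-view predictions (3 crops x 10 clips) are combined before taking the argmax. It assumes the views are combined by averaging softmax probabilities; whether the TDN test script averages probabilities or raw scores is not stated in this thread, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one view's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_views(view_logits):
    """Average class probabilities over all views and return the top class.

    view_logits: one list of class scores per view
    (30 lists for the 3-crop x 10-clip scheme).
    Assumption: probabilities, not raw scores, are averaged.
    """
    probs = [softmax(v) for v in view_logits]
    num_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]
    return max(range(num_classes), key=mean.__getitem__)


# Toy example with 3 views and 3 classes (made-up scores, not real model output).
views = [
    [2.0, 1.0, 0.0],  # this view alone would predict class 0
    [0.0, 3.0, 0.0],
    [0.5, 2.0, 0.0],
]
print(aggregate_views([views[0]]))  # single-view prediction
print(aggregate_views(views))       # multi-view prediction
```

In the toy data the first view alone picks class 0, while the average over all three views picks class 1.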