facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

How to reproduce the Rev-ViT validation accuracy #618

Open tonysy opened 2 years ago

tonysy commented 2 years ago

Hi, could you provide the script to reproduce the Rev-ViT validation accuracy?

I used the following command but could not get the correct accuracy.

python tools/run_net.py --cfg configs/ImageNet/REV_VIT_S.yaml \
DATA.PATH_TO_DATA_DIR data/imagenet/ \
TRAIN.ENABLE False \
TEST.CHECKPOINT_FILE_PATH ckpt/REV_VIT_S.pyth

I can run the above command only after commenting out the following lines: https://github.com/facebookresearch/SlowFast/blob/5b5d9ecb15a54188943af0cbf5f7c420d8409018/tools/test_net.py#L131-L135 and https://github.com/facebookresearch/SlowFast/blob/5b5d9ecb15a54188943af0cbf5f7c420d8409018/tools/test_net.py#L225-L229

The final output is:

[10/21 01:48:30][WARNING] meters.py:  390: clip count Ids=tensor([[   0,    1,    2,  ..., 1663, 1664, 1665]]) = tensor([0, 0, 0,  ..., 0, 0, 0]) (should be 30)
[10/21 01:48:30][INFO] logging.py:   99: json_stats: {"split": "test_final", "top1_acc": "0.00", "top5_acc": "100.00"}
[10/21 01:48:30][INFO] test_net.py:  261: Finalized testing with 10 temporal clips and 3 spatial crops
[10/21 01:48:30][INFO] test_net.py:  283: _p22.43_f4.58_10a0.00 Top5 Acc: 100.00 MEM: 1.52 f: 4.5842

Could you give some hints? Thanks @haooooooqi @karttikeya

tonysy commented 2 years ago

@karttikeya Hi, I sent you an email about this issue but got no response. Do you have any plans to address it? @haooooooqi @lyttonhao Could you follow this issue and provide some suggestions?

karttikeya commented 2 years ago

Hi @tonysy,

Thanks for your interest in our work. The issue arises from the use of the test function. Since ImageNet has only a public validation set (and no test set), we use the training+validation setting for validation.
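As a side note on the warning in the log above: the "(should be 30)" appears to match the 10 temporal clips times 3 spatial crops reported in the "Finalized testing" line, while every recorded count is 0. The toy check below only illustrates that arithmetic. The helper names `expected_views` and `incomplete_items` are hypothetical and are not taken from meters.py.

```python
# Toy illustration of the clip-count check suggested by the warning.
# Both helpers are hypothetical; they are NOT the SlowFast implementation.

def expected_views(num_temporal_clips, num_spatial_crops):
    """Number of views each test item should be aggregated over."""
    return num_temporal_clips * num_spatial_crops


def incomplete_items(clip_counts, expected):
    """Return the indices of items whose view count differs from `expected`."""
    return [i for i, count in enumerate(clip_counts) if count != expected]


# Values taken from the log: 10 temporal clips and 3 spatial crops.
expected = expected_views(10, 3)

# In the log every item had a count of 0, so every item is flagged.
counts = [0, 0, 0]
print(expected, incomplete_items(counts, expected))
```

A count of 0 for every item would suggest that no predictions were accumulated at all, which would be consistent with the 0.00 top-1 accuracy, though that reading is an inference from the log rather than something confirmed here.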

If you wish only to run the eval, please simply comment out the call to train_epoch

https://github.com/facebookresearch/SlowFast/blob/5b5d9ecb15a54188943af0cbf5f7c420d8409018/tools/train_net.py#L681-L690

and run the training+validation script as

python tools/run_net.py --cfg configs/ImageNet/REV_VIT_S.yaml \
DATA.PATH_TO_DATA_DIR data/imagenet/  \
TRAIN.CHECKPOINT_FILE_PATH path_to_rev_vit_s_checkpoint

It reproduces the result perfectly; I just tried it myself. [image]

We will try to directly support reversible validation in the future.
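For anyone applying this workaround, the effect of commenting out the training call can be pictured with the toy loop below. It is only a sketch under stated assumptions: `train_one_epoch`, `evaluate`, and `run_epochs` are hypothetical stand-ins and not functions from tools/train_net.py, and the real loop does considerably more than this.

```python
# Toy sketch of an epoch loop where the training step is skipped so that
# only validation runs. All names here are hypothetical stand-ins.

def train_one_epoch(state):
    """Stand-in for the training step; only counts how often it is called."""
    state["train_calls"] += 1


def evaluate(state):
    """Stand-in for the validation step; returns a stored accuracy value."""
    state["eval_calls"] += 1
    return state["accuracy"]


def run_epochs(state, start_epoch, max_epoch, eval_only=True):
    """Run the loop; with eval_only=True the training call is skipped,
    which mimics commenting it out in the source."""
    results = []
    for _epoch in range(start_epoch, max_epoch):
        if not eval_only:
            train_one_epoch(state)
        results.append(evaluate(state))
    return results


# The accuracy value is a placeholder, not a measured result.
state = {"train_calls": 0, "eval_calls": 0, "accuracy": 0.0}
print(run_epochs(state, start_epoch=0, max_epoch=2), state["train_calls"])
```

Because the loaded weights are never updated in this mode, every validation pass evaluates the same checkpoint.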

PS: I double-checked my inbox, but it seems I did not receive your email. No worries, I hope this solves the issue.

tonysy commented 2 years ago

Thanks for the reply, I will try your solution. One more question: can I use the configuration file configs/ImageNet/REV_VIT_S.yaml to train the model and reproduce the accuracy?

karttikeya commented 2 years ago

Yes.

Additionally, here is an 80.4 Rev-ViT-S model, trained a few days ago with configs/ImageNet/REV_VIT_S.yaml, along with the training log, so that it is easier to reproduce.

Model + Log: https://drive.google.com/drive/folders/1JYSpejA4ckOdEbbw3PFHrzT4YugiZTJU?usp=share_link

tonysy commented 2 years ago

Great, many thanks.

tonysy commented 1 year ago

> Yes.
>
> Additionally, here's a few days old 80.4 REV ViT S model trained with configs/ImageNet/REV_VIT_S.yaml along with the training log, so it is easier to reproduce.
>
> Model + Log: https://drive.google.com/drive/folders/1JYSpejA4ckOdEbbw3PFHrzT4YugiZTJU?usp=share_link

Hi @karttikeya, the accuracy at epoch 10 is very high in this log. Does that mean you loaded a trained model to get 80.4?

karttikeya commented 1 year ago

Hi @tonysy,

Thanks for noticing the erroneous run. I've retrained another model in the last few days with the current codebase and replaced the files on the same link with the correct log and Rev-ViT-S model with 79.75 acc. The model is trained from scratch.

Hopefully you've been able to reproduce the validation accuracy for the model provided in the model zoo with the aforementioned procedure.