facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

Correct config to reproduce ssv2 result on multigrid #256

Open mars747 opened 4 years ago

mars747 commented 4 years ago

Hi,

I'm trying to reproduce the multigrid result on ssv2. The config I'm using is: configs/SSv2/SLOWFAST_16x8_R50_multigrid.yaml

The pretrained checkpoint I'm using is: SLOWFAST_8x8_R50.pkl per: https://github.com/facebookresearch/SlowFast/blob/master/MODEL_ZOO.md

Are the config and checkpoint matched correctly? Per my understanding of the paper:

  1. The number of frames sampled by the Slow pathway as T.
  2. The key concept in our Slow pathway is a large temporal stride τ on input frames. The raw clip length is T × τ frames.

If the number of frames processed by the Slow pathway is different, then surely the config SLOWFAST_16x8_R50_multigrid cannot use an 8x8 pretrained model?
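To make the mismatch concrete, here is a minimal sketch of the clip-length arithmetic from the two quoted statements. Reading the "16x8" and "8x8" in the file names as T x τ is an assumption based on the naming, not something the thread confirms.

```python
def raw_clip_span(num_slow_frames: int, temporal_stride: int) -> int:
    """Raw clip length T * tau, in frames, as in the quoted paper excerpt."""
    return num_slow_frames * temporal_stride


# Assumption: "16x8" and "8x8" in the config/checkpoint names mean T x tau.
span_16x8 = raw_clip_span(num_slow_frames=16, temporal_stride=8)
span_8x8 = raw_clip_span(num_slow_frames=8, temporal_stride=8)

print(f"16x8 raw clip span: {span_16x8} frames")
print(f"8x8 raw clip span:  {span_8x8} frames")
```

Under that reading, the two models see clips of different temporal extent, which is the source of the question.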

With this mismatched configuration, I'm only able to achieve "min_top1_err": 43.042776, "min_top5_err": 15.907990

Thank you for your explanation!

chaoyuaw commented 4 years ago

Hi @mars747,

To obtain the multigrid SS-v2 models we released in MODEL_ZOO, we indeed used the released SLOWFAST_8x8_R50.pkl (77.0 top-1) model as the pre-trained model for simplicity. It works because 3D CNNs are fully convolutional, and empirically I find them not very sensitive to the number of frames and temporal strides, especially when fine-tuning is performed.
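As a rough illustration of the "fully convolutional" point, the toy sketch below applies one fixed temporal filter to clips of two different lengths and then pools globally. It is a pure-Python stand-in, not the SlowFast architecture; the kernel values and clip contents are hypothetical.

```python
from typing import List


def temporal_conv(frames: List[float], kernel: List[float]) -> List[float]:
    """Valid 1D convolution (cross-correlation) along the time axis."""
    k = len(kernel)
    return [
        sum(kernel[j] * frames[i + j] for j in range(k))
        for i in range(len(frames) - k + 1)
    ]


def global_avg_pool(features: List[float]) -> float:
    """Collapse a variable-length feature sequence to a single value."""
    return sum(features) / len(features)


# Hypothetical "pretrained" temporal filter; its shape does not depend on T.
kernel = [0.25, 0.5, 0.25]

clip_8 = [float(t) for t in range(8)]    # stands in for an 8-frame input
clip_16 = [float(t) for t in range(16)]  # stands in for a 16-frame input

# The same weights apply to both inputs; only the feature length changes.
feat_8 = temporal_conv(clip_8, kernel)
feat_16 = temporal_conv(clip_16, kernel)

# Global pooling removes the remaining dependence on the input length.
out_8 = global_avg_pool(feat_8)
out_16 = global_avg_pool(feat_16)
print(len(feat_8), len(feat_16), out_8, out_16)
```

Because no weight has a shape tied to the number of frames, the checkpoint loads regardless of T; whether the features transfer well is the empirical part.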

I'd guess that one possible source of the issue is the data. Were you able to run the data processing steps described in slowfast/datasets/DATASET.md successfully? It might also be worth double-checking that the frames you used are consistent with the "frame list".

mars747 commented 4 years ago

Thank you for the reply @chaoyuaw. I did follow the instructions and there were no errors (some warnings, though). I think the list is consistent, since the dataloader outputs warnings if files cannot be read, right?

I have the following training curve of training epoch top_1 error, does this look the same as yours?

[image: training curve of per-epoch top-1 error]

If the training curve is different, then the problem is more likely in the data. Otherwise, is there something wrong with my validation?

Thanks again!

chaoyuaw commented 4 years ago

Hi @mars747, thanks for providing more context. I don't have access to my learning curves at the moment, so I can't really tell whether yours matches what we had. Qualitatively, though, it looks reasonable to me. Have you tried running evaluation only (without training) using the final, trained model from MODEL ZOO? If that gives a different top-1, it might suggest data issues.

While we should indeed see warning messages if a frame is missing, other things could cause data-related issues, e.g. different frame rates, different frame extraction quality, etc. Maybe comparing the number of frames you have with the number of frames in the frame list, or eyeballing the frames themselves, might help?
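The frame-count comparison could be sketched roughly as below. The frame-list layout is an assumption here (a space-delimited file with a header row that includes a "path" column of the form `<video_dir>/<frame>.jpg`), and the helper names are hypothetical; adjust both to whatever your files actually look like.

```python
import csv
import os
from collections import Counter
from typing import Dict, Tuple


def count_listed_frames(frame_list_path: str) -> Dict[str, int]:
    """Count frame-list rows per video directory.

    Assumption: space-delimited file whose header row has a "path" column,
    with each path relative to the frames root ("<video_dir>/<frame>.jpg").
    """
    counts: Counter = Counter()
    with open(frame_list_path, newline="") as f:
        for row in csv.DictReader(f, delimiter=" "):
            counts[os.path.dirname(row["path"])] += 1
    return dict(counts)


def count_frames_on_disk(frames_root: str, ext: str = ".jpg") -> Dict[str, int]:
    """Count extracted frame files in each video sub-directory."""
    counts: Dict[str, int] = {}
    for entry in sorted(os.listdir(frames_root)):
        video_dir = os.path.join(frames_root, entry)
        if os.path.isdir(video_dir):
            counts[entry] = sum(
                1 for name in os.listdir(video_dir) if name.endswith(ext)
            )
    return counts


def find_mismatches(
    listed: Dict[str, int], on_disk: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return {video: (listed_count, disk_count)} wherever the counts differ."""
    videos = sorted(set(listed) | set(on_disk))
    return {
        v: (listed.get(v, 0), on_disk.get(v, 0))
        for v in videos
        if listed.get(v, 0) != on_disk.get(v, 0)
    }
```

An empty result from `find_mismatches` only rules out missing or extra frames; it says nothing about frame rate or image quality, which still need a visual check.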

Another debugging option is to turn off multigrid training and see whether you get the expected results.

youngwanLEE commented 4 years ago

@chaoyuaw @mars747 Hi,

I also tested SLOWFAST_16x8_R50_multigrid (not self-trained, just downloaded from the MODEL ZOO) and got the result below, which is lower than the reported result (63.5 | 88.7).

INFO] slowfast.utils.logging: 96: json_stats: {"split": "test_final", "top1_acc": "58.51", "top5_acc": "85.13"}

I followed the dataset preparation steps and extracted frames with the command ffmpeg -i "${video}" -r 30 -q:v 1 "${out_name}".

The total number of extracted frames is 25,209,271.

Is that the same as your total frame count?

I used 8 GPUs and didn't modify the config.yaml and it ran 1,549 iterations.

Since I don't see any warning messages, can I assume there is no problem with the data?

[INFO: ssv2.py: 70]: Constructing Something-Something V2 test...
[07/29 21:45:21][INFO] slowfast.datasets.ssv2: 70: Constructing Something-Something V2 test...
[INFO: ssv2.py: 155]: Something-Something V2 dataloader constructed (size: 24777) from /Data_ssd/sth-sth-v2/20bn-something-something-v2-SF-frames/val.csv
[07/29 21:45:30][INFO] slowfast.datasets.ssv2: 155: Something-Something V2 dataloader constructed (size: 24777) from /Data_ssd/sth-sth-v2/20bn-something-something-v2-SF-frames/val.csv
[INFO: test_net.py: 147]: Testing model for 1549 iterations
[07/29 21:45:30][INFO] test_net: 147: Testing model for 1549 iterations
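One sanity check on the log is the iteration count, which should follow from the dataset size and the number of views per video. The sketch below is only arithmetic: the total test batch size of 16, the single temporal view, and keeping the last partial batch are all assumptions, since none of them appears in the output above.

```python
import math


def expected_test_iterations(
    num_videos: int,
    num_temporal_views: int,
    num_spatial_crops: int,
    total_batch_size: int,
) -> int:
    """Iterations needed to cover every (video, view, crop) once,
    assuming the last partial batch is kept."""
    num_clips = num_videos * num_temporal_views * num_spatial_crops
    return math.ceil(num_clips / total_batch_size)


# 24777 comes from the dataloader log; the other values are assumptions.
iters_one_view = expected_test_iterations(
    num_videos=24777,
    num_temporal_views=1,
    num_spatial_crops=1,
    total_batch_size=16,
)
print(iters_one_view)
```

With these assumed values the count matches the logged 1549 iterations, which would be consistent with each video being evaluated on a single view.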

My Environment setting :

Ubuntu 16.04
python 3.7
pytorch 1.5.0
torchvision 0.6.0a0+82fd1c8

My command line :

python tools/run_net.py --cfg configs/SSv2/SLOWFAST_16x8_R50_multigrid.yaml DATA.PATH_TO_DATA_DIR /Data_ssd/sth-sth-v2/20bn-something-something-v2-SF-frames DATA.PATH_PREFIX /Data_ssd/sth-sth-v2/20bn-something-something-v2-SF-frames/ TEST.CHECKPOINT_FILE_PATH ./SLOWFAST_16x8_R50_multigrid.pkl TRAIN.ENABLE False TRAIN.CHECKPOINT_TYPE pytorch

mars747 commented 4 years ago

Hi @youngwanLEE, @chaoyuaw,

My test result using 4 GPUs is the following:

[INFO: logging.py: 96]: json_stats: {"split": "test_final", "top1_acc": "58.44", "top5_acc": "85.14"}
[07/24 18:08:42][INFO] slowfast.utils.logging: 96: json_stats: {"split": "test_final", "top1_acc": "58.44", "top5_acc": "85.14"}

I'm using the same ffmpeg command as well. I believe the number of images I have is 25,209,274.

youngwanLEE commented 4 years ago

@mars747

Did you solve this problem?

youngwanLEE commented 4 years ago

@chaoyuaw

I'm sorry to bother you,

Would you mind checking the accuracy of the SlowFast_16x8_R50_on_SSv2_standard model?

I tried to reproduce the reported result (38.9) several times.

I regenerated the SSv2 frames several times but got the following result: [image]

I also followed your data instructions.

Thanks for your help in advance.

mars747 commented 4 years ago

> @mars747
>
> Did you solve this problem?

@youngwanLEE, I did not investigate further and have since started exploring other directions.

haooooooqi commented 4 years ago

Hi @youngwanLEE,

I'll take a look and keep you updated; thanks for sharing the info. @chaoyuaw, I'll take it from here. Thanks,

youngwanLEE commented 4 years ago

@mars747 @chaoyuaw @takatosp1

By using TEST.NUM_SPATIAL_CROPS 3 instead of 1, I got 63.93/88.16.

Didn't the reported result (63.0/88.5) come from using NUM_SPATIAL_CROPS 3?
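For context on why the crop count can move top-1 this much, here is a minimal sketch of multi-crop testing: per-class scores from several spatial crops of the same video are averaged before taking the argmax. This illustrates the general technique with made-up scores; it is not taken from the PySlowFast test code.

```python
from typing import List, Sequence


def aggregate_views(view_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores over all views (crops) of one video."""
    num_views = len(view_scores)
    num_classes = len(view_scores[0])
    return [
        sum(view[c] for view in view_scores) / num_views
        for c in range(num_classes)
    ]


def top1(scores: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda c: scores[c])


# Hypothetical softmax scores for a 3-class toy problem.
center_only = [[0.5, 0.4, 0.1]]
three_crops = [
    [0.2, 0.7, 0.1],  # left crop
    [0.5, 0.4, 0.1],  # center crop
    [0.3, 0.6, 0.1],  # right crop
]

pred_one_crop = top1(aggregate_views(center_only))
pred_three_crops = top1(aggregate_views(three_crops))
print(pred_one_crop, pred_three_crops)
```

In this toy case the center crop alone picks class 0, while the three-crop average picks class 1. Summing instead of averaging gives the same argmax.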

tonysy commented 3 years ago

@youngwanLEE I got results similar to the ones you reported.

For NUM_SPATIAL_CROPS=1:

json_stats: {"split": "test_final", "top1_acc": "57.88", "top5_acc": "84.13"}

For NUM_SPATIAL_CROPS=3:

json_stats: {"split": "test_final", "top1_acc": "63.92", "top5_acc": "88.15"}