kenshohara / 3D-ResNets-PyTorch

3D ResNets for Action Recognition (CVPR 2018)
MIT License

About the accuracy of 94.5% on UCF101 using ResNeXt-101? #80

Open Wei2Huang opened 6 years ago

Wei2Huang commented 6 years ago

Hello, can you tell me how to achieve the accuracy of 94.5% on UCF101 using ResNeXt-101? I use your code, the same network architecture (ResNeXt-101), and your pretrained parameters (resnext-101-64f-kinetics-UCF101_split1.pth), applying sliding windows over the frame sequence of each video. But I only get an accuracy of 90.35%. Did I set something wrong?

kenshohara commented 6 years ago

How do you evaluate the accuracy? If you get the accuracy in val.log, the accuracy is the clip-level accuracy whereas the accuracy reported in my paper is the video-level accuracy, which can be evaluated using val.json and eval_ucf101.py.
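
For reference, the evaluation looks roughly like this (a minimal sketch; check utils/eval_ucf101.py in your checkout for the exact class name and arguments, which may differ between versions):

# Sketch: computing video-level top-1 accuracy from val.json.
# Run from the utils/ directory; the file paths are examples.
from eval_ucf101 import UCFclassification

ucf_classification = UCFclassification(
    'ucf101_01.json',    # annotation file with ground-truth labels
    'val.json',          # per-video predictions written by test.py
    subset='validation',
    top_k=1)
ucf_classification.evaluate()
print(ucf_classification.hit_at_k)  # video-level top-1 accuracy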

huangwei1995 commented 6 years ago

Oh, sorry, I used the 16-frame input, so I got the lower accuracy. But if I use the 64-frame input, how should I deal with videos that have fewer than 64 frames? Zero padding? Or looping the frames, as the temporal transforms seem to do? See the sketch below.
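
A self-contained sketch of loop padding over frame indices (my understanding of the LoopPadding idea in temporal_transforms.py; the actual transform may differ):

# Cyclically repeat a short clip's frame indices until `size` indices
# are available, in the spirit of LoopPadding in temporal_transforms.py.
def loop_pad(frame_indices, size):
    out = list(frame_indices)
    while len(out) < size:
        out.extend(frame_indices[:size - len(out)])
    return out

print(loop_pad(list(range(1, 21)), 64))  # a 20-frame video padded to 64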

josueortc commented 6 years ago

Hello @Wei2Huang, could you send the config you used (python main.py ...) for testing on UCF101?

huangwei1995 commented 6 years ago

> Hello @Wei2Huang, could you send the config you used (python main.py ...) for testing on UCF101?

I have since read the paper in detail and found the method; now I get an accuracy of 94.7%. Thank you very much!

KT27-A commented 5 years ago

@kenshohara I really appreciate your work! And @Wei2Huang, would you please share the details of your training? I can only get 85% in val.log with the settings from the paper. How many epochs did you train? What lr_patience and batch size did you use? Thank you very much.

huangwei1995 commented 5 years ago

@kenshohara @huangwei1995 I also tried the model resnext-101-64f-kinetics-UCF101_split1.pth but got 85% val accuracy. The only change I made was to comment out the code that reinitializes the fc layer. Is my command wrong? It is as follows:

python main.py --root_path . --video_path UCF-101-1 --annotation_path ucf_list/ucf101_01.json --result_path results --dataset ucf101 --n_classes 101 --n_finetune_classes 101 --pretrain_path ../pretrained_model/resnext-101-64f-kinetics-ucf101_split1.pth --ft_begin_index 4 --model resnext --model_depth 101 --resnet_shortcut B --batch_size 128 --n_threads 4 --checkpoint 5 --n_epochs 200 --test --no_train

I cannot reproduce the reported results even though I use the pre-trained model provided by the authors; I only get 86%.

KT27-A commented 5 years ago

@huangwei1995 I have now trained the network and reached the accuracy reported in the paper. @kenshohara Thank you very much for sharing; I really learned a lot. I also found one place that seems off, for your information: in ucf101.py, line 82, the make_dataset function generates more than 3 samples per video when validating with 64-frame input. I added

if (j + sample_duration) >= n_frames:
    break  # stop before generating duplicated tail windows

at the end of the loop in that function. The val accuracy increases by about 2% because this gets rid of the duplicated samples.
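
To illustrate the placement, here is a simplified, self-contained sketch of the sliding-window loop in make_dataset (names follow ucf101.py, but details may vary between commits):

import copy

def windows(sample, n_frames, sample_duration, step):
    # Simplified stand-in for the loop in make_dataset (ucf101.py).
    # Windows start every `step` frames; near the end of a short video
    # they are clamped to the last frames and become near-duplicates.
    dataset = []
    for j in range(1, n_frames, step):
        sample_j = copy.deepcopy(sample)
        sample_j['frame_indices'] = list(
            range(j, min(n_frames + 1, j + sample_duration)))
        dataset.append(sample_j)
        if (j + sample_duration) >= n_frames:
            break  # the added line: stop before duplicated tail windows
    return dataset

# A 70-frame video with 64-frame windows and step 16: without the
# break this yields 5 windows, the last ones nearly identical; with
# the break it yields 2.
print(len(windows({'video': 'v_Foo'}, n_frames=70, sample_duration=64, step=16)))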

sumeetssaurav commented 5 years ago

@Katou2 Could you please share the details of the config file you used for fine-tuning the resnext-101 and resnext-101-64f versions of the models on UCF-101? I want to know the hyperparameter values used in training. With the author's fine-tuned resnext-101-64f model I can achieve 93% accuracy on split 1 of UCF-101, but with my own fine-tuned model the accuracy hardly reaches 90%.

sumeetssaurav commented 5 years ago

There is another issue in the code. Once you generate val.json, the file contains prediction results for one video fewer than the validation list. I mean, if I have 100 validation videos, the generated val.json will contain prediction results for only 99 of them. I still wonder what the reason for this behaviour could be.

KT27-A commented 5 years ago

@sumeetssaurav I just used the default hyperparameters. Are you sure you measured video-level accuracy? On the second problem, I think you are right: the last video's sample is never scored in test.py. To fix it, you can add the following code at the end of the data-loader loop in test.py (calculate_video_results updates test_results in place):

if i == len(data_loader) - 1:
    calculate_video_results(output_buffer, previous_video_id,
                            test_results, class_names)
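
For placement, this is roughly how the batch loop in the old test.py looks (names follow that file; the exact code may differ between commits). The added check goes at the end of the outer loop body:

# model, data_loader, test_results, and class_names are defined
# earlier in the test() function of test.py.
import torch.nn.functional as F

output_buffer = []
previous_video_id = ''
for i, (inputs, targets) in enumerate(data_loader):
    outputs = F.softmax(model(inputs), dim=1)
    for j in range(outputs.size(0)):
        if not (i == 0 and j == 0) and targets[j] != previous_video_id:
            # a new video begins: score the buffered clips of the previous one
            calculate_video_results(output_buffer, previous_video_id,
                                    test_results, class_names)
            output_buffer = []
        output_buffer.append(outputs[j].data.cpu())
        previous_video_id = targets[j]
    # the added fix: flush the final video, which is otherwise never scored
    if i == len(data_loader) - 1:
        calculate_video_results(output_buffer, previous_video_id,
                                test_results, class_names)
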
sumeetssaurav commented 5 years ago

@Katou2 Thanks for the quick reply. Just to confirm the placement from your sketch: the check goes at the end of the outer batch loop in test.py, not inside the per-sample loop, right?

ilovekj commented 5 years ago

> I have since read the paper in detail and found the method; now I get an accuracy of 94.7%. Thank you very much!

Excuse me, how do you get 94.7% on validation? And could you tell me the difference between video level and clip level?

ilovekj commented 5 years ago

> With the author's fine-tuned resnext-101-64f model I can achieve 93% accuracy on split 1 of UCF-101, but with my own fine-tuned model the accuracy hardly reaches 90%.

Hello, can you tell me how to achieve 93%? I use the same model, but I can only reach 90%.

KT27-A commented 5 years ago

@ilovekj Hi, using the default config is enough. The key point is that you should calculate the video-level accuracy, namely with utils/eval_ucf101.py.

ilovekj commented 5 years ago

> @ilovekj Hi, using the default config is enough. The key point is that you should calculate the video-level accuracy, namely with utils/eval_ucf101.py.

So what is the difference between clip level and video level? I do not understand; can you explain it?

ilovekj commented 5 years ago

> @ilovekj Hi, using the default config is enough. The key point is that you should calculate the video-level accuracy, namely with utils/eval_ucf101.py.

And when we validate on the dataset, that accuracy is the clip-level accuracy, right?

KT27-A commented 5 years ago

@ilovekj Yes, the validation accuracy is clip-level accuracy. Video-level accuracy averages the scores of all the clips of a video and takes the prediction from the averaged scores, while clip-level accuracy is based on the score of a single clip.
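
A toy, self-contained illustration of why averaging can help (not the repo's code):

import numpy as np

# Three clips from one video, scores over four classes. Clips 1 and 2
# are misclassified on their own, so the clip-level accuracy for this
# video is 1/3; averaging the scores first picks the correct class 0.
clip_scores = np.array([
    [0.40, 0.10, 0.45, 0.05],   # wrong at clip level (argmax = 2)
    [0.35, 0.05, 0.50, 0.10],   # wrong at clip level (argmax = 2)
    [0.90, 0.02, 0.05, 0.03],   # right at clip level (argmax = 0)
])

clip_preds = clip_scores.argmax(axis=1)         # array([2, 2, 0])
video_pred = clip_scores.mean(axis=0).argmax()  # 0, the correct class
print(clip_preds, video_pred)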

ilovekj commented 5 years ago

@Katou2 But if we just average the scores, how can we get higher accuracy? That seems incredible.

KT27-A commented 5 years ago

@ilovekj Try to think it through with the toy example above; in some cases the averaged scores pick the right class even when individual clips are misclassified.

ilovekj commented 5 years ago

@Katou2 Could you explain it to me more clearly? Thank you.

ilovekj commented 5 years ago

> @ilovekj Try to think it through with the toy example above; in some cases the averaged scores pick the right class even when individual clips are misclassified.

And could you give me your WeChat or Facebook?

ilovekj commented 5 years ago

@Katou2 Excuse me? This question is very important to me, and I still cannot get it.

slighting666 commented 4 years ago

@Katou2 @kenshohara Great discussion. What I want to ask is: how do I get the video-level accuracy, and what is the input to eval_ucf101.py? I don't see val.json; I only have val.log.

Purav-Zumkhawala commented 3 years ago

@kenshohara @Katou2 Is there a way I can resume training on my fine-tuned model? I used the following command to train up to the 20th epoch.

python main.py --root_path C:\Users\purav\Downloads\Study\Project\data --video_path extracted_jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 400 --pretrain_path C:\Users\purav\Downloads\Study\Project\3D-ResNets-PyTorch\models\resnet-50-kinetics.pth --ft_begin_module fc --model resnet --model_depth 50 --batch_size 128 --checkpoint 5

Then I had to stop the training for another important task, and now when I try to resume training from the saved model with the following command:

python main.py --root_path C:\Users\purav\Downloads\Study\Project\data --video_path extracted_jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --resume_path C:\Users\purav\Downloads\Study\Project\data\results\save_20.pth --n_classes 101 --model_depth 50 --batch_size 16 --checkpoint 5

I face the following error :

Traceback (most recent call last):
  File "main.py", line 428, in <module>
    main_worker(-1, opt)
  File "main.py", line 363, in main_worker
    opt.resume_path, opt.begin_epoch, optimizer, scheduler)
  File "main.py", line 105, in resume_train_utils
    optimizer.load_state_dict(checkpoint['optimizer'])
  File "C:\Users\purav\anaconda3\envs\vision\lib\site-packages\torch\optim\optimizer.py", line 111, in load_state_dict
    raise ValueError("loaded state dict has a different number of "
ValueError: loaded state dict has a different number of parameter groups

Am I missing some parameters when resuming training?

Please advise.

guilhermesurek commented 3 years ago

@Purav-Zumkhawala See issue #42. You need to keep passing the pretrain arguments and just add the resume path: fine-tuning only the fc module creates an optimizer with a different set of parameter groups, so resuming without those arguments builds an optimizer whose groups don't match the saved state. Try this:

python main.py --root_path C:\Users\purav\Downloads\Study\Project\data --video_path extracted_jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 400 --pretrain_path C:\Users\purav\Downloads\Study\Project\3D-ResNets-PyTorch\models\resnet-50-kinetics.pth --ft_begin_module fc --model resnet --model_depth 50 --batch_size 128 --checkpoint 5 --resume_path C:\Users\purav\Downloads\Study\Project\data\results\save_20.pth
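
For the curious, a minimal, self-contained illustration of the error (not the repo's code):

import torch

# An optimizer saved with two parameter groups cannot be loaded into
# one built with a single group; this mirrors the ValueError above.
model = torch.nn.Linear(4, 2)
opt_two_groups = torch.optim.SGD(
    [{'params': [model.weight]}, {'params': [model.bias]}], lr=0.1)
opt_one_group = torch.optim.SGD(model.parameters(), lr=0.1)

try:
    opt_one_group.load_state_dict(opt_two_groups.state_dict())
except ValueError as e:
    print(e)  # loaded state dict has a different number of parameter groups
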
Purav-Zumkhawala commented 3 years ago

@guilhermesurek Thank you, that worked!

YTHmamba commented 2 years ago

> @Katou2 @kenshohara Great discussion. What I want to ask is: how do I get the video-level accuracy, and what is the input to eval_ucf101.py? I don't see val.json; I only have val.log.

Hello, are you using the code from the master branch? There are no val.json and eval_ucf101.py files in that branch.