Closed bryanyzhu closed 5 years ago
I have tested this model on the kinetics-400 validation set and got very similar results to what was reported in the paper. I'm guessing your input preprocessing is slightly different than what this model was trained for. I.e., make sure you're sampling at 25 fps, resizing the height of each video to 256, saving as jpeg, then taking a center crop from that.
You can check your preprocessed video against the provided numpy version until you find the right settings.
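The pipeline described above (resize the height to 256, center crop, then scale to [-1, 1]) could be sketched roughly like this. This is only an illustration, assuming Pillow/NumPy and a 224-pixel crop size; the repo's own preprocessing code is the authoritative reference:

```python
import numpy as np
from PIL import Image

def preprocess_frame(frame, crop_size=224):
    """Resize a frame so its height is 256 (preserving aspect ratio),
    take a center crop, and scale pixel values to [-1, 1]."""
    img = Image.fromarray(frame)
    w, h = img.size
    new_w = int(round(w * 256.0 / h))
    img = img.resize((new_w, 256), Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32)
    # Center crop of crop_size x crop_size
    top = (256 - crop_size) // 2
    left = (new_w - crop_size) // 2
    arr = arr[top:top + crop_size, left:left + crop_size]
    # Same normalization as (img / 255.) * 2 - 1
    return (arr / 255.0) * 2 - 1
```

Frame extraction at 25 fps and saving as jpeg would happen before this step (e.g. with ffmpeg); the jpeg round-trip itself can change pixel values slightly, which is why comparing against the provided numpy file is the safest check.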
Thank you for your suggestions, I will try them.
Hi @piergiaj , thank you for this great repo. I have used your code to extract I3D features before, and it works pretty well. Recently, I wanted to train it from scratch on Kinetics400/700 and try to reproduce the performance. The first step is to evaluate your model on the Kinetics400 validation set to see what the performance is. However, the accuracy I get is very low.
Then I tried to find the reason. The situation is: if I use your .pth model with the demo .npy file, I get the correct prediction of `CricketShot`. But if I use `(imgx/255.)*2 - 1` to preprocess the same video (v_CricketShot_g04_c01) myself, I don't get the correct prediction; the label I get is actually `robot dancing`. I also tried several more videos, but none of them gave me the correct prediction. IMO, your model is good, so the problem should be on the data side: either my decoded frames are not the same as yours, or the image preprocessing is tricky in some way. Have you encountered this before? I mean, did you test your model on the Kinetics400 validation set? Thank you very much, and I look forward to your reply.
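One way to narrow down whether the decoded frames or the normalization are at fault is to diff locally preprocessed frames against the reference numpy file mentioned above. A minimal sketch (the path and tolerance are hypothetical; point it at the demo .npy shipped with the repo):

```python
import numpy as np

def compare_to_reference(my_frames, reference_npy_path, tol=1e-2):
    """Compare locally preprocessed frames against a reference array
    saved as .npy. Returns (match, message)."""
    ref = np.load(reference_npy_path)
    if my_frames.shape != ref.shape:
        return False, f"shape mismatch: {my_frames.shape} vs {ref.shape}"
    max_diff = float(np.abs(my_frames - ref).max())
    return max_diff <= tol, f"max abs difference: {max_diff:.4f}"
```

A large max difference points at decoding/resizing settings (fps, interpolation, jpeg quality) rather than the `(imgx/255.)*2 - 1` normalization itself.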