google-deepmind / kinetics-i3d

Convolutional neural network model for video classification trained on the Kinetics dataset.
Apache License 2.0

Video preprocessing code #87

Open javithe7 opened 4 years ago

javithe7 commented 4 years ago

Is there a way you could share the code you use to preprocess the videos? I mean the application of the TV-L1 optical flow algorithm so the output looks like the example gif you show us. I've been trying to replicate the preprocessing by myself, but it doesn't look exactly like yours, so when I do inference it always gives me wrong results.

Thank you. Greetings

jahab commented 4 years ago

This is a .ipynb file to convert images sampled from the video to optical flow images. Hope this helps: convert_to_flow.ipynb.zip
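For reference, a minimal sketch of this kind of TV-L1 extraction with OpenCV, assuming opencv-contrib-python is installed so that cv2.optflow is available (the attached notebook remains the authoritative version):

```python
# Minimal sketch of TV-L1 flow extraction between consecutive frames.
# Assumes opencv-contrib-python so that cv2.optflow is available.
import cv2
import numpy as np

def compute_tvl1_flow(frames):
    """frames: list of grayscale uint8 images; returns a list of HxWx2 float32 flow fields."""
    tvl1 = cv2.optflow.createOptFlow_DualTVL1()
    flows = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        flow = tvl1.calc(prev, curr, None)  # per-pixel (dx, dy) displacements
        flows.append(flow.astype(np.float32))
    return flows
```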

javithe7 commented 4 years ago

Thank you very much @jahab, I've tested your code with a few changes and the result looks pretty good. However, the background of the video keeps changing colors, unlike the example gif, which keeps a static gray background. This is a gif with the result: Video. Maybe they applied some extra filter to the video to achieve that kind of background?

jahab commented 4 years ago

@javithe7 I believe the color is just for visualizing the flow. The output we take from the optical flow is just the first two channels, and these two channels are then passed to the video classification file. I have further modified the code to add the [-20, 20] truncation, so have a look at that as well. Attaching the updated file here: convert_to_flow.ipynb.zip
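As a rough sketch of that step, assuming the flows come from a TV-L1 call like the one above and that the I3D flow stream takes a (1, num_frames, height, width, 2) array (the helper name here is made up for illustration):

```python
import numpy as np

# Sketch: truncate the (dx, dy) flow channels to [-20, 20] and stack them
# into a (1, num_frames, height, width, 2) array for the flow stream.
def prepare_flow_input(flows, bound=20.0):
    """flows: list of HxWx2 float32 flow fields from consecutive frame pairs."""
    truncated = [np.clip(f, -bound, bound) for f in flows]
    return np.expand_dims(np.stack(truncated, axis=0), axis=0)
```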

Jockey721 commented 4 years ago

@javithe7 I believe the color is just for visualizing the flow. The output we take from the optical flow is just the first two channels, and these two channels are then passed to the video classification file. I have further modified the code to add the [-20, 20] truncation, so have a look at that as well. Attaching the updated file here: convert_to_flow.ipynb.zip

Hello, I found the norm_flow function in your code and used it to normalize my optical flows, but the predictions from the flow model are still different from the sample predictions.
I also noticed that the provided v_CricketShot_g04_c01_flow.npy has min_value = -0.46 and max_value = 0.328, not -1.0 to 1.0. Maybe they use a different preprocessing method?

Top 5 classes and associated probabilities (RGB):

Top 5 classes and associated probabilities (Flow):

===== Final predictions =====
logits        proba         class
2.343790e+01  9.997378e-01  playing cricket
1.332426e+01  4.051264e-05  faceplanting
1.314688e+01  3.392776e-05  kicking soccer ball
1.278005e+01  2.350939e-05  skateboarding
1.273522e+01  2.247873e-05  pushing car

mrdaly commented 3 years ago

@Jockey721

I also noticed that the provided v_CricketShot_g04_c01_flow.npy has min_value = -0.46 and max_value = 0.328, not -1.0 to 1.0. Maybe they use a different preprocessing method?

I was also confused about this, but after digging into the mediapipe repository to find the flow preprocessing, it looks like they rescale the flow data using -20 and 20 as the min/max, instead of the min/max of the actual flow data.
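In code, a minimal sketch of that interpretation: clip to ±20 and divide by 20 rather than min-max normalizing against the data, which would explain why the sample .npy values stay well inside [-1, 1] unless the motion actually reaches 20 pixels:

```python
import numpy as np

# Sketch: rescale flow using fixed bounds of -20/20 rather than the data's own
# min/max, so values only reach -1 or 1 if a displacement actually hits +/-20.
def rescale_flow(flow, bound=20.0):
    return np.clip(flow, -bound, bound) / bound
```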