v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
https://v-iashin.github.io/video_features
MIT License
471 stars, 91 forks

About center crop #136

Open htowa opened 2 weeks ago

htowa commented 2 weeks ago

Hi.

When extracting frames from a video, are you applying a center crop to each frame before feeding it to I3D? How can I feed the entire frame to I3D instead?

v-iashin commented 2 weeks ago

hi,

it should be fine. try making sure you are getting the same output with your inputs as with the center crop in here: https://github.com/v-iashin/video_features/blob/1b67c9f8cfb44b61f6fae5fa1a89d34b7fe7a579/models/i3d/i3d_src/i3d_net.py#L257

you can tweak pre-processing (crop etc) here:

also make sure you are getting good logits by specifying show_pred=True in args.
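
To make the "good logits" check concrete, here is a minimal sketch of how raw logits can be turned into a ranked list of predictions. It is not the repository's code and does not reproduce what show_pred actually prints. The helper names (softmax, top_k) and the toy logits and labels are hypothetical and only illustrate the idea that a sensible input should give one or a few clearly dominant classes.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability, logit) triples."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs, logits), key=lambda t: t[1], reverse=True)
    return ranked[:k]


# Toy values for illustration only; real I3D logits cover many more classes.
toy_logits = [2.0, 0.5, -1.0, 4.0]
toy_labels = ["class_a", "class_b", "class_c", "class_d"]

for label, prob, logit in top_k(toy_logits, toy_labels, k=3):
    print(f"{label}: prob={prob:.3f} logit={logit:.2f}")
```

If the ranking changes drastically between the default center-cropped input and a modified pre-processing, that is a hint the new input no longer matches what the model expects.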

htowa commented 2 weeks ago

If the input video size is H=240, W=320, is it possible to resize it to 224x224 and feed it into I3D?

v-iashin commented 2 weeks ago

I think it is. Change the min side size to 224.

https://github.com/v-iashin/video_features/blob/1b67c9f8cfb44b61f6fae5fa1a89d34b7fe7a579/models/i3d/extract_i3d.py#L45
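
For the 240x320 case, the arithmetic can be sketched as follows. This is an illustration rather than the repository's implementation: it assumes the resize keeps the aspect ratio, truncates the longer side to an integer, and is followed by a 224 center crop. The function names (resize_min_side, center_crop_box) are hypothetical.

```python
def resize_min_side(height, width, min_side):
    """Return (new_height, new_width) after scaling the shorter side to min_side.

    Assumption: the aspect ratio is preserved and the longer side is truncated.
    """
    if height <= width:
        return min_side, int(min_side * width / height)
    return int(min_side * height / width), min_side


def center_crop_box(height, width, crop):
    """Return (top, left, bottom, right) of a centered crop x crop window."""
    top = (height - crop) // 2
    left = (width - crop) // 2
    return top, left, top + crop, left + crop


resized = resize_min_side(240, 320, 224)   # shorter side 240 -> 224
box = center_crop_box(resized[0], resized[1], 224)
kept_width_fraction = 224 / resized[1]

print("resized (H, W):", resized)
print("crop box (top, left, bottom, right):", box)
print(f"fraction of width kept: {kept_width_fraction:.2f}")
```

Under these assumptions the full height survives and only the left and right margins are cropped away. Resizing directly to 224x224 without a crop would keep the whole frame but change its aspect ratio from 4:3 to 1:1.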