HumamAlwassel / TSP

TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)
http://humamalwassel.com/publication/tsp/
MIT License

GVF features generation #3

Closed satish1901 closed 3 years ago

satish1901 commented 3 years ago

Hi Humam, how are you doing? I have a quick follow-up question about training on a new dataset. How do I generate the GVF features to train on my data? I see it's an optional argument, but if I want to use it, how do I do so?

HumamAlwassel commented 3 years ago

Hi @satish1901,

I'm doing well, thanks for asking :)

The GVF is generated by max-pooling the Kinetics-pretrained clip features of one video across time. Here are the steps to generate the GVF for a new dataset:

  1. Preprocess the videos following the instructions in TSP/data.
  2. Run the feature extraction code using the released model r2plus1d_34-tac_on_kinetics (i.e. a Kinetics-pretrained model).
  3. Merge the features into an H5 file following the instructions here.
  4. Iterate over the features of each video and apply max-pooling across time to get the GVF. Save the results in a new H5 file and you are good to go.
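Here is a minimal sketch of step 4. It assumes the merged H5 file stores one dataset per video (keyed by video ID) with shape `(num_clips, feature_dim)`; the file names are placeholders, so adjust them to your setup.

```python
import h5py
import numpy as np

# Read the merged per-clip features and write one GVF vector per video.
with h5py.File('clip_features.h5', 'r') as src, \
     h5py.File('gvf.h5', 'w') as dst:
    for video_id in src.keys():
        clip_features = src[video_id][:]        # (num_clips, feature_dim)
        gvf = np.max(clip_features, axis=0)     # max-pool across time
        dst.create_dataset(video_id, data=gvf)  # (feature_dim,)
```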

Cheers!