HumamAlwassel / TSP

TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)
http://humamalwassel.com/publication/tsp/
MIT License

GVF features generation #3

Closed satish1901 closed 3 years ago

satish1901 commented 3 years ago

Hi Humam, how are you doing? I have a quick follow-up question about training on a new dataset. How do I generate the GVF features to train on my data? I see it's an optional argument, but if I want to use it, how do I do so?

HumamAlwassel commented 3 years ago

Hi @satish1901,

I'm doing well, thanks for asking :)

The GVF is generated by max-pooling the Kinetics-pretrained clip features of one video across time. Here are the steps to generate the GVF for a new dataset:

  1. Preprocess the videos following the instructions in TSP/data.
  2. Run the feature extraction code using the released model r2plus1d_34-tac_on_kinetics (i.e. a Kinetics-pretrained model).
  3. Merge the features into an H5 file following the instructions here.
  4. Iterate over the features of each video and apply max-pooling across time to get the GVF. Save the results in a new H5 file and you are good to go.
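Here is a minimal sketch of step 4. It assumes the merged H5 file stores one dataset per video (keyed by video ID) with shape `(num_clips, feature_dim)`; the file names are placeholders, so adjust them to your setup.

```python
import h5py
import numpy as np

# Read the merged per-clip features and write one GVF vector per video.
with h5py.File('clip_features.h5', 'r') as src, \
     h5py.File('gvf.h5', 'w') as dst:
    for video_id in src.keys():
        clip_features = src[video_id][:]        # (num_clips, feature_dim)
        gvf = np.max(clip_features, axis=0)     # max-pool across time
        dst.create_dataset(video_id, data=gvf)  # (feature_dim,)
```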

Cheers!