shinying closed this issue 3 years ago
Hi @shinying,
Sorry for the late reply, and thank you for your interest in our project. Our feature extraction code is now released: HERO_Video_Feature_Extractor. The feature pre-processing code (conversion to lmdb) has also been updated here: 1b5d4a4a13ea222fa81ecc623ed58ffeb98b1fa9.
Thanks, Linjie
Hi, @linjieli222
Thanks for your reply and the early release. I have read the code and found that the extractor is well designed for extracting features from a video, but I only have video frames as input, so I am trying to modify video_loader.py. I am wondering if you have any suggestions on setting target_framerate and clip_len.
My question is mainly about preprocessing.py, where, if I understand correctly, the extracted frames are padded and sampled to form ceil(the number of extracted frames / (clip_len * target_framerate)) sequences of frames, each of size num_frames. This operation reduces the number of features from the number of frames to a much smaller number, and I find it difficult to interpret the result as frame-level features.
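To make the reduction concrete, here is a minimal sketch of the count described above. It only restates the formula as I understand it and has not been checked against preprocessing.py; the function name num_clip_features is hypothetical and not part of the repository.

```python
import math


def num_clip_features(num_extracted_frames: int,
                      target_framerate: float,
                      clip_len: float) -> int:
    """Number of clip-level features according to the formula in this thread.

    Hypothetical helper: it only evaluates
    ceil(num_extracted_frames / (clip_len * target_framerate))
    and does not reproduce the padding or sampling done by the extractor.
    """
    frames_per_clip = clip_len * target_framerate
    return math.ceil(num_extracted_frames / frames_per_clip)


# 900 frames at 30 fps with 1.5 s clips: 45 frames per clip.
print(num_clip_features(900, target_framerate=30, clip_len=1.5))
# 910 frames no longer divide evenly, so the last clip is only partly filled.
print(num_clip_features(910, target_framerate=30, clip_len=1.5))
```

Under these assumed settings, 900 frames yield 20 features and 910 frames yield 21, which is why the output looks clip-level rather than frame-level.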
Thanks for your help. Shinying
Hi Shinying,
I believe I have answered both questions in another thread.
Another possible solution:
In script/convert_videodb.py, we concatenate the SlowFast features with the 2D ResNet features and save them into an lmdb file. If you already have features extracted, but with more than one feature per 1.5 seconds, you can downsample them to 1 feature per 1.5 seconds (if clip_len is set to 1.5s). However, you may see some performance degradation, as our pre-training features are extracted at a high framerate.
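A rough sketch of one way to do such downsampling is below. It assumes the features are a NumPy array of shape (T, D) extracted at a known, constant rate, and it uses mean pooling over each 1.5-second window. The pooling choice, the function name downsample_features, and the feats_per_sec parameter are illustrative assumptions, not something taken from the repository; picking one feature per window would be another option.

```python
import math

import numpy as np


def downsample_features(features: np.ndarray,
                        feats_per_sec: float,
                        clip_len: float = 1.5) -> np.ndarray:
    """Mean-pool a (T, D) feature array to one feature per clip_len seconds.

    Hypothetical helper; assumes T >= 1 and a constant feature rate.
    """
    # Number of input features that fall into one clip window.
    group = max(1, int(round(feats_per_sec * clip_len)))
    num_clips = math.ceil(len(features) / group)
    # The last window may hold fewer than `group` features; it is averaged as is.
    pooled = [features[i * group:(i + 1) * group].mean(axis=0)
              for i in range(num_clips)]
    return np.stack(pooled)


# Example: 7 features at 2 features/second -> windows of 3 features each.
feats = np.arange(14, dtype=np.float32).reshape(7, 2)
print(downsample_features(feats, feats_per_sec=2).shape)  # (3, 2)
```

The pooled array can then be concatenated with the other feature stream, provided both streams end up with the same number of clips.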
Hi Linjie,
Thanks for your reply and help again.
Shinying
Hi,
Thanks for open-sourcing your great work. I am trying to do feature extraction myself and am wondering how frame-level features are encoded with SlowFast. As the pre-trained SlowFast receives a fixed number of frames as input for action recognition, did you sample multiple clips from a video at different locations, or perform other operations such as pooling or concatenation?
I look forward to your reply.