Closed Rj-batista closed 1 year ago
HERO_VIDEO_FEATURE_EXTRACTOR is a simple codebase for extracting the video features (including CLIP and SlowFast) that we use in this codebase.
Ok, that clears up the first question; I appreciate your prompt response. Could you tell me which output files of HERO_VIDEO_FEATURE_EXTRACTOR correspond to the files you provide for download in the "Prepare feature files" section of your README? There is no indication of which files HERO_VIDEO_FEATURE_EXTRACTOR produces.
From HERO_VIDEO_FEATURE_EXTRACTOR with my own video:
From moment_detr_features.tar.gz
Thank you
All 3 features listed are from clip-vit_feature. clip_sub_feature is only used for pre-training, clip_feature holds the video frame features, and clip_text_feature holds the features for user text queries. Besides, you can look at our inference example here to see exactly which features are used: https://github.com/jayleicn/moment_detr/tree/main/run_on_video. Note that this is a simplified model which does not use SlowFast features for videos.
I'm sorry, it's still confusing for me, because I want to train your model with my own data.
How do you generate clip_feature, clip_sub_feature, and clip_text_feature from HERO_Video_Feature_Extractor?
I presume that you need to use the "Image-text pre-trained CLIP features" section of that repo to generate those 3 files.
The issue is that after running the docker image with my own data, I am left with only one npz file in a folder called clip-vit_feature.
So how do you generate those folders in order to train your model?
Thank you for your response
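To see what the extractor actually wrote, it can help to list the arrays stored inside each npz file. The following is a minimal sketch, not part of either repository: the helper name list_npz_arrays is hypothetical, and it relies only on the fact that an .npz file is a zip archive holding one .npy member per array. The folder name clip-vit_feature is the one mentioned above.

```python
import zipfile
from pathlib import Path


def list_npz_arrays(folder):
    """Map each .npz file under `folder` to the names of the arrays it stores."""
    root = Path(folder)
    contents = {}
    for npz_path in sorted(root.rglob("*.npz")):
        with zipfile.ZipFile(npz_path) as archive:
            # An .npz archive stores every array as a member named "<key>.npy".
            keys = [name[:-4] for name in archive.namelist() if name.endswith(".npy")]
        # Key by path relative to `folder` so files in subfolders stay distinct.
        contents[str(npz_path.relative_to(root))] = keys
    return contents


if __name__ == "__main__":
    # "clip-vit_feature" is the extractor output folder named in this thread.
    for file_name, keys in list_npz_arrays("clip-vit_feature").items():
        print(file_name, keys)
```

Running the same helper on the folders unpacked from moment_detr_features.tar.gz should make it easier to compare the two sets of files side by side.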
clip_feature is the vision feature, clip-vit_feature. clip_sub_feature and clip_text_feature are both text features; you will need to write your own script to extract them, following what is shown in this demo. clip_text_feature is the extracted CLIP text feature for the user query, and clip_sub_feature is the extracted text feature for the video subtitles. Both come from the same CLIP text encoder.
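For the vision side, the mapping described above could be applied with a small staging step. This is only a sketch under stated assumptions: the function name stage_vision_features is hypothetical, the folder names clip-vit_feature and clip_feature are taken from this thread, and it is an assumption that the training code expects one npz file per video directly inside that folder. The text features are not covered here, since they need a separate CLIP text-encoder script.

```python
import shutil
from pathlib import Path


def stage_vision_features(extractor_out, feature_root):
    """Copy every .npz under <extractor_out>/clip-vit_feature into <feature_root>/clip_feature."""
    src = Path(extractor_out) / "clip-vit_feature"  # extractor output folder (from the thread)
    dst = Path(feature_root) / "clip_feature"       # assumed target folder for training
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for npz_path in sorted(src.rglob("*.npz")):
        # Keep the original file name; files are flattened into one folder.
        shutil.copy2(npz_path, dst / npz_path.name)
        copied.append(npz_path.name)
    return copied
```

Copying rather than moving keeps the extractor output intact, so the step can be rerun if the expected layout turns out to differ.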
Ok, so clip_feature is generated from clip-vit_feature.
For clip_sub_feature and clip_text_feature, should I use this script to generate them?
Thanks
Sorry for the delay, but I have figured out how it works now. Thanks for all of your answers, and great work btw!!
Excuse me, would you be willing to share the code for extracting the QVHighlights text features?
Hi,
I am a little confused about feature extraction. If I am correct, there are two kinds of features: OpenAI CLIP and HERO_VIDEO_FEATURE_EXTRACTOR. I wanted to know the difference between those two and the purpose of CLIP. Also, I have run HERO_VIDEO_FEATURE_EXTRACTOR and I am left with 4 files:
Thank you