Closed xijun-cs closed 4 months ago
Hello.
Thank you for your interest in our work. For feature extraction, we suggest taking a look at the official GitHub repos for CLIP and SlowFast!
Thanks
Thanks for your reply.
Just to double-check: is the CLIP model ViT-B/32 and the SlowFast model SLOWFAST_8x8_R50?
Thanks
I'd like to know this too.
You might want to check out issue https://github.com/jayleicn/moment_detr/issues/19
And for the CLIP model: yes, we used the CLIP ViT-B/32 model.
Thanks, I found my way (ORZ): your work -> UMT -> Moment-DETR -> HERO_Video_Feature_Extractor
Hi, the visual and text encoders are frozen, and training just uses the offline features, right? Thank you!
Yes, you are right.
Hi,
Thanks for your awesome work!
I'd like to try your method on a custom dataset that is similar to Charades-STA.
How can I extract the SlowFast + CLIP features and the text features on my own?
Best
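
For the CLIP part, something along these lines might be a starting point. It is only a rough sketch, not the authors' pipeline: it assumes the OpenAI `clip` package, frames already decoded as PIL images, and a hypothetical clip length of 2 seconds (`clip_len_sec`), which is not confirmed anywhere in this thread. It keeps the encoder frozen and uses ViT-B/32, as confirmed above. SlowFast is not covered here; for that, the HERO_Video_Feature_Extractor route mentioned earlier is probably the safer reference.

```python
from typing import List


def sample_frame_indices(num_frames: int, fps: float, clip_len_sec: float = 2.0) -> List[int]:
    """Pick the middle frame of each fixed-length clip.

    clip_len_sec is an assumed value for illustration, not taken from the repo.
    """
    frames_per_clip = max(1, int(round(fps * clip_len_sec)))
    return [min(start + frames_per_clip // 2, num_frames - 1)
            for start in range(0, num_frames, frames_per_clip)]


def extract_clip_features(pil_frames, query: str, device: str = "cpu"):
    """Encode sampled frames and one text query with a frozen CLIP ViT-B/32."""
    # Imported lazily so the sampling helper works without torch/clip installed.
    import torch
    import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

    model, preprocess = clip.load("ViT-B/32", device=device)
    model.eval()  # frozen encoder: inference only
    images = torch.stack([preprocess(f) for f in pil_frames]).to(device)
    tokens = clip.tokenize([query]).to(device)
    with torch.no_grad():  # no gradients, features are computed offline
        video_feats = model.encode_image(images)  # shape (num_clips, 512)
        text_feats = model.encode_text(tokens)    # shape (1, 512), pooled sentence feature
    return video_feats.cpu().numpy(), text_feats.cpu().numpy()
```

Note that `encode_text` returns one pooled vector per query. If the training code expects per-token text features, this sketch would need to be adapted, so it is worth checking the feature shapes of the released files before saving your own with `numpy.save`.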