Open bxwldljh opened 9 months ago
Thank you for your interest in our work. We use the ViT (image) and BERT (text) encoders from OpenAI's CLIP as feature extractors. This required modifying the CLIP source code, which can be seen at https://github.com/openai/CLIP/blob/main/clip/model.py (line 235 for images, line 354 for text). You may refer to my src/tools/extract_embedding.py for the model loading and forward process.
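For readers who want a rough picture of the loading and forward steps before opening that script, here is a minimal sketch. It uses the stock `clip` package, not the modified model.py described above, so its outputs may differ from what src/tools/extract_embedding.py produces. The function name `extract_clip_features` and the default `ViT-B/32` checkpoint are assumptions made for illustration only; how video frames are sampled is not stated in this thread and is left out.

```python
from typing import List, Tuple


def extract_clip_features(
    image_paths: List[str],
    texts: List[str],
    model_name: str = "ViT-B/32",  # assumed checkpoint, not confirmed by the thread
) -> Tuple["torch.Tensor", "torch.Tensor"]:
    """Return (image_features, text_features) from stock CLIP, left unnormalized."""
    # Heavy dependencies are imported lazily so that merely importing this
    # file does not require torch, clip, or Pillow to be installed.
    import torch
    import clip  # pip install git+https://github.com/openai/CLIP.git
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load(model_name, device=device)
    model.eval()

    # Preprocess each image (or video frame saved as an image) into one batch.
    images = torch.stack([preprocess(Image.open(p)) for p in image_paths]).to(device)
    # Tokenize the text; the default context length is 77 tokens.
    tokens = clip.tokenize(texts).to(device)

    with torch.no_grad():
        image_features = model.encode_image(images)
        text_features = model.encode_text(tokens)

    # No division by the feature norm here: the features are returned as-is.
    return image_features.float().cpu(), text_features.float().cpu()
```

A call might look like `extract_clip_features(["frame_0001.jpg"], ["a person riding a bike"])`, where the file name and caption are placeholders.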
I forgot to mention one point: please do not apply normalization to the features, as it would result in loss of information.
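One reading of "loss of information" is that L2 normalization discards the magnitude of each feature vector and keeps only its direction. The toy example below illustrates that reading with plain Python lists; the vectors and the helper `l2_normalize` are made up for illustration and are not taken from the repository.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


# Two hypothetical features pointing the same way with different magnitudes.
weak = [1.0, 2.0, 2.0]       # L2 norm 3
strong = [10.0, 20.0, 20.0]  # L2 norm 30

weak_unit = l2_normalize(weak)
strong_unit = l2_normalize(strong)

# After normalization the two features can no longer be told apart.
same_after_norm = all(math.isclose(a, b) for a, b in zip(weak_unit, strong_unit))
print(same_after_norm)
```

If a downstream model relies on feature magnitude, that signal is gone once the features are normalized, which is consistent with the advice above to keep them unnormalized.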
Can you release the code for video and text feature extraction? Many thanks!