OpenGVLab / unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
https://arxiv.org/abs/2303.16058
MIT License

multi-modal Video-Text Retrieval demo #7

sportzhang closed this issue 10 months ago

sportzhang commented 10 months ago

Is there a specific demo for multi-modal text-to-video retrieval, i.e. searching videos with a text query? I am mainly interested in the vectorization part.

Andy1621 commented 10 months ago

I do not provide a demo. However, you can follow the scripts for zero-shot retrieval to implement this: https://github.com/OpenGVLab/unmasked_teacher/tree/main/multi_modality/exp/zero_shot
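
For readers wondering what the vectorization part usually amounts to, here is a minimal sketch of text-to-video retrieval by embedding similarity. It is not the repository's code and does not call any of its functions. It assumes you have already obtained one embedding vector per video and one for the text query from some encoder. The helper names (`l2_normalize`, `cosine_similarity`, `rank_videos`) and the toy vectors are hypothetical and only illustrate the ranking step.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_videos(text_embedding, video_embeddings, top_k=3):
    """Return the top_k (video_id, score) pairs, best match first.

    video_embeddings is a dict mapping a video id to its embedding.
    """
    scores = [
        (video_id, cosine_similarity(text_embedding, emb))
        for video_id, emb in video_embeddings.items()
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy data (assumption): in practice these vectors would come from the
# video and text encoders, not be written by hand.
toy_video_embeddings = {
    "video_a": [1.0, 0.0, 0.0],
    "video_b": [0.0, 1.0, 0.0],
    "video_c": [0.7, 0.7, 0.0],
}
toy_text_embedding = [0.9, 0.1, 0.0]

ranking = rank_videos(toy_text_embedding, toy_video_embeddings, top_k=2)
for video_id, score in ranking:
    print(f"{video_id}: {score:.4f}")
```

In a real setup the video embeddings would typically be computed once and cached, so that each new text query only needs one encoder pass plus the similarity ranking.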