yotammarton opened this issue 1 year ago
First of all - Amazing work on this one.
I'm getting a bit lost in the repo, so may I request a simple few-line script that does something like the following:
```python
model = CLIPViP("pretrain_clipvip_base_32.pt")
text_features = model.encode_text("This is a very cute cat")
video_features = model.encode_video("vid_file.mp4")
cosine(text_features, video_features)
```
[Extra] Preferably I would like to get the video features for a batch of mp4 files with different lengths (see the frame-sampling sketch below). The closest I found is in
CLIP-ViP/src/modeling/VidCLIP.py
but I couldn't find where this script is used. Thank you :)
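On the [Extra] point, a minimal sketch of one common way to batch videos of different lengths: uniformly sample a fixed number of frames from each clip so every video yields the same-shaped array. This only covers the frame-sampling side with OpenCV; the CLIP-ViP encoding call itself is exactly what the question above is asking about, so it is not shown.

```python
import cv2
import numpy as np

def sample_frames(path, num_frames=12, size=224):
    """Uniformly sample `num_frames` frames from a video and resize them,
    so videos of different lengths all map to the same-shaped array."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frames.append(cv2.resize(frame, (size, size)))
    cap.release()
    return np.stack(frames)  # (num_frames, size, size, 3)

# Videos of different lengths become one (batch, frames, H, W, 3) array:
batch = np.stack([sample_frames(p) for p in ["a.mp4", "b.mp4"]])
```

The number of frames (12) and resolution (224) here are assumptions for illustration; they should match whatever the pretrained checkpoint expects.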
Hi, we are integrating CLIP-ViP into Huggingface transformers. I believe it will then be easier to call. Please keep an eye on it.
Same question, I can download the videos without annotations. Where can I get the text (caption, annotation, transcription) data? Thanks a lot
Hi, for ASR texts, please refer to #7. For auxiliary captions, please download from this link: https://hdvila.blob.core.windows.net/dataset/hdvila_ofa_captions_db.zip?sp=r&st=2023-03-16T04:58:26Z&se=2026-03-01T12:58:26Z&spr=https&sv=2021-12-02&sr=b&sig=EYE%2Bj11VWfQ6G5dZ8CKlOOpL3ckmmNqpAtUgBy3OGDM%3D
Thanks a lot!
@HellwayXue Thanks for providing the auxiliary captions. But how do I open the data.mdb files? I tried Access and Visual Studio, but they did not work...
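In case it helps: a data.mdb file is typically an LMDB database (not a Microsoft Access database), so it can usually be read with the Python lmdb package. A minimal sketch, assuming the extracted folder containing data.mdb is an LMDB environment; the folder name and the key/value encoding inside are assumptions.

```python
import lmdb

# Assumption: the unzipped folder (containing data.mdb) is an LMDB environment
env = lmdb.open("hdvila_ofa_captions_db", readonly=True, lock=False)
with env.begin() as txn:
    for i, (key, value) in enumerate(txn.cursor()):
        # Keys and values are raw bytes; the exact encoding (utf-8, json, ...) is an assumption
        print(key.decode("utf-8", errors="replace"), value[:100])
        if i >= 4:  # just peek at the first few entries
            break
env.close()
```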
Hi @HellwayXue, any update on the integration with HuggingFace? Thank you :)
@MVPavan @yotammarton I've created a simple example here: https://github.com/eisneim/clip-vip_video_search
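For completeness, once text and video features have been extracted (e.g. with the example repo above), the final step from the original question is just cosine similarity between the embeddings. A minimal sketch with dummy tensors; the 512-dim feature size and shapes are assumptions standing in for real CLIP-ViP outputs.

```python
import torch
import torch.nn.functional as F

# Dummy features standing in for real CLIP-ViP outputs (shapes/dims are assumptions)
text_features = torch.randn(1, 512)    # one caption
video_features = torch.randn(4, 512)   # four videos

# Cosine similarity between the caption and each video, then pick the best match
sims = F.cosine_similarity(text_features, video_features, dim=-1)  # shape: (4,)
print(sims, sims.argmax().item())
```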
Hi @MVPavan, can you please suggest what GPU configuration is required to run this model (just for inference)?