X-PLUG / mPLUG

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
https://arxiv.org/abs/2205.12005

Zero-shot Video Captioning Problem #3

Closed: BruceChen15 closed this issue 1 year ago

BruceChen15 commented 1 year ago

Hi, I've run into a problem. I don't think the ViT-L-14 or ViT-B-16 models have been released. Could you solve this problem? Thank you.

MAGAer13 commented 1 year ago

Please refer to the original version of CLIP.
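
For anyone landing here later, a minimal sketch of what this suggestion might look like in practice, assuming the `clip` package from the openai/CLIP repository is installed. The helper names `clip_model_name` and `load_clip` are hypothetical and not part of mPLUG or CLIP. CLIP itself names these models "ViT-L/14" and "ViT-B/16", so the file-style names from the question have to be translated first. Where mPLUG expects the resulting checkpoint files to live is not stated in this thread, so `download_root` is left for you to set.

```python
def clip_model_name(checkpoint_stem: str) -> str:
    """Map a file-style name such as "ViT-L-14" to CLIP's own "ViT-L/14".

    Hypothetical helper: it only rewrites the final "-<patch size>" part.
    """
    prefix, sep, patch = checkpoint_stem.rpartition("-")
    if not sep or not prefix.startswith("ViT") or not patch.isdigit():
        raise ValueError(f"not a ViT-style checkpoint name: {checkpoint_stem!r}")
    return f"{prefix}/{patch}"


def load_clip(checkpoint_stem: str, device: str = "cpu", download_root=None):
    """Load the matching model through the original CLIP package.

    Assumes CLIP was installed with:
        pip install git+https://github.com/openai/CLIP.git
    The import is kept inside the function so the name mapping above
    can be used without CLIP installed.
    """
    import clip

    # clip.load returns (model, preprocess); the weights are downloaded
    # into download_root (CLIP's default cache directory if None).
    return clip.load(
        clip_model_name(checkpoint_stem),
        device=device,
        download_root=download_root,
    )
```

Calling `load_clip("ViT-L-14", download_root="./checkpoints")` should then fetch the weights into that directory (the path here is only an example); whether mPLUG picks them up from there depends on how its config points to the CLIP checkpoint.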