openai / CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
MIT License
24.55k stars 3.2k forks

About rope-vit #454

Open CoinCheung opened 1 month ago

CoinCheung commented 1 month ago

Will future releases include rope-vit? I see in the paper that rope-vit has an advantage over the naive ViT.