openai / CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
MIT License
26.1k stars 3.33k forks

About rope-vit #454

Open CoinCheung opened 4 months ago

CoinCheung commented 4 months ago

Will future releases include RoPE-ViT? I see in the paper that RoPE-ViT has advantages over the naive ViT.