guilk / VLC

Research code for "Training Vision-Language Transformers from Captions Alone"