microsoft / FIBER

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
MIT License
126 stars, 11 forks