jfan1256 / distill-blip

Distill BLIP (knowledge distillation for image-text deep learning tasks). Supports pretraining and caption/retrieval finetuning with single-GPU or multi-GPU training, on-premises or on cloud VMs. Handles dataset preprocessing; the datasets (CC3M, COCO, Flickr30k, and VGO) are downloaded with img2dataset.
MIT License