train and release models

TheoCoombes / ClipCap

Using pretrained encoder and language models to generate captions from multimedia inputs.

Open rom1504 opened 2 years ago

rom1504 commented 2 years ago

To begin with, train on COCO and release the resulting model so that it can be loaded with clipcap.load_pretrained("coco_global_vit_b_32").
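
A rough sketch of how loading a model by name could be wired up, in case it helps the discussion. This is not the repo's actual code: `PretrainedSpec`, `_PRETRAINED`, `available_pretrained`, `resolve_pretrained` and the checkpoint URL are all hypothetical, and only the model name comes from the comment above. It covers just the name lookup; downloading and building the model are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainedSpec:
    """Hypothetical record describing one released checkpoint."""
    name: str
    dataset: str
    checkpoint_url: str  # placeholder value, not a real download location


# Hypothetical registry. Only the key "coco_global_vit_b_32" is taken from
# the issue; every other value here is an assumption for illustration.
_PRETRAINED = {
    "coco_global_vit_b_32": PretrainedSpec(
        name="coco_global_vit_b_32",
        dataset="coco",
        checkpoint_url="https://example.invalid/coco_global_vit_b_32.pt",
    ),
}


def available_pretrained():
    """Return the registered model names in sorted order."""
    return sorted(_PRETRAINED)


def resolve_pretrained(name):
    """Look up a released model by name, failing clearly on unknown names."""
    try:
        return _PRETRAINED[name]
    except KeyError:
        known = ", ".join(available_pretrained())
        raise ValueError(
            f"Unknown pretrained model {name!r}; available: {known}"
        ) from None


if __name__ == "__main__":
    spec = resolve_pretrained("coco_global_vit_b_32")
    print(spec.name, spec.dataset)
```

A load_pretrained function could call something like `resolve_pretrained` first and then fetch and load the weights, so that a typo in the name gives a list of valid names instead of a failed download.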