TheoCoombes / ClipCap

Using pretrained encoder and language models to generate captions from multimedia inputs.
95 stars, 13 forks

Inference metrics: Bleu, METEOR, ROUGE_L, CIDEr, SPICE #1

Closed: igor0 closed 2 years ago