isi-vista / unified-io-inference

Apache License 2.0

Fix Bug in VizWiz Testing #21

Closed danielnapierski closed 1 year ago

danielnapierski commented 1 year ago

I ran captioning on 500 VizWiz images from the validation set.

SPICE: 0.120
computing METEOR score...
METEOR: 0.179
Bleu_1: 0.546
Bleu_2: 0.362
Bleu_3: 0.231
Bleu_4: 0.140
ROUGE_L: 0.370
CIDEr: 0.263

I have a PR to submit. I'll run on images from the test set today.

danielnapierski commented 1 year ago

We can now run on the val or train split and score the results ourselves. Output for val:

SPICE evaluation took: 12.42 s
SPICE: 0.122
computing METEOR score...
METEOR: 0.170
Bleu_1: 0.553
Bleu_2: 0.365
Bleu_3: 0.228
Bleu_4: 0.136
ROUGE_L: 0.355
CIDEr: 0.341
SPICE: 0.122
METEOR: 0.170
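
To compare two runs side by side, the `NAME: value` lines in the logs above can be pulled into dictionaries. This is only a rough sketch: `parse_metrics` and `metric_deltas` are hypothetical helper names, not part of this repo, and the regex assumes the log format pasted in this thread.

```python
import re

# Matches lines such as "Bleu_1: 0.546" or "CIDEr: 0.263".
# Lines like "SPICE evaluation took: 12.42 s" or "computing METEOR score..."
# do not match, because the name may not contain spaces.
METRIC_LINE = re.compile(r"^\s*([A-Za-z_]+\d*)\s*:\s*([0-9]*\.?[0-9]+)\s*$")


def parse_metrics(log_text):
    """Return {metric_name: score}. A repeated metric keeps its last value."""
    scores = {}
    for line in log_text.splitlines():
        match = METRIC_LINE.match(line)
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


def metric_deltas(before, after):
    """Return after - before for every metric present in both runs."""
    return {name: after[name] - before[name] for name in before if name in after}


# Scores copied from the two comments in this thread.
FIRST_RUN = """SPICE: 0.120
computing METEOR score...
METEOR: 0.179
Bleu_1: 0.546
Bleu_2: 0.362
Bleu_3: 0.231
Bleu_4: 0.140
ROUGE_L: 0.370
CIDEr: 0.263"""

SECOND_RUN = """SPICE evaluation took: 12.42 s
SPICE: 0.122
computing METEOR score...
METEOR: 0.170
Bleu_1: 0.553
Bleu_2: 0.365
Bleu_3: 0.228
Bleu_4: 0.136
ROUGE_L: 0.355
CIDEr: 0.341
SPICE: 0.122
METEOR: 0.170"""

if __name__ == "__main__":
    deltas = metric_deltas(parse_metrics(FIRST_RUN), parse_metrics(SECOND_RUN))
    for name, delta in sorted(deltas.items()):
        print(f"{name}: {delta:+.3f}")
```

The two runs may not cover the same set of images, so the differences are a rough sanity check and not a controlled comparison.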

The test split can only be scored through the leaderboard. We can currently run on test images and generate output, but a submission must caption exactly 4000 test images to appear on the leaderboard.
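
Since the count has to be exact, a quick check before uploading could catch a short or duplicated output file. The sketch below assumes the output is a JSON list of objects with `image_id` and `caption` keys (a COCO-style results layout). That layout, the file name `test_captions.json`, and the helper `check_submission` are assumptions for illustration, not something confirmed in this thread.

```python
import json

EXPECTED_TEST_IMAGES = 4000  # count stated in this thread


def check_submission(entries, expected=EXPECTED_TEST_IMAGES):
    """Return a list of human-readable problems; an empty list means it looks OK.

    Assumes each entry is a dict with "image_id" and "caption" keys
    (an assumed layout, see the note above).
    """
    problems = []
    if len(entries) != expected:
        problems.append(f"expected {expected} entries, found {len(entries)}")

    ids = [entry.get("image_id") for entry in entries]
    if len(set(ids)) != len(ids):
        problems.append("duplicate image_id values")

    empty = sum(1 for entry in entries if not str(entry.get("caption") or "").strip())
    if empty:
        problems.append(f"{empty} entries with an empty caption")
    return problems


if __name__ == "__main__":
    # Hypothetical output path; replace with the real captions file.
    with open("test_captions.json") as handle:
        issues = check_submission(json.load(handle))
    print("OK" if not issues else "\n".join(issues))
```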