isi-vista / unified-io-inference

Apache License 2.0

Run captioning on VizWiz sample and score using VizWiz tools #17

Open danielnapierski opened 1 year ago

danielnapierski commented 1 year ago

Expectations

danielnapierski commented 1 year ago

First set of results from VizWiz. I used 100 images from the TRAIN set. Since the Unified-IO system was trained on this data, this is not a test result. I will work to get test results next; the format of the VizWiz dataset made it most straightforward to use the training set, because the images and associated captions were readily available. When I tried to use the test or validation set instead, I had trouble matching IDs to image files and captions. I will work on that and raise a question if I get blocked.
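For the ID-matching problem, something like the sketch below is one way to join image files to their reference captions. It assumes the VizWiz caption annotations follow the COCO caption schema (`images` entries with `id`/`file_name`, `annotations` entries with `image_id`/`caption`); the field names may need adjusting for the specific split.

```python
# Hedged sketch: group reference captions by image file name from a
# COCO-style caption annotation JSON. Field names ("images", "annotations",
# "image_id", "file_name", "caption") are assumptions based on the COCO
# caption schema -- verify against the VizWiz split you actually download.
import json
from collections import defaultdict

def captions_by_file(annotation_path: str) -> dict[str, list[str]]:
    with open(annotation_path) as f:
        data = json.load(f)
    # Map numeric image id -> file name, then group captions under the file.
    id_to_file = {img["id"]: img["file_name"] for img in data["images"]}
    grouped = defaultdict(list)
    for ann in data["annotations"]:
        grouped[id_to_file[ann["image_id"]]].append(ann["caption"])
    return dict(grouped)
```

If the validation/test annotations ship without captions (only image entries), the grouping step would come back empty, which may be the source of the matching trouble.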

Here are the results of running Unified-IO baseline captioning over 100 images from the VizWiz training set.

Bleu_1: 0.924
Bleu_2: 0.906
Bleu_3: 0.893
Bleu_4: 0.882
ROUGE_L: 0.896
CIDEr: 2.480
SPICE: 0.372
METEOR: 0.604
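As a reference point for what these numbers mean, here is a minimal, self-contained illustration of BLEU-1 (unigram precision with clipping and a brevity penalty). This is a simplified sketch for intuition only; it is not the pycocoevalcap implementation that eval-vizwiz.py uses, which also handles higher-order n-grams and corpus-level aggregation.

```python
# Simplified BLEU-1: clipped unigram precision times a brevity penalty.
# For intuition only -- not the scorer used by eval-vizwiz.py.
import math
from collections import Counter

def bleu1(candidate: str, references: list[str]) -> float:
    cand = candidate.lower().split()
    cand_counts = Counter(cand)
    # Clip each unigram count by its max count in any single reference.
    max_ref = Counter()
    for ref in references:
        for tok, n in Counter(ref.lower().split()).items():
            max_ref[tok] = max(max_ref[tok], n)
    clipped = sum(min(n, max_ref[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand) if cand else 0.0
    # Brevity penalty against the closest reference length.
    ref_len = min((len(r.split()) for r in references),
                  key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / max(len(cand), 1))
    return bp * precision

print(round(bleu1("a cat on the mat", ["a cat sits on the mat"]), 3))  # 0.819
```

Scores this high (Bleu_1 of 0.924) are consistent with the caveat above: the model has seen these exact images and captions during training.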

This was calculated using eval-vizwiz.py. I will create a PR and link it here. I will also document the different scoring metrics. @marjorief @elizlee @yash-reddy

danielnapierski commented 1 year ago

PR (in progress while I dig into test/validation and change some hardcoded paths to variables): https://github.com/isi-vista/unified-io-inference/pull/18

danielnapierski commented 1 year ago

Two references on the scoring metrics: