google-research-datasets / conceptual-captions

Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image captioning systems.

Data for Metrics on the Flickr 1K Test #7

Closed josephius closed 3 years ago

josephius commented 5 years ago

Would it be possible to release just the data (the raw output captions generated by the models) used to make Table 4 and Table 7 from the paper? I have an idea for a new automatic metric, and would like to test whether it does a better job of capturing human-like evaluations or performs more like the usual automatic metrics.

sharma-piyush commented 5 years ago

Hi,

Sorry for the late response.

The Conceptual Captions test set is hidden, so we are unable to release model outputs on that set. https://github.com/google-research-datasets/conceptual-captions#hidden-test-set

There is a workshop at CVPR'19. We will release the models' outputs and human judgments for the T2 test set in June '19, which might be helpful for your work. Please see this page for more info: http://www.conceptualcaptions.com/challenge

Regards, Piyush

sharma-piyush commented 5 years ago

Sorry, I misunderstood your question. You are asking about Flickr1k Test, and not Conceptual Captions Test (Table 7). I will get back to you on this shortly.

Regarding the human evals for Table 4, we are unable to release individual caption ratings on the Flickr1k test set at this time. But as mentioned above, we plan to release outputs from multiple models on a different set of 1K test images, along with the corresponding human judgments, in June.