Wangt-CN / EqBen

[ICCV'23 Oral] The introduction and toolkit for EqBen Benchmark
Apache License 2.0

Is there a validation set for EqBen? #1

Closed linzhiqiu closed 1 year ago

linzhiqiu commented 1 year ago

And why is the zip file so big? Is there a way to reduce its size, e.g., by downsampling the images? Otherwise it is very costly to run experiments on our small server.

linzhiqiu commented 1 year ago

Also, is it possible to release the dataset in Winoground format (e.g., each example consists of image_0, image_1, caption_0, caption_1)?
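For reference, here is a minimal sketch of what a Winoground-style example and its scoring could look like. The field names follow the ones listed above. The `WinogroundSample` class, the `winoground_scores` function, and the `score_fn` argument are hypothetical names chosen for illustration and are not part of the EqBen toolkit. The text/image/group criteria are the ones used by Winoground, and `score_fn` stands in for whatever image-text similarity a model provides.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WinogroundSample:
    """One example: two images and two captions, where caption_k matches image_k."""
    image_0: str  # path or identifier of the first image
    image_1: str  # path or identifier of the second image
    caption_0: str
    caption_1: str


def winoground_scores(
    sample: WinogroundSample,
    score_fn: Callable[[str, str], float],
) -> Dict[str, bool]:
    """Evaluate one example with a similarity function score_fn(caption, image)."""
    s_c0_i0 = score_fn(sample.caption_0, sample.image_0)
    s_c0_i1 = score_fn(sample.caption_0, sample.image_1)
    s_c1_i0 = score_fn(sample.caption_1, sample.image_0)
    s_c1_i1 = score_fn(sample.caption_1, sample.image_1)

    # Text score: for each image, the matching caption must score higher.
    text_ok = s_c0_i0 > s_c1_i0 and s_c1_i1 > s_c0_i1
    # Image score: for each caption, the matching image must score higher.
    image_ok = s_c0_i0 > s_c0_i1 and s_c1_i1 > s_c1_i0
    # Group score: both conditions must hold at once.
    return {"text": text_ok, "image": image_ok, "group": text_ok and image_ok}
```

A model that assigns the same similarity to every pair fails all three criteria, because the comparisons are strict.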

Wangt-CN commented 1 year ago

Hi Zhiqiu,

Thanks a lot for your interest in our work. Indeed, our full data annotation is not publicly available. You can use the annotation file and the example code to get the results and upload them to the CodaLab server.

Thanks for letting us know about the high cost of handling the current data scale on your server. We are actively working on releasing a public subset with labels, sampled from the full test set (probably around 20%), in a few days to make visualization and validation more efficient.

linzhiqiu commented 1 year ago

Thanks for the update! Could you explain what 'private_info' means in your annotation file? Is that the ground truth label?

I think you should be able to convert the images to JPEG format and save a lot of storage without hurting performance. 100 GB is quite expensive for most academic labs.
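A minimal sketch of such a conversion, assuming the images are stored as PNG files and that Pillow is installed. The directory arguments, the `max_side` and `quality` defaults, and the function names are hypothetical. JPEG is lossy, so whether model performance is really unaffected would need to be checked on a few samples first.

```python
from pathlib import Path
from typing import Tuple


def scaled_size(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Return a size whose longer side is at most max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def convert_to_jpeg(src_dir: str, dst_dir: str, max_side: int = 512, quality: int = 90) -> int:
    """Convert every PNG under src_dir to a downsampled JPEG under dst_dir.

    Returns the number of converted files. The directory layout is mirrored.
    """
    # Imported here so that scaled_size can be used without Pillow installed.
    from PIL import Image

    src_root, dst_root = Path(src_dir), Path(dst_dir)
    converted = 0
    for src in sorted(src_root.rglob("*.png")):
        dst = (dst_root / src.relative_to(src_root)).with_suffix(".jpg")
        dst.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src) as img:
            rgb = img.convert("RGB")  # JPEG has no alpha channel
            new_size = scaled_size(rgb.width, rgb.height, max_side)
            rgb.resize(new_size, Image.LANCZOS).save(dst, format="JPEG", quality=quality)
        converted += 1
    return converted
```

Note that file names change from `.png` to `.jpg`, so any paths stored in the annotation file would need the same suffix change.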

Wangt-CN commented 1 year ago

Yes, it means the ground truth label. We have just uploaded the EqBen subset (25K image-text pairs) for ease of validation and visualization. Please check it out.

If you have any other questions, feel free to re-open this issue.