Hello,
First of all, thanks for the interesting work and the useful benchmark!
I have a new text-to-image generative model, and I want to make sure that I evaluate and compare it properly (as I aim to report the results in a paper). First, I generated the 300 samples from texture_val.txt, along with their indices. Next, I ran the BLIP_vqa script.
I noticed that the results change significantly depending on whether the file names include the index number (caption_index.png). Is this expected? If so, should the index follow the order of the prompts in the txt files?
Thanks a lot!