Hi, @jshilong @PeizeSun @ShoufaChen
I have some questions about "Table 4: Comparison of region caption ability on the validation dataset on Visual Genome".
Did you create the validation split for the VG region captioning task yourselves?
In the original VG dataset, it seems that there is no validation split.
Could you please share a link or a README for the validation dataset?
Did you reproduce the GRiT results yourselves?
The GRiT paper does not appear to report a corresponding result (e.g., CIDEr on a validation set for VG region captioning).
Could you provide more details about this experiment?
Thank you in advance.