Open SaulZhang opened 1 year ago
Hi @SaulZhang, sorry for the delayed reply; I was busy with ICCV last week. The VIST dataset does contain duplicate images. This is because the same visual story can have multiple captions, so your downloaded images are actually correct. Unfortunately, I do not have a vist.h5 file, because we use a nas table at Alibaba for data storage, and the h5 scripts were only written to help users speed up IO. Hope this helps!
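If it helps to verify this locally, here is a minimal sketch (not part of the repository) that separates the two cases: images that are referenced by more than one story, and images that are referenced but missing on disk. The function name `summarize_images`, the `(story_id, image_id)` record layout, and the assumption that each downloaded file is named `<image_id>.<ext>` are all hypothetical; adapt them to however your annotations and download folder are actually organized.

```python
from collections import defaultdict
from pathlib import Path


def summarize_images(records, image_dir):
    """Report images shared across stories and images missing on disk.

    records: iterable of (story_id, image_id) pairs (hypothetical layout).
    image_dir: folder whose files are assumed to be named <image_id>.<ext>.
    """
    # Map each image ID to the set of stories that reference it.
    stories_per_image = defaultdict(set)
    for story_id, image_id in records:
        stories_per_image[str(image_id)].add(str(story_id))

    # Image IDs present on disk, taken from file names without the extension.
    downloaded = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}

    # Images referenced by more than one story: expected repetition.
    shared = {
        image_id: sorted(stories)
        for image_id, stories in stories_per_image.items()
        if len(stories) > 1
    }
    # Images referenced by the annotations but absent from the folder.
    missing = sorted(set(stories_per_image) - downloaded)
    return shared, missing
```

If `missing` comes back empty while `shared` is not, the repeated images come from the annotations themselves rather than from failed downloads.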
Hi @xichenpan, when I tried to reproduce the experiment on the `VIST` dataset, I noticed that there are numerous duplicate story images in the testing set, as illustrated in the figure below, although their text descriptions differ. Is this because some image URLs were inaccessible during the download process? I used the `vist_img_download.py` script to download a total of `184011` images, but I am unsure whether some images are missing. Would it be possible for you to share the `vist.h5` file through Google Drive?

![QQ截图20230306160245](https://user-images.githubusercontent.com/25783335/223054029-616db720-b15c-4b07-b18b-24401238b1b4.png)