xichenpan / ARLDM

Official Pytorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
https://arxiv.org/abs/2211.10950
MIT License

Regarding the data of the VIST Dataset #13

Open SaulZhang opened 1 year ago

SaulZhang commented 1 year ago

Hi @xichenpan, when I tried to reproduce the experiment on the VIST dataset, I noticed that the test set contains numerous duplicate story images, as shown in the screenshot below, although their text descriptions differ. Is this because some image URLs were inaccessible during the download? I used the vist_img_download.py script to download a total of 184011 images, but I am not sure whether some images are missing. Would it be possible for you to share the vist.h5 file through Google Drive?

[Image: QQ screenshot 20230306160245]

xichenpan commented 1 year ago

Hi @SaulZhang, sorry for the delayed reply, I was busy with ICCV last week. The VIST dataset does contain duplicate images. This is because the same visual story can have multiple captions, so the images you downloaded are actually correct. Unfortunately, I do not have a vist.h5 file, because we use a NAS table at Alibaba for data storage, and the h5 scripts were written only to help users speed up I/O. Hope this helps!
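
If you want to confirm that the repeated images come from multiple captions for the same story rather than from failed downloads, a rough check is to group stories by their image sequence. The snippet below is only a minimal sketch: the flat `(story_id, image_ids, caption)` record layout and both function names are assumptions made for illustration, not the VIST annotation schema or anything in this repo, so you would need to adapt the loading step to your own files.

```python
from collections import defaultdict


def group_stories_by_images(records):
    """Map each image sequence to the list of story ids that use it.

    `records` is an iterable of (story_id, image_ids, caption) tuples.
    This flat layout is a hypothetical format chosen for illustration.
    """
    groups = defaultdict(list)
    for story_id, image_ids, _caption in records:
        # Tuples are hashable, so the ordered image sequence can be a dict key.
        groups[tuple(image_ids)].append(story_id)
    return dict(groups)


def shared_sequences(records):
    """Return only the image sequences that more than one story uses."""
    grouped = group_stories_by_images(records)
    return {seq: ids for seq, ids in grouped.items() if len(ids) > 1}


# Toy data (made up): two stories share the same images but differ in text.
records = [
    ("s1", ["a", "b", "c"], "We went to the beach."),
    ("s2", ["a", "b", "c"], "A sunny day by the sea."),
    ("s3", ["d", "e", "f"], "The party started late."),
]

shared = shared_sequences(records)
for seq, story_ids in shared.items():
    print(f"images {seq} are used by stories {story_ids}")
```

If most of the duplicates you see fall into groups like this, with the same image sequence and different captions, they are expected and not a sign of missing downloads.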