Open bhattg opened 1 year ago
Thanks for pointing this out. A small number of image files in our dataset do not exist. We will fix this in the next version; for now, you can skip these invalid images.
Hi,
I am trying to reproduce the 1k and 4k numbers for the ImageReward function accuracy, as reported in the paper. To do so, I downloaded the data and modified it slightly so that it could be loaded with the script `make_dataset.py`. However, some file IDs in the training set have null images, that is, files with a size of 0K. The IDs are listed below.
005050-0024 005389-0008 005795-0038 006272-0041 006756-0071 005165-0028 005332-0172 005356-0019 006011-0030 006167-0087 006758-0099 005179-0097 005444-0063 005434-0068 005459-0003 005344-0055 006174-0048 006190-0114 006214-0021 006787-0015 006857-0073 006830-0003
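For anyone hitting the same problem, here is a minimal sketch of how the empty files could be detected and skipped before building the dataset. It is not part of the ImageReward repository. It assumes that each image's ID is its file name without the extension, and the record key `image_id` is a hypothetical field name that may not match the actual JSON layout.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset actually uses.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def find_empty_images(root):
    """Return the sorted IDs (file stems) of 0-byte image files under root."""
    root = Path(root)
    return sorted(
        p.stem
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in IMAGE_SUFFIXES
        and p.stat().st_size == 0
    )


def drop_invalid(records, bad_ids, key="image_id"):
    """Keep only the records whose ID is not in bad_ids.

    `key` is a hypothetical field name, not taken from the dataset.
    """
    bad = set(bad_ids)
    return [r for r in records if r.get(key) not in bad]
```

Calling `find_empty_images` on the image folder should give a list comparable to the one above, and `drop_invalid` can then filter the loaded training records before they are passed on.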
Hello, I am also working on reproducing the training results, but the 'train.json' file on Hugging Face does not seem to be directly usable with make_dataset.py. Could you share the processed train.json file? Many thanks!