Closed by Espere-1119-Song 1 month ago
It is a good question.
I have uploaded a JSON file to Google Drive. It is a dictionary in which each 'image_path' is a key and the corresponding 'url' is its value:
https://drive.google.com/file/d/1iRaYLxrW_pHODzMpvIvjSNfpsM_9OGiL/view?usp=sharing
We will also release CC12M this week.
Best,
Kecheng
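For anyone using the JSON file described above, here is a minimal sketch of loading it and looking up the URL for one image path. The sample entries and the example.com URLs are made up for illustration, and the helper name `load_path2url` is hypothetical; only the 'image_path' to 'url' dictionary layout comes from the description above.

```python
import json


def load_path2url(json_text: str) -> dict:
    """Parse JSON text that maps image_path (key) to url (value)."""
    mapping = json.loads(json_text)
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object mapping image_path to url")
    return mapping


# Made-up sample in the described layout; the real file is much larger.
sample = (
    '{"0000000/0000008.jpg": "https://example.com/a.jpg", '
    '"0000000/0000009.jpg": "https://example.com/b.jpg"}'
)

path2url = load_path2url(sample)
print(path2url["0000000/0000008.jpg"])
```

For the real file, you would read its contents from disk and pass the text to the same function.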
Thank you for your great work. I tried to use your Google Drive link, but it shows me the message below. Could you share a valid link for "cc3m_path2url.json" again?
We have updated the CC3M and CC12M files as follows: three types of short captions have been added, and we have replaced 'path' with the image 'url' in these CSV files.
CC3M: https://drive.google.com/file/d/1RPcFS8jrVolA9RzHXD581E8BxR7jYDap/view?usp=sharing
CC12M: https://drive.google.com/file/d/12iUhceznPNWd-l_bGSF5rSnzdruP4Jtr/view?usp=sharing
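Since the CSV files are now keyed by image 'url', one way to use them is to index the rows by URL. This is only a sketch: the 'url' and 'raw_caption' column names appear in this thread, but 'short_caption' is a placeholder (the actual names of the short-caption columns are not stated here), and the sample rows are made up.

```python
import csv
import io


def index_rows_by_url(csv_text: str) -> dict:
    """Return {url: row}, keeping the first row seen for each URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    by_url = {}
    for row in reader:
        by_url.setdefault(row["url"], row)
    return by_url


# Made-up sample; 'short_caption' is a hypothetical column name.
sample_csv = (
    "url,raw_caption,short_caption\n"
    'https://example.com/a.jpg,"a long caption, with a comma",short a\n'
    "https://example.com/b.jpg,another caption,short b\n"
)

rows = index_rows_by_url(sample_csv)
print(rows["https://example.com/a.jpg"]["raw_caption"])
```

Keeping only the first row per URL is a design choice for the sketch; if the same URL can legitimately appear more than once, collect the rows in a list instead.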
Hi, thanks for your great contribution of the dataset. After downloading CC3M, I found that the indexing of my images seems to differ from yours.
For example, for '0000000/0000008.jpg', my caption is "# of the sports team skates against sports team during their game .", while your 'raw_caption' is "modern luxe has a very simple look , and offers a bold monogram of the couple 's initials .". I downloaded CC3M via img2dataset.
I think providing another file that maps each 'image_path' to its respective 'url' would be a potential solution. Could you provide it? Thanks!
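To illustrate why a path-to-URL file would help: if both sides know the URL behind each image path, the two indexings can be aligned through the URL instead of the path. Below is a rough sketch of that idea. It assumes you can recover a local path-to-URL dictionary for your own download (for instance from metadata saved by your download tool, which is an assumption about your setup); the function name and sample data are made up.

```python
def remap_local_paths(local_path2url: dict, reference_path2url: dict):
    """Map each local image path to the reference path that shares its URL.

    Returns (remapped, unmatched): remapped is {local_path: reference_path},
    unmatched lists local paths whose URL is absent from the reference.
    """
    # Invert the reference mapping; keep the first path for a repeated URL.
    url2ref = {}
    for ref_path, url in reference_path2url.items():
        url2ref.setdefault(url, ref_path)

    remapped, unmatched = {}, []
    for local_path, url in local_path2url.items():
        ref_path = url2ref.get(url)
        if ref_path is None:
            unmatched.append(local_path)
        else:
            remapped[local_path] = ref_path
    return remapped, unmatched


# Made-up data: the same two URLs, indexed differently on each side.
reference = {
    "0000000/0000008.jpg": "https://example.com/a.jpg",
    "0000000/0000009.jpg": "https://example.com/b.jpg",
}
local = {
    "0000000/0000008.jpg": "https://example.com/b.jpg",
    "0000000/0000012.jpg": "https://example.com/a.jpg",
    "0000000/0000013.jpg": "https://example.com/missing.jpg",
}

remapped, unmatched = remap_local_paths(local, reference)
print(remapped)
print(unmatched)
```

Images that failed to download on either side end up in `unmatched` or are simply missing from the result, so the counts may not line up exactly.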