phellonchen / X-LLM

X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
https://x-llm.github.io
Apache License 2.0

how to link the image id in DATA JSON to the image in IMAGE URLS for WuKong #11

Open ghost opened 1 year ago

ghost commented 1 year ago

Hi, I have downloaded the Wukong data from the URL provided in https://github.com/phellonchen/X-LLM/blob/main/README_DATA.md. The order of samples in the CSV files is not consistent with the image ids/names in the JSON file, so how can I link the filtered image names back to the original image URLs? @MingLunHan @phellonchen

rumusan commented 1 year ago

Same question for CC3M.

phellonchen commented 1 year ago

For the Wukong dataset, we filtered the first 50 million images using Chinese-CLIP (ViT-B-16 model) and kept only samples with a visual-textual similarity score greater than 0.475. So, to link a filtered sample back to its original image URL, you will need to match on the caption text.

For CC3M, we will try to restore their original correspondence.
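
The caption-based pairing described for Wukong could look roughly like the sketch below. It is only an illustration, not the authors' script: the CSV column names (`url`, `caption`) and the JSON keys (`image_id`, `caption`) are assumptions and should be changed to whatever the downloaded files actually use. Captions are compared after collapsing whitespace, and a caption that appears on several CSV rows returns every candidate URL, since the caption alone cannot tell them apart.

```python
import csv
import io
import json


def normalize(caption):
    """Collapse runs of whitespace so trivially different captions still match."""
    return " ".join(caption.split())


def link_by_caption(csv_text, json_text,
                    url_col="url", caption_col="caption",    # assumed CSV columns
                    id_key="image_id", caption_key="caption"):  # assumed JSON keys
    """Map each filtered image id to the original URLs that share its caption."""
    # Index the original rows by normalized caption; a caption may repeat.
    urls_by_caption = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = normalize(row[caption_col])
        urls_by_caption.setdefault(key, []).append(row[url_col])

    # Look up every filtered sample; unmatched captions get an empty list.
    linked = {}
    for item in json.loads(json_text):
        key = normalize(item[caption_key])
        linked[item[id_key]] = urls_by_caption.get(key, [])
    return linked


if __name__ == "__main__":
    # Tiny made-up sample standing in for the real CSV and JSON files.
    sample_csv = (
        "url,caption\n"
        "http://example.com/a.jpg,a cat on a sofa\n"
        "http://example.com/b.jpg,a dog in the park\n"
    )
    sample_json = json.dumps([
        {"image_id": "000001", "caption": "a dog  in the park"},
        {"image_id": "000002", "caption": "no such caption"},
    ])
    print(link_by_caption(sample_csv, sample_json))
```

For the real files, read each CSV shard and the JSON file from disk and pass their contents in. With 50 million rows, it may be more practical to build the caption index one shard at a time.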