Could you tell me where these datasets can be downloaded? I noticed the paper says: "We filtered the samples in these datasets based on image resolution, aspect ratio, and visual-textual similarity. We randomly place images or text at the forefront, in order to achieve the generation of captions based on images and vice versa."
If possible, would you consider open-sourcing the training data? Thank you very much!
Hello, and thank you for open-sourcing this excellent work! I have a question about the data_dir: entry in SEED/MultiModalLLM/configs/data/caption_torchdata_preprocess.yaml.