buxiangzhiren / DDCap

MIT License

Question about switching datasets #3

Closed Leng-bingo closed 1 year ago

Leng-bingo commented 1 year ago

If I want to switch to a dataset of my own, do I only need to match the following format?

│MSCOCO_Caption/
├──annotations/
│  ├── captions_train2014.json
│  ├── captions_val2014.json
├──train2014/
│  ├── COCO_train2014_000000000009.jpg
│  ├── ......
├──val2014/ 
│  ├── COCO_val2014_000000000042.jpg
│  ├── ......
Leng-bingo commented 1 year ago

Also, where can I download the pre-split JSON files below, and what does their format look like? Thanks again.

├──annotations/
│  ├── captions_train2014.json
│  ├── captions_val2014.json
buxiangzhiren commented 1 year ago

You need to generate the .pkl file yourself; it just needs to contain image-text pairs. As for the JSON files, you can load one yourself and take a look.

buxiangzhiren commented 1 year ago

The JSON files are provided by the COCO dataset itself.
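
For anyone who wants to look inside before adapting their own data: the COCO caption annotation files are JSON objects with an `images` list (each entry has an `id` and a `file_name`) and an `annotations` list (each entry has an `image_id` and a `caption`). Below is a minimal sketch of pairing them up. The function names `pairs_from_coco` and `load_pairs` are hypothetical, and `SAMPLE` is a made-up stand-in with placeholder captions, not real COCO data.

```python
import json

# Tiny made-up sample in the COCO captions layout (captions are placeholders).
SAMPLE = {
    "images": [
        {"id": 9, "file_name": "COCO_train2014_000000000009.jpg"},
    ],
    "annotations": [
        {"id": 1, "image_id": 9, "caption": "placeholder caption one "},
        {"id": 2, "image_id": 9, "caption": "placeholder caption two"},
    ],
}


def pairs_from_coco(data):
    """Return (file_name, caption) pairs from a parsed COCO captions dict."""
    # Map each image id to its file name so annotations can be resolved.
    id_to_file = {img["id"]: img["file_name"] for img in data["images"]}
    return [
        (id_to_file[ann["image_id"]], ann["caption"].strip())
        for ann in data["annotations"]
    ]


def load_pairs(annotation_path):
    """Read a captions JSON file (e.g. captions_train2014.json) from disk."""
    with open(annotation_path, "r", encoding="utf-8") as f:
        return pairs_from_coco(json.load(f))


if __name__ == "__main__":
    for file_name, caption in pairs_from_coco(SAMPLE):
        print(file_name, "->", caption)
```

A custom dataset written in the same two-list layout should be readable the same way; whether the rest of the DDCap pipeline needs additional fields is not stated in this thread.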

rongtongxueya commented 1 year ago

Hi. I ran into the same problem, but I don't know how to generate the .pkl file. When I printed its contents, I saw that the .pkl file has an extra clip_embedding field compared with the JSON file, and I don't know how to obtain this embedding. Could someone kindly give me some advice?
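
One possible way to assemble such a file, offered only as a sketch: the thread confirms that the .pkl holds image-text pairs plus a `clip_embedding`, but not its exact layout. The layout below (a dict with a `clip_embedding` list and a `captions` list whose entries index into it) is an assumption, as are the names `build_caption_records` and `save_pickle`. The `encode_image` argument is a placeholder for whatever image encoder you use; check the repository's own dataset-loading code for the fields it actually reads.

```python
import pickle


def build_caption_records(pairs, encode_image):
    """Build a picklable dict from (image_path, caption) pairs.

    `encode_image` is any callable mapping an image path to an embedding.
    ASSUMPTION: the output layout here is illustrative, not confirmed by
    the DDCap repository.
    """
    embeddings = []
    captions = []
    for image_path, caption in pairs:
        embeddings.append(encode_image(image_path))
        captions.append(
            {
                "image_path": image_path,
                "caption": caption,
                # Index of this pair's embedding in the embeddings list.
                "clip_embedding": len(embeddings) - 1,
            }
        )
    return {"clip_embedding": embeddings, "captions": captions}


def save_pickle(records, out_path):
    """Write the records to disk with pickle."""
    with open(out_path, "wb") as f:
        pickle.dump(records, f)


if __name__ == "__main__":
    # Dummy encoder so the sketch runs without any model; replace it with
    # a real CLIP image encoder to get meaningful embeddings.
    def dummy_encoder(path):
        return [float(len(path))]

    demo = build_caption_records([("a.jpg", "placeholder caption")], dummy_encoder)
    print(demo["captions"][0])
```

With the OpenAI `clip` package, `encode_image` would typically wrap `clip.load(...)` and the returned model's `encode_image` method applied to a preprocessed image; that part is left out here so the sketch stays runnable without model weights.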