buxiangzhiren / DDCap

MIT License

Questions about the pkl file #5

Closed Leng-bingo closed 1 year ago

Leng-bingo commented 1 year ago

In the pkl file, each caption has a 512-dimensional image matrix attached to it. How was that matrix encoded? Thank you very much!

buxiangzhiren commented 1 year ago

That matrix is not used anywhere in the code, so you can ignore it.

Leng-bingo commented 1 year ago

I want to switch to my own dataset. Does that mean I don't need to generate this pkl file?

buxiangzhiren commented 1 year ago

You do need the pkl file, but you don't need the 512-dimensional features inside it.

Leng-bingo commented 1 year ago

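So I just need to map the captions to the images in my own dataset, then replace the image directory and the corresponding json file, following the same format.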

Thank you very much, and sorry for the trouble.

buxiangzhiren commented 1 year ago

That's right. You're welcome.
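A minimal sketch of the workflow confirmed above, for readers adapting their own dataset. It writes caption records to a pkl file and fills the unused 512-dimensional features with zeros. The top-level key `captions`, the per-record `clip_embedding` row index, and the helper names `load_records` and `build_caption_pkl` are assumptions for illustration, not the confirmed DDCap layout; print the released pkl and adjust the keys (and the container type, for example a tensor instead of nested lists) to match.

```python
import json
import pickle


def load_records(json_path):
    """Load caption records; ASSUMES the file is a JSON list of dicts."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


def build_caption_pkl(records, out_path, embedding_dim=512):
    """Write caption records to a pkl file with placeholder features.

    ASSUMPTION: the keys "captions" and "clip_embedding" and the per-record
    "clip_embedding" index are a guess at the layout, not a confirmed format.
    """
    captions = []
    placeholder = []
    for index, record in enumerate(records):
        captions.append({
            "image_id": record["image_id"],
            "id": record["id"],
            "caption": record["caption"],
            "clip_embedding": index,  # row index into the placeholder matrix
        })
        # Zeros stand in for the 512-dim features, which the thread says are unused.
        placeholder.append([0.0] * embedding_dim)
    with open(out_path, "wb") as f:
        pickle.dump({"clip_embedding": placeholder, "captions": captions}, f)
    return len(captions)
```

Usage would look like `build_caption_pkl(load_records("my_captions.json"), "my_train.pkl")`, where both file names are hypothetical.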

Leng-bingo commented 1 year ago

I'll go try it now, thanks~

Leng-bingo commented 1 year ago

One more small question: is your json in the format below, or in the format that also includes the tokenized words of each caption?

{"image_id": 133071,"id": 829693,"caption": "White Plate with a lot of guacamole and an extra large dollop of sour cream over meat"},{"image_id": 133071,"id": 829717,"caption": "A dinner plate has a lemon wedge garnishment."}

buxiangzhiren commented 1 year ago

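Another possible issue: validation computes CIDEr with the COCO evaluation code, so you may need to modify that part as well. Alternatively, you can write your own code that takes the generated captions and computes CIDEr directly.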
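For readers who want to score generated captions without the COCO evaluation code, here is a rough, dependency-free sketch of a simplified CIDEr. It is an assumption-laden stand-in, not the COCO implementation: it splits on whitespace instead of using the PTB tokenizer, does no stemming, and omits the length penalty and count clipping of CIDEr-D, so its numbers will not match the official scorer. The function names `ngram_counts` and `simple_cider` are made up for this example.

```python
import math
from collections import Counter, defaultdict


def ngram_counts(text, max_n=4):
    """Count all 1..max_n-grams of a lowercased, whitespace-split caption."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def simple_cider(references, candidates, max_n=4):
    """Simplified CIDEr.

    references: {image_id: [caption, ...]}; candidates: {image_id: caption}.
    """
    ref_counts = {img: [ngram_counts(c, max_n) for c in caps]
                  for img, caps in references.items()}
    # Document frequency: number of images whose references contain the n-gram.
    doc_freq = defaultdict(int)
    for counts_list in ref_counts.values():
        for ngram in set().union(*counts_list):
            doc_freq[ngram] += 1
    log_num_images = math.log(len(ref_counts))

    def tfidf(counts):
        # One sparse TF-IDF vector per n-gram order.
        vecs = [dict() for _ in range(max_n)]
        for ngram, tf in counts.items():
            idf = log_num_images - math.log(max(1, doc_freq.get(ngram, 0)))
            vecs[len(ngram) - 1][ngram] = tf * idf
        return vecs

    def cosine(a, b):
        dot = sum(w * b.get(g, 0.0) for g, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm > 0 else 0.0

    scores = []
    for img, caption in candidates.items():
        cand = tfidf(ngram_counts(caption, max_n))
        refs = [tfidf(c) for c in ref_counts[img]]
        per_n = [sum(cosine(cand[n], r[n]) for r in refs) / len(refs)
                 for n in range(max_n)]
        scores.append(10.0 * sum(per_n) / max_n)
    return sum(scores) / len(scores)
```

The IDF weights depend on the whole reference set, so the score is only meaningful when computed over a full validation split rather than a handful of images.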

Leng-bingo commented 1 year ago

I haven't gotten to that step yet (灬ꈍ ꈍ灬). I'll get it running first!

buxiangzhiren commented 1 year ago

One more small question: is your json in the format below, or in the format that also includes the tokenized words of each caption?

{"image_id": 133071,"id": 829693,"caption": "White Plate with a lot of guacamole and an extra large dollop of sour cream over meat"},{"image_id": 133071,"id": 829717,"caption": "A dinner plate has a lemon wedge garnishment."}

It is the format you posted.
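As a small illustration of the confirmed flat format, the sketch below checks that each record carries `image_id`, `id`, and `caption`, then groups captions by image. Wrapping the records in a JSON list and the helper name `group_captions` are assumptions made for this example.

```python
import json
from collections import defaultdict

REQUIRED_KEYS = ("image_id", "id", "caption")


def group_captions(records):
    """Validate flat caption records and group caption strings by image_id."""
    grouped = defaultdict(list)
    for record in records:
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"record {record!r} is missing keys: {missing}")
        grouped[record["image_id"]].append(record["caption"])
    return dict(grouped)


# The two records quoted in this thread, wrapped in a list (an assumption).
sample = json.loads(
    '[{"image_id": 133071, "id": 829693, "caption": "White Plate with a lot of '
    'guacamole and an extra large dollop of sour cream over meat"},'
    ' {"image_id": 133071, "id": 829717, "caption": "A dinner plate has a lemon '
    'wedge garnishment."}]'
)
captions_by_image = group_captions(sample)
```

No tokenized word list is needed in the records; each caption stays a plain string.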

Leng-bingo commented 1 year ago

Thanks! I hope you publish another CVPR paper!

rongtongxueya commented 1 year ago

Thanks! I hope you publish another CVPR paper!

I ran into the same problem, but I don't know how to generate the pkl. When I printed its contents, I saw that the pkl file has a clip_embedding entry that the json file does not have. I don't know how to obtain this embedding. Could someone kindly give me some advice?