Closed: Leng-bingo closed this issue 1 year ago
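That matrix isn't used anywhere in the code, so you can ignore it.
I want to switch to my own dataset. Doesn't this pkl file need to be generated?
It does, but the 512-dimensional features inside it are not needed.
So I just map each image's captions onto my own dataset, then replace the image directory and the corresponding JSON file, following the same format.
Thank you very much, and sorry for the trouble.
That's right. You're welcome.
I'll go try it now, thanks~
One more small question: is your JSON format like the one below, or is it the kind that also includes per-word tokenization?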
{"image_id": 133071,"id": 829693,"caption": "White Plate with a lot of guacamole and an extra large dollop of sour cream over meat"},{"image_id": 133071,"id": 829717,"caption": "A dinner plate has a lemon wedge garnishment."}
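Another possible issue: validation computes CIDEr using the COCO evaluation code, so you may need to change that part as well. Alternatively, you can write your own code that computes CIDEr directly from the generated results.
Haven't gotten that far yet (灬ꈍ ꈍ灬), I'll get it running first!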
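For the second option (computing CIDEr yourself from the generated captions), a minimal sketch might look like the following. It is not the COCO toolkit's implementation: it uses a simple regex tokenizer instead of the PTB tokenizer, implements plain TF-IDF cosine similarity over 1- to 4-grams, and does not apply the toolkit's scaling, so the values are not directly comparable to COCO-reported numbers. The function names and the input layout (`references` and `hypotheses` dicts keyed by image id) are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplification: lowercase and keep alphanumeric runs (not the PTB tokenizer).
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    # Count every contiguous n-gram; empty if the caption has fewer than n tokens.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(u, v):
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm > 0 else 0.0


def cider(references, hypotheses, max_n=4):
    """references: {image_id: [caption, ...]} with at least one caption per image.
    hypotheses: {image_id: generated_caption}.
    Returns the mean (unscaled) CIDEr over the images in hypotheses."""
    num_images = len(references)
    total = 0.0
    for n in range(1, max_n + 1):
        # Document frequency: in how many images' reference sets each n-gram occurs.
        doc_freq = Counter()
        ref_counts = {}
        for image_id, captions in references.items():
            counts = [ngram_counts(tokenize(c), n) for c in captions]
            ref_counts[image_id] = counts
            doc_freq.update(set().union(*counts))

        def tfidf(counts):
            # Raw counts suffice for TF because cosine similarity is scale-invariant.
            return {g: c * math.log(num_images / max(1, doc_freq[g]))
                    for g, c in counts.items()}

        for image_id, hyp in hypotheses.items():
            hyp_vec = tfidf(ngram_counts(tokenize(hyp), n))
            sims = [cosine(hyp_vec, tfidf(r)) for r in ref_counts[image_id]]
            total += sum(sims) / len(sims) / max_n  # uniform weight per n-gram order
    return total / len(hypotheses)
```

Because the IDF term is computed over the reference set, the score is only meaningful when the references cover many images; with a single image every weight is `log(1) = 0` and the score is 0.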
It's the kind you posted.
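Putting the advice in this thread together, converting a custom dataset into this annotation format could be sketched roughly as below. The record fields (`image_id`, `id`, `caption`) come from the example above; everything else is an assumption: the helper names are hypothetical, and the pkl layout (a dict with a `captions` key and no 512-dimensional features) has not been confirmed in this thread, so check it against the repository's data loader before relying on it.

```python
import json
import pickle


def build_annotations(captions_by_image):
    """captions_by_image: {image_id (int): [caption, ...]}.
    Returns a flat list of records in the format shown in this thread."""
    records = []
    next_id = 0  # hypothetical: any unique integer per caption should do
    for image_id, captions in captions_by_image.items():
        for caption in captions:
            records.append({"image_id": image_id, "id": next_id, "caption": caption})
            next_id += 1
    return records


def save_annotations(records, json_path, pkl_path):
    # JSON file with the plain caption records (no per-word tokenization).
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False)
    # ASSUMPTION: the pkl stores the same records under a "captions" key;
    # the 512-dimensional features are left out since they are not needed.
    with open(pkl_path, "wb") as f:
        pickle.dump({"captions": records}, f)
```

A call such as `save_annotations(build_annotations(my_captions), "train.json", "train.pkl")` would then write both files; the file names here are placeholders.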
Thanks! Wishing you another CVPR paper!
I ran into the same problem as you, but I don't know how to generate the pkl. I printed its contents and saw that the pkl file has a clip_embedding entry that the json file does not. I don't know how to obtain this embedding. Could someone kindly give me some advice?
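Could you explain how the 512-dimensional matrix in the pkl, for the image corresponding to each sentence, is encoded? Thank you very much!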