About the coco_cmb_vrg data format

Gitsamshi / WeakVRD-Captioning

Implementation of paper "Improving Image Captioning with Better Use of Caption"


About the coco_cmb_vrg data format #12

Closed wangwei8024 closed 3 years ago

wangwei8024 commented 3 years ago

Hello! I recently read your paper. The data in the coco_cmb_vrg folder has three attributes: wrela, prela, and obj. What does each of them represent?

Gitsamshi commented 3 years ago

Thank you very much for your interest. prela comes from https://github.com/yangxuntu/SGAE, where it is extracted by a pre-trained VRD model. wrela is extracted by our weakly supervised VRD model. Both wrela and prela share the same obj information, so they can be merged together.
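To illustrate the merge described above, here is a minimal sketch. It assumes, without confirmation from the repository, that prela and wrela are each a list of (subject_index, object_index, relation_label) triples whose indices point into the same obj list, and that both use the same relation label vocabulary. The function name `merge_relations` and the sample values are hypothetical.

```python
def merge_relations(prela, wrela):
    """Return the union of two triple lists, keeping first-seen order.

    Assumption: every triple is (subject_index, object_index, relation_label)
    and both lists index into the same obj list.
    """
    seen = set()
    merged = []
    for triple in list(prela) + list(wrela):
        key = tuple(triple)
        if key not in seen:  # drop triples that both models predicted
            seen.add(key)
            merged.append(key)
    return merged


# Hypothetical sample: three objects, one shared triple, one wrela-only triple.
sample = {
    "obj": ["man", "horse", "field"],
    "prela": [(0, 1, 3)],
    "wrela": [(0, 1, 3), (1, 2, 8)],
}
merged = merge_relations(sample["prela"], sample["wrela"])
```

Because the object indices are shared, no re-indexing is needed; only duplicate triples have to be removed.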

wangwei8024 commented 3 years ago

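Hello! Thank you for your answer. I have a few more questions. I could not find an implementation of Equation 9 from the paper in the code. Also, the node representations for object and rela are not exactly the same as in the paper: after the mapping function, the original features are added back, somewhat like a residual operation. What is the reasoning behind this?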

Gitsamshi commented 3 years ago

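1. I removed the Equation 9 code from this version because it ran too slowly. To implement it: if you want to batch it, build an edge matrix of (r_1, r_2, o), analogous to the (s, o, r) one, where r_1 and r_2 are the in and out edges respectively, and then proceed as in the conv for the relation nodes. If you do not batch it, compute it in a loop over the (s, o, r) triples. Because there are many obj nodes, the batched version costs more memory and the unbatched version costs more time.

2. For the obj initial feature, the original visual feature is added once more, mainly to strengthen the weight of the visual feature; this works slightly better than leaving it out. The rela initial feature is just the direct mapping.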
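The unbatched variant and the residual-style initial feature can be sketched roughly as follows. This is not the paper's Equation 9 and not code from the repository: the learned mapping weights are omitted, mean aggregation is an assumption, and the function names `init_obj_feature` and `update_obj_nodes` are hypothetical. It only shows the shape of a loop over (s, o, r) triples that collects outgoing and incoming relation features for each object node.

```python
def init_obj_feature(visual, mapped):
    """Add the visual feature back onto its mapped version (residual-style)."""
    return [m + v for m, v in zip(mapped, visual)]


def update_obj_nodes(obj_feats, rela_feats, triples):
    """Unbatched update of object nodes from their relation nodes.

    Assumption: rela_feats[k] is the feature of the relation node for
    triples[k] = (s, o, r). Each object node receives the mean of its
    outgoing relation features (where it is the subject s) and the mean of
    its incoming ones (where it is the object o).
    """
    num_obj, dim = len(obj_feats), len(obj_feats[0])
    out_sum = [[0.0] * dim for _ in range(num_obj)]
    in_sum = [[0.0] * dim for _ in range(num_obj)]
    out_cnt = [0] * num_obj
    in_cnt = [0] * num_obj

    # One pass over the triples: cheap in memory, slow for large graphs.
    for k, (s, o, _r) in enumerate(triples):
        for d in range(dim):
            out_sum[s][d] += rela_feats[k][d]
            in_sum[o][d] += rela_feats[k][d]
        out_cnt[s] += 1
        in_cnt[o] += 1

    updated = []
    for i in range(num_obj):
        row = []
        for d in range(dim):
            out_mean = out_sum[i][d] / max(out_cnt[i], 1)  # avoid dividing by 0
            in_mean = in_sum[i][d] / max(in_cnt[i], 1)
            row.append(obj_feats[i][d] + out_mean + in_mean)
        updated.append(row)
    return updated
```

The batched alternative described in the reply would replace the Python loop with an edge matrix over object nodes, trading the loop time for extra memory.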

wangwei8024 commented 3 years ago

Hello! How are the tags obtained? I could not find the relevant code.

Gitsamshi commented 3 years ago

Hi, they are extracted with the textual scene graph parser (https://nlp.stanford.edu/software/scenegraph-parser.shtml). Alternatively, the files provided at https://github.com/yangxuntu/SGAE include pre-extracted ones.

wangwei8024 commented 3 years ago

Thank you for your patient answers.

wangwei8024 commented 3 years ago

Hello! Do you have the processed test dataset for the COCO evaluation server?

Gitsamshi commented 3 years ago

Those files are from quite a while ago and I could not find a backup here, sorry.

LiHaoHN commented 3 years ago

@wangwei8024 Hello, did you eventually obtain the coco_cmb_vrg files for the COCO evaluation server test set?