Gitsamshi / WeakVRD-Captioning

Implementation of paper "Improving Image Captioning with Better Use of Caption"

About the coco_cmb_vrg data format #12

Closed wangwei8024 closed 3 years ago

wangwei8024 commented 3 years ago

Hello! I recently read your paper. The data in the coco_cmb_vrg folder has three attributes: wrela, prela, and obj. What does each of them represent?

Gitsamshi commented 3 years ago

Thank you very much for your interest. prela comes from https://github.com/yangxuntu/SGAE, where it is extracted by a pre-trained VRD model. wrela is extracted by our weakly supervised VRD model. wrela and prela share the same obj information, so they can be merged.
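As a rough illustration of why the shared obj information makes merging straightforward, here is a minimal sketch. The thread does not describe the on-disk layout of coco_cmb_vrg, so the dictionary below, the (s, o, r) triple layout, and the helper name `merge_relations` are all assumptions made for illustration, not the repository's actual format or code.

```python
def merge_relations(wrela, prela):
    """Union two lists of (subject_idx, object_idx, predicate) triples.

    Assumption: both lists index into the same `obj` list, so the
    indices are directly comparable and need no re-mapping.
    """
    merged = []
    seen = set()
    for triple in list(prela) + list(wrela):
        key = tuple(triple)
        if key not in seen:  # drop exact duplicates, keep first-seen order
            seen.add(key)
            merged.append(key)
    return merged


# Toy, made-up sample; the real files may be laid out differently.
sample = {
    "obj": ["man", "horse", "hat"],
    "prela": [(0, 1, "riding"), (0, 2, "wearing")],
    "wrela": [(0, 1, "riding"), (2, 0, "on")],
}
merged = merge_relations(sample["wrela"], sample["prela"])
```

Because every triple refers to objects by index into the same obj list, the merge reduces to a de-duplicated union.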

wangwei8024 commented 3 years ago

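Hello! Thank you for your answer. I have a few more questions. I could not find an implementation of Equation 9 from the paper in the code. Also, the node representations for objects and relations are not exactly the same as in the paper: after the mapping function, the original features are added back, similar to a residual connection. What was the reasoning behind this?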

Gitsamshi commented 3 years ago

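1. I removed Equation 9 from this version of the code because it ran too slowly. To implement it: for a batched version, build an (r_1, r_2, o) edge matrix (for in and out edges respectively), modeled on the (s, o, r) one, and then apply the same conv as for the relation nodes; for an unbatched version, loop over (s, o, r) and compute it there. Because there are many obj nodes, the batched version uses a lot of memory and the unbatched version takes a lot of time.

2. For the obj initial feature, the original visual feature is added back in, mainly to give the visual feature more weight; it works slightly better than not adding it. The rela initial feature is just the direct mapping.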
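The unbatched option described above can be sketched roughly as follows. This is not the paper's Equation 9: the learned mapping functions are left out, and the aggregation rule (averaging the relation features on each object's out and in edges and adding them to the object's own feature) is an assumption chosen to keep the example small. The function names and the convention that r indexes into `rela_feats` are also hypothetical.

```python
def add_vec(a, b):
    """Element-wise sum of two equal-length feature vectors."""
    return [x + y for x, y in zip(a, b)]


def update_objects(obj_feats, rela_feats, triples):
    """Unbatched object-node update: loop over (s, o, r) triples.

    Assumptions (not from the paper): r is an index into rela_feats,
    and each object adds the mean relation feature of its outgoing
    edges (as subject) and of its incoming edges (as object).
    """
    dim = len(obj_feats[0])
    n = len(obj_feats)
    out_sum = [[0.0] * dim for _ in range(n)]
    in_sum = [[0.0] * dim for _ in range(n)]
    out_cnt = [0] * n
    in_cnt = [0] * n

    for s, o, r in triples:
        # The relation leaves subject s (out edge) and enters object o (in edge).
        out_sum[s] = add_vec(out_sum[s], rela_feats[r])
        out_cnt[s] += 1
        in_sum[o] = add_vec(in_sum[o], rela_feats[r])
        in_cnt[o] += 1

    updated = []
    for i, feat in enumerate(obj_feats):
        new = list(feat)  # keep the node's own feature
        if out_cnt[i]:
            new = add_vec(new, [v / out_cnt[i] for v in out_sum[i]])
        if in_cnt[i]:
            new = add_vec(new, [v / in_cnt[i] for v in in_sum[i]])
        updated.append(new)
    return updated


# Toy example: two objects joined by one relation node.
objs = [[1.0, 0.0], [0.0, 1.0]]
relas = [[2.0, 2.0]]
result = update_objects(objs, relas, [(0, 1, 0)])
```

The loop visits each triple once, which keeps memory low but runs sequentially; the batched variant trades that for a per-object edge matrix, which is where the memory cost mentioned above comes from.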

wangwei8024 commented 3 years ago

Hello! How are the tags obtained? I did not see the relevant code.

Gitsamshi commented 3 years ago

Hello. They are extracted with the textual scene graph parser (https://nlp.stanford.edu/software/scenegraph-parser.shtml); alternatively, the files provided at https://github.com/yangxuntu/SGAE include pre-extracted ones.

wangwei8024 commented 3 years ago

Thank you for your patient answers.

wangwei8024 commented 3 years ago

Hello! Do you have the processed test set data for the COCO evaluation server?

Gitsamshi commented 3 years ago

Those files are from quite a while ago, and I could not find a backup here. Sorry.

LiHaoHN commented 3 years ago

@wangwei8024 Hello, did you end up getting the coco_cmb_vrg files for the COCO evaluation server test set?