Visual confounder construction

Gary-code / KECVQG

[ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"

Visual confounder construction #1

Open darwann opened 1 year ago

darwann commented 1 year ago

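Hello! I see that your paper uses the mean RoI feature of each object class as the visual confounder. May I ask whether the Faster R-CNN features for the COCO dataset in your experiments came directly from a pre-trained model, or whether you had to train a Faster R-CNN model yourself to extract them?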

Gary-code commented 1 year ago

We directly use the pre-trained Faster R-CNN features of the COCO dataset to construct the confounder; they can be found in link

darwann commented 1 year ago

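OK, thank you for your reply! Sorry to bother you again, but that link does not seem to provide ready-to-use RoI features. Did you re-extract the COCO features yourself using the pre-trained weights and scripts it provides? Also, when you average the RoI features in your experiments, do you operate directly on the extracted features, or do you first project them to a common dimension and then average? I have only just started working in this area, so I am not very familiar with it. Thank you.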

Gary-code commented 1 year ago

We use the pre-trained model in link to extract the visual features, and every feature has dimension 2048.
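
For readers following the thread, here is a minimal sketch of the averaging step discussed above. It assumes the extracted RoI features are already available as an `(N, 2048)` array, with one integer class label per region. The function name `build_confounder_dictionary`, the argument names, and the array layout are hypothetical and are not taken from the KECVQG repository. Because every feature has the same dimension, the sketch averages the raw features directly without a projection step.

```python
import numpy as np


def build_confounder_dictionary(roi_features, roi_labels, num_classes):
    """Average RoI features per object class (hypothetical helper).

    roi_features: (N, D) array, one row per detected region.
    roi_labels:   (N,) integer class index for each region.
    Returns a (num_classes, D) array; classes with no regions stay all-zero.
    """
    roi_features = np.asarray(roi_features, dtype=np.float64)
    roi_labels = np.asarray(roi_labels, dtype=np.int64)
    dim = roi_features.shape[1]

    sums = np.zeros((num_classes, dim), dtype=np.float64)
    counts = np.zeros(num_classes, dtype=np.int64)

    # Unbuffered scatter-add so repeated labels accumulate correctly.
    np.add.at(sums, roi_labels, roi_features)
    np.add.at(counts, roi_labels, 1)

    # Divide only the classes that were actually observed.
    seen = counts > 0
    sums[seen] /= counts[seen][:, None]
    return sums


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 2048))   # stand-in for extracted RoI features
    labels = rng.integers(0, 80, size=100)  # stand-in for detected class indices
    dictionary = build_confounder_dictionary(feats, labels, num_classes=80)
    print(dictionary.shape)
```

For a full dataset the features would normally be processed image by image, accumulating `sums` and `counts` across files and dividing once at the end. This sketch keeps everything in memory for brevity.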

darwann commented 1 year ago

Thanks a lot. Will you disclose the data you used in your experiments, such as the vision confounder dictionary?