Gary-code / KECVQG

[ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"

Visual confounder construction #1

Open darwann opened 1 year ago

darwann commented 1 year ago

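Hello! I see that your paper uses the mean RoI feature of each object class as the visual confounder. May I ask whether the Faster R-CNN features for the COCO dataset in your experiments were pre-trained features used as-is, or whether you had to train a Faster R-CNN model yourself to extract them?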

Gary-code commented 1 year ago

We directly use the pre-trained Faster R-CNN features for the COCO dataset to construct the confounder. They can be found in link

darwann commented 1 year ago

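OK, thank you for your reply! Sorry to bother you again, but that link does not seem to provide ready-to-use RoI features. Did you re-extract the COCO features yourself using the pre-trained weights and scripts it provides? Also, in your experiments, is the RoI feature averaging applied directly to the extracted features, or are the RoI features first projected to a common dimension and then averaged? I have only just started working in this area, so I am not very familiar with it. Thank you.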

Gary-code commented 1 year ago

We use the pre-trained model in link to extract the visual features, and every feature has dimension 2048.
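
For readers following this thread, here is a minimal sketch of the per-class averaging discussed above. It is not the authors' code: the function name `build_confounder_dict` and the input layout (one feature vector and one class label per RoI) are assumptions made for illustration. It also assumes the features are averaged directly, with no projection step, which the reply suggests (all features share dimension 2048) but does not state outright. It uses only the standard library; in practice NumPy or PyTorch arrays would be the usual choice.

```python
from collections import defaultdict


def build_confounder_dict(roi_features, roi_labels):
    """Average RoI feature vectors per object class (hypothetical helper).

    roi_features: list of equal-length lists of floats, one per RoI
                  (length 2048 for the features mentioned in this thread).
    roi_labels:   list of object-class labels, one per RoI.
    Returns a dict mapping each class label to its mean feature vector.
    """
    if len(roi_features) != len(roi_labels):
        raise ValueError("need exactly one label per RoI feature")

    sums = {}                   # label -> running element-wise sum
    counts = defaultdict(int)   # label -> number of RoIs seen

    for feat, label in zip(roi_features, roi_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        acc = sums[label]
        if len(feat) != len(acc):
            raise ValueError("all RoI features must share one dimension")
        for i, value in enumerate(feat):
            acc[i] += value
        counts[label] += 1

    # Divide each running sum by its count to get the class mean.
    return {
        label: [s / counts[label] for s in acc]
        for label, acc in sums.items()
    }


# Toy example: 3-dimensional features stand in for 2048-dimensional ones.
features = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 0.0, 0.0]]
labels = ["dog", "dog", "cat"]
confounders = build_confounder_dict(features, labels)
```

Because every extracted feature already has the same dimension, the element-wise mean is well defined without any mapping layer. Whether the paper applies further processing to these means is not answered in this thread.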

darwann commented 1 year ago

Thanks a lot. Will you release the data you used in your experiments, such as the visual confounder dictionary?