zjunlp / MKGformer

[SIGIR 2022] Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
MIT License

What do pixel_images, aux_images, and rcnn_images each represent? #23

Closed ririv closed 1 year ago

ririv commented 1 year ago

Why is each of the three treated as a separate feature?

flow3rdown commented 1 year ago

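Hello, the three kinds of images carry image information at different granularities: pixel_images is the full image, aux_images are the sub-images extracted after visual grounding, and rcnn_images are the object images extracted by running object detection with RCNN.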
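
To make the three granularities concrete, here is a minimal sketch. It is not code from the MKGformer repository. It assumes the grounding and detection steps have already produced bounding boxes, and it uses a toy nested-list image. The names `crop`, `MultiGranularityImages`, `build_features`, and the box coordinates are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


@dataclass
class MultiGranularityImages:
    """Hypothetical container mirroring the three field names in the thread."""
    pixel_images: Image       # the full image
    aux_images: List[Image]   # sub-images from visual grounding
    rcnn_images: List[Image]  # object crops from an object detector


def build_features(image: Image,
                   grounding_boxes: List[Box],
                   detection_boxes: List[Box]) -> MultiGranularityImages:
    # Assumption: the boxes come from external grounding and detection models;
    # here they are supplied by hand.
    return MultiGranularityImages(
        pixel_images=image,
        aux_images=[crop(image, b) for b in grounding_boxes],
        rcnn_images=[crop(image, b) for b in detection_boxes],
    )


# A 4x4 toy image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
feats = build_features(
    image,
    grounding_boxes=[(0, 0, 2, 2)],
    detection_boxes=[(2, 2, 4, 4), (1, 1, 3, 3)],
)
```

The full image gives global context, while the two sets of crops focus on regions tied to a phrase or to a detected object, which is why they are kept as separate inputs rather than merged into one.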