kuanghuei / SCAN

PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Apache License 2.0

Questions about the fusion model #62

Open lvshuai789 opened 11 months ago

lvshuai789 commented 11 months ago

Hi, kualee: I am a doctoral candidate at Beijing University of Posts and Telecommunications. I am currently working on cross-modal image-text retrieval, and it was a great honor to read the SCAN paper. I have a few questions for you, if it is convenient. First, regarding the experimental results shown in the picture below:

[image]

My understanding is this: the i2t-trained model can produce both i2t and t2i metrics on f30k, and the t2i-trained model can likewise produce both i2t and t2i metrics on f30k. So where does the "i2t + t2i" result come from?

[image (1)]

SUZILI7 commented 2 months ago

Has this been resolved? I am confused about it too.