Wangt-CN / MTFN-RR-PyTorch-Code

The official code for the paper "Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking", ACM Multimedia 2019 Oral

Is your code available on the t2i rerank? #2

Open kywen1119 opened 5 years ago

liuyyy111 commented 3 years ago

In my case, I got an unbelievable result after using the t2i rerank. Did you have the same problem?

Wangt-CN commented 3 years ago

Hi, the t2i rerank first needs the text-text fusion network to be trained, in order to get the similarity between sentences. BTW, what do the "unbelievable results" refer to? Actually, the rerank performance depends entirely on the similarity measurement. That is why the i2t rerank can almost always achieve higher recall, and the t2i rerank performance likewise depends on how good the text-text similarity measurement is.
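
For reference, a rough stand-in for that sentence-sentence similarity (not the trained text-text fusion branch itself, just plain cosine similarity over whatever sentence embeddings your text encoder already produces; the function name is only illustrative) could look like this:

```python
import torch
import torch.nn.functional as F

def text_text_similarity(text_emb: torch.Tensor) -> torch.Tensor:
    """Stand-in for the text-text branch: cosine similarity between all
    sentence embeddings. text_emb is an (n_texts, dim) tensor of sentence
    features from whatever text encoder you are using."""
    t = F.normalize(text_emb, dim=1)   # L2-normalise each sentence vector
    return t @ t.t()                   # (n_texts, n_texts) similarity matrix
```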

liuyyy111 commented 3 years ago

Thank you for your prompt reply. By "unbelievable results" I mean that my t2i result is slightly better than my i2t result when I use a bigger K, like 100. And do you mean that only models that implement a text-text fusion network can use the t2i re-rank? Can it work on SCAN?

Wangt-CN commented 3 years ago
  1. The t2i result being slightly better than i2t: I am not sure about this actually, since the i2t and t2i results are not closely related.

  2. It comes from the dataset structure (1 image - 5 sentences). For the i2t rerank you don't need text-text similarity. But if you use the t2i rerank directly, the improvement may be very slight: since there are 5 times more sentences than images, the sentence you search back to (from image to text) may not be the query sentence, even though any of the 5 ground-truth sentences would be reasonable. It is therefore better to use a text-text network to obtain text-text similarity, so that searching back to any similar sentence is counted (see the sketch after this list). This improves the t2i rerank performance; for the details, please refer to the paper.

  3. It can work on SCAN of course, but I haven't tried it. The rerank operation is a universal algorithm for refining retrieval results. The i2t rerank can be used directly, but to get a better t2i rerank, you'd better obtain the sentence-sentence similarity.
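
To make the reciprocal idea concrete, here is a rough sketch (not the exact code in rerank.py; the +1 bonus and the 0.5 threshold are just illustrative choices): a candidate text passes the i2t reciprocal check if the query image appears in that text's top-k images, while for t2i the check accepts any of the candidate image's top-k texts that is similar enough to the query text.

```python
import numpy as np

def i2t_rerank_sketch(sim: np.ndarray, k: int) -> np.ndarray:
    """sim: (n_images, n_texts) cross-modal similarity matrix.
    For image query i, a candidate text gets a bonus if image i also appears
    in that text's own top-k t2i retrieval list (reciprocal check)."""
    n_img, _ = sim.shape
    t2i_topk = np.argsort(-sim, axis=0)[:k]            # top-k images for every text
    out = sim.astype(float)
    for i in range(n_img):
        reciprocal = (t2i_topk == i).any(axis=0)       # mask over texts
        out[i] += reciprocal                           # push reciprocal matches up
    return out

def t2i_rerank_sketch(sim: np.ndarray, sim_tt: np.ndarray, k: int) -> np.ndarray:
    """sim: (n_images, n_texts); sim_tt: (n_texts, n_texts) text-text similarity.
    Because every image has 5 captions, the reverse i2t check accepts *any*
    caption that is similar enough to the query text, not only the query itself."""
    n_img, n_txt = sim.shape
    i2t_topk = np.argsort(-sim, axis=1)[:, :k]         # top-k texts for every image
    out = sim.T.astype(float)                          # (n_texts, n_images)
    for t in range(n_txt):
        best_match = sim_tt[t, i2t_topk].max(axis=1)   # (n_images,) best text-text sim
        out[t] += (best_match > 0.5)                   # threshold is illustrative
    return out
```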

liuyyy111 commented 3 years ago

I just used the t2i re-rank directly, without a text-text fusion network, but the result improved a lot. This is my result on t2i:

original: 52.94 79.7 86.7 1.0 9.9886
t2i re-rank: 77.16 90.8 91.54 1.0 8.615

I also used another model, SCAN-t2i, with my own data, and it improved a lot too:

i2t original: 63.8 88.7 94.6 1.0 3.657
i2t rerank: 70.4 91.4 95.9 1.0 3.261
t2i original: 49.16 76.46 84.66 2.0 10.6058
t2i rerank: 70.32 92.76 96.72 1.0 6.4208

The results improved a lot, but you said "very slight improvement". Why would this happen?

Wangt-CN commented 3 years ago

Hi, for the i2t rerank, it is exactly as I said above.

For the t2i rerank (which is commented out in 'rerank.py' now), if you run it directly, it will use the ground-truth sentence clusters, which means the t2i results you got are the upper bound of the t2i rerank performance. Using the ground-truth sentence clusters also means the similarity measurement is perfect. (I only went through the rerank code quickly since this code is quite old for me, but the reason should be as I described.)
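
To be concrete about what the "ground-truth sentence clusters" mean here: assuming the standard 5-captions-per-image layout, caption j belongs to image j // 5, so the oracle text-text matrix is 1 for captions of the same image and 0 otherwise. A hypothetical reconstruction:

```python
import numpy as np

def gt_text_clusters(n_images: int, captions_per_image: int = 5) -> np.ndarray:
    """Hypothetical reconstruction of the ground-truth sentence clusters:
    caption j belongs to image j // captions_per_image, so two captions are
    'similar' (1.0) iff they describe the same image, and 0.0 otherwise."""
    n_txt = n_images * captions_per_image
    image_id = np.arange(n_txt) // captions_per_image
    return (image_id[:, None] == image_id[None, :]).astype(np.float32)
```

Plugging this oracle matrix into the t2i reciprocal check means the search-back always finds a "similar" sentence for the correct image, which is why the numbers you got are an upper bound rather than what a trained text-text similarity would give.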

liuyyy111 commented 3 years ago

Thank you.