Closed (ratthachat closed this issue 4 years ago)
Hi! Thank you.
Q: What is the main difference or improvement over your ICCV 2019 work?
A: They are quite different. The ICCV work introduces novelty in the text encoder, in the loss function, and in the image encoder. That loss function helps in the early stage of training, which allows training with multiple languages, even at the character level.
The AAAI work (ADAPT, which is the code in this repo) uses a default text encoder and brings novelty in how the similarity matrix is computed (cosine similarity for all possible image-caption pairs). We use a vector computed from the sentence to filter the vectors that represent image regions, in a top-down manner. As a result, the same image can be represented in several distinct ways depending on the textual query provided.
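To make the idea above concrete, here is a rough sketch in plain Python. It is not the code in this repo: the sigmoid gate, the mean pooling, and the names `filter_regions` and `similarity_matrix` are assumptions chosen for illustration, and the actual ADAPT filtering may be computed differently. It only shows how a caption vector can re-weight region vectors so that one image yields a different embedding per query, and how the cosine matrix over all image-caption pairs is then filled in.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_regions(regions, caption_vec):
    """Return one image vector conditioned on the caption (hypothetical helper).

    Assumption: the caption vector is squashed into a per-dimension gate in
    (0, 1) that scales every region vector; the gated regions are then
    mean-pooled. This stands in for the text-guided filtering step.
    """
    gate = [1.0 / (1.0 + math.exp(-c)) for c in caption_vec]
    gated = [[r * g for r, g in zip(region, gate)] for region in regions]
    dim = len(caption_vec)
    return [sum(region[d] for region in gated) / len(gated) for d in range(dim)]


def similarity_matrix(images, captions):
    """Cosine similarity for every (image, caption) pair.

    images:   list of images, each a list of region vectors
    captions: list of caption vectors
    Each image is re-embedded once per caption before comparison.
    """
    return [
        [cosine(filter_regions(regions, cap), cap) for cap in captions]
        for regions in images
    ]


# Toy data: one image with two regions, two different captions.
image = [[1.0, 0.0], [0.0, 1.0]]
caption_a = [5.0, -5.0]
caption_b = [-5.0, 5.0]

vec_a = filter_regions(image, caption_a)
vec_b = filter_regions(image, caption_b)
sims = similarity_matrix([image], [caption_a, caption_b])
```

In this toy setup `vec_a` and `vec_b` differ even though they come from the same image, which is the point of conditioning the image representation on the query.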
Q: Do they have similar performance?
A: The ADAPT work provides much higher gains in predictive performance. Currently, I am investigating the effect of ADAPT in a multilingual setting for my thesis.
Thanks so much! I had a chance to attend your poster at ICCV in Korea, and I am happy to see your next work!
Hi Jonatas,
Great work! I am wondering, besides the multilingual aspect, what is the main difference or improvement over your ICCV 2019 work?
Do they have similar performance?