ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
https://arxiv.org/abs/1909.11740

What is the gather index? #86

Open abhidipbhattacharyya opened 2 years ago

abhidipbhattacharyya commented 2 years ago

This is more of a question than an issue.

In model.py, at line 330, a gather index is used to reorder the text and image embeddings after they are concatenated. I am trying to understand what this gather index is conceptually, but I cannot find it in the paper, which only seems to say that the image and text features are concatenated. It would be helpful to know what these gather index vectors represent and how to create them for a custom dataset.

Thanks.

2292384454 commented 2 years ago

Hello, I have the same confusion. I don't understand the role of torch.gather() after torch.cat([txt_emb, img_emb], dim=1). Have you found an answer to this question?
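
For anyone landing here with the same question, one plausible reading is that text and image features are padded separately, so after torch.cat the text padding sits in the middle of the sequence, and torch.gather pulls each example's image tokens forward to directly follow its real text tokens. The snippet below is only a minimal sketch of that idea on a toy batch, not the repository's code: `build_gather_index`, `txt_lens`, `num_bbs`, and the tensor values are hypothetical names and data chosen for illustration.

```python
import torch


def build_gather_index(txt_lens, num_bbs, max_txt_len, out_size):
    """Hypothetical helper: for each example, list which positions of the
    concatenated [text | image] sequence to keep, and in what order."""
    batch_size = len(txt_lens)
    # Start from the identity mapping 0..out_size-1 for every example.
    index = torch.arange(out_size, dtype=torch.long).unsqueeze(0).repeat(batch_size, 1)
    for i, (tl, nbb) in enumerate(zip(txt_lens, num_bbs)):
        # Image tokens start at column max_txt_len after concatenation;
        # place them right after the tl real text tokens.
        index[i, tl:tl + nbb] = torch.arange(max_txt_len, max_txt_len + nbb)
    return index


# Toy batch: 2 examples, hidden size 1 so the values are easy to read.
# Text is padded to length 3 and image regions to length 2; 0 marks padding.
txt_emb = torch.tensor([[[1.], [2.], [0.]],
                        [[3.], [4.], [5.]]])
img_emb = torch.tensor([[[10.], [11.]],
                        [[20.], [0.]]])
txt_lens, num_bbs = [2, 3], [2, 1]
out_size = max(tl + nbb for tl, nbb in zip(txt_lens, num_bbs))

gather_index = build_gather_index(txt_lens, num_bbs, txt_emb.size(1), out_size)

# torch.gather needs an index with the same number of dims as the input,
# so broadcast the per-position index across the hidden dimension.
hidden = txt_emb.size(-1)
expanded = gather_index.unsqueeze(-1).expand(-1, -1, hidden)
joint = torch.gather(torch.cat([txt_emb, img_emb], dim=1), dim=1, index=expanded)

print(gather_index)      # tensor([[0, 1, 3, 4], [0, 1, 2, 3]])
print(joint.squeeze(-1))  # tensor([[ 1.,  2., 10., 11.], [ 3.,  4.,  5., 20.]])
```

In this sketch, any slot past tl + nbb keeps its initial identity index, so it may hold leftover values; those positions would need to be hidden by the attention mask. For a custom dataset, the same idea would apply per batch: record each example's true text length and number of image regions, then build the index from those two lists.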