-
## 🐛 Bug
The VisualBert model ignores the `output_attentions` config setting.
## Command
## To Reproduce
Steps to reproduce the behavior:
In a Python script:
1. get a configuration with output_at…
-
Hi @ChenRocks @linjieli222,
It is still ok even if it is not used by [VisualBERT](https://github.com/uclanlp/visualbert/), [LXMERT](https://github.com/airsplay/lxmert), and [UNITER](https://github.c…
-
Thank you for your great repo. I am trying to create a Colab version of several V+L models (LXMERT, UNITER, VisualBERT, etc.). However, due to Colab's RAM limit, it is hard to read the entire h…
-
## ❓ Questions and Help
I would like to use the multimodal alignment task in VisualBERT, ViLBERT and MMBT.
According to this issue [1](https://github.com/facebookresearch/mmf/issues/466) this st…
-
Similar to https://demo.allennlp.org, it would be great to have online demo applications for the various models available in GluonNLP.
List of demos from [AllenNLP](https://demo.allennlp.org/reading-c…
-
Hi authors,
Would it be possible to provide the pretrained MMBert (VisualBert) checkpoint used in the paper? I started pre-training it on my own, but it is taking a very long time.
It would …
-
[paper](https://arxiv.org/pdf/2103.15679.pdf), [code](https://github.com/hila-chefer/Transformer-MM-Explainability)
## TL;DR
- **I read this because:** a.k.a. CheferCAM; explainable CLIP scor…
-
I am trying to study the code in this repository. However, it is difficult to figure out what changes have been made in the subfolders of the base models (VisualBERT, LXMERT, DETR, etc.) for this proj…
-
## Problem statement
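1. The way CLIP variants learn the relationship between images and text is inefficient, in both training and inference, for learning how individual text tokens relate to image patches -> let's look for a method that enables finer-level alignment.
2. Weaknesses of prior work that uses attention between image patches and text tokens …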
-
Hi, thanks for releasing your code soon after your paper and for making your evaluations easy to reproduce! Could you please provide more detail on how you extracted the Detectron features? I don't se…