-
According to the paper and readme, LXMERT uses VG Caption for pre-training. But I only found region descriptions in Visual Genome instead of whole image captions. So what kind of caption does the mode…
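For what it's worth, here is a minimal sketch of reading those region descriptions as caption-like sentences, assuming the usual Visual Genome `region_descriptions.json` layout (one entry per image with a `"regions"` list whose items carry a `"phrase"`); the file name and keys are assumptions to check against your own download:

```python
import json

# Sketch only: treat each region's short phrase as a caption-like sentence
# for the corresponding image.
with open("region_descriptions.json") as f:        # path is an assumption
    vg = json.load(f)

first_image = vg[0]                                 # one entry per image
phrases = [region["phrase"] for region in first_image["regions"]]
print(phrases[:5])                                  # short sentences describing regions
```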
-
I thought the COCO dataset has 5 captions per image, but the *.json files show that some images have more than 5 captions. Is this normal?
![image](https://user-images.githubusercontent.com/18069263/82767…
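One way to check this directly is to count captions per image in the annotation file; a minimal sketch, assuming the standard COCO captions annotation layout (the path is an assumption, point it at whichever *.json you inspected):

```python
import json
from collections import Counter

# Sketch only: count how many caption annotations each image_id has,
# then look at the distribution of those counts.
with open("annotations/captions_train2014.json") as f:   # path is an assumption
    coco = json.load(f)

captions_per_image = Counter(ann["image_id"] for ann in coco["annotations"])
print(Counter(captions_per_image.values()))  # e.g. most images have 5, some have 6 or 7
```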
-
Hi, thanks for the great repo.
I want to understand what happens during pre-training, especially how the train and validation losses change.
In other work like lxmert and my own exp…
-
When I set --train train --valid valid for python src/tasks/gqa.py, I find that the accuracy on the valid set is 80%. What is wrong with this setting, and how do I evaluate the result on the valid set? Thank y…
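In case it helps, here is a minimal sketch of scoring a predictions file against the valid split outside the repo's own evaluator. Assumptions: data/gqa/valid.json follows the repo's converted format (each entry has "question_id" and a "label" dict mapping answer to score), and valid_predict.json is a {question_id: predicted_answer} dict you dumped yourself; both file names are assumptions.

```python
import json

# Sketch only: compute soft accuracy of predicted answers against the valid labels.
with open("data/gqa/valid.json") as f:
    valid = json.load(f)
with open("valid_predict.json") as f:
    preds = json.load(f)                      # {question_id: predicted answer string}

score, total = 0.0, 0
for datum in valid:
    qid = datum["question_id"]
    if qid in preds:
        score += datum["label"].get(preds[qid], 0.0)
        total += 1

print("valid accuracy: %.4f over %d questions" % (score / total, total))
```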
-
Just want to confirm: when you talk about "pre-training" in the readme (https://github.com/airsplay/lxmert#pre-training), do you mean training the entire LXMERT model from scratch?
If we just want to …
-
In this code snippet from src/lxrt/modeling.py https://github.com/airsplay/lxmert/blob/9b8f0ffd56dba5490f37af9c14f3c112cc10358c/src/lxrt/modeling.py#L940-L953
masked_lm_loss is still being optimized…
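If the loss there follows the standard BERT pre-training recipe (a CrossEntropyLoss with ignore_index=-1), then positions whose label is -1 contribute zero loss and zero gradient, so only the actually masked tokens drive masked_lm_loss. A standalone sketch of that behaviour (not the repo's exact code; the vocab size and tensors are made up):

```python
import torch
from torch import nn

vocab_size = 30522                                   # BERT-base vocab size, for illustration
loss_fct = nn.CrossEntropyLoss(ignore_index=-1)      # labels of -1 are ignored entirely

prediction_scores = torch.randn(2, 5, vocab_size, requires_grad=True)  # (batch, seq, vocab)
masked_lm_labels = torch.full((2, 5), -1, dtype=torch.long)            # everything ignored...
masked_lm_labels[0, 2] = 1012                                          # ...except one masked token

masked_lm_loss = loss_fct(
    prediction_scores.view(-1, vocab_size),
    masked_lm_labels.view(-1),
)
print(masked_lm_loss)   # loss is computed over the single labeled position only
```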
-
# 🌟New model addition
## Model description
[VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS](https://arxiv.org/pdf/1908.08530.pdf)
> *We introduce a new pre-trainable generi…
-
This is not urgent. There are a ton of deprecation warnings across many modules with pytorch-1.7+ and a few with python-3.8:
(I hard-wrapped the lines to avoid the need to scroll, but it makes somewha…
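Until the deprecations are fixed properly, the noise can at least be silenced locally; a minimal sketch using Python's standard warnings module (the module filters are assumptions about where the warnings come from):

```python
import warnings

# Sketch only: hide deprecation noise emitted from torch modules while debugging;
# the underlying deprecated calls should still be fixed eventually.
warnings.filterwarnings("ignore", category=DeprecationWarning, module=r"torch\..*")
warnings.filterwarnings("ignore", category=UserWarning, module=r"torch\..*")
```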
-
Hi, and thank you for sharing your code in such a genuinely usable way!
The pre-training instructions mention a file named 'all_ans.json', which is required to launch pre-training, b…
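While waiting for an answer, here is a purely hypothetical way to rebuild such a file. The real all_ans.json shipped with the repo may use a different schema, so check the pre-training code that loads it before relying on this; all paths and the output format (a flat JSON list of answer strings) are assumptions.

```python
import json

# Hypothetical reconstruction only: collect every answer string that appears in the
# downstream QA label dicts and dump them as one JSON list. Verify the expected schema
# against the code that reads data/lxmert/all_ans.json before using this.
answers = set()
for path in ["data/vqa/train.json", "data/gqa/train.json"]:   # paths are assumptions
    with open(path) as f:
        for datum in json.load(f):
            answers.update(datum.get("label", {}).keys())

with open("data/lxmert/all_ans.json", "w") as f:
    json.dump(sorted(answers), f)
```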
-
Thanks for the work! Could you also provide the fine-tuned models for VQA, GQA, and NLVR2? That would be very helpful, thanks.