salesforce ALBEF issues

salesforce / ALBEF

Code for ALBEF: a new vision-language pre-training method

BSD 3-Clause "New" or "Revised" License

1.57k stars, 199 forks

Missing text-only Transformer in visualization notebook

#95 BennoKrojer opened 2 years ago
4
Questions about the results of NLVR2

#94 junyubi opened 2 years ago
0
RuntimeError: invalid multinomial distribution (sum of probabilities <= 0)

#93 yirutsai opened 2 years ago
5
Training behavior on 16 gpus

#92 zhaohengz opened 2 years ago
2
Question about VQA answer tokenizer

#91 katrina433 closed 2 years ago
2
About VQA mlm loss

#90 katrina433 closed 2 years ago
1
Zero-shot evaluation results of pre-trained models

#89 wjxzju opened 2 years ago
1
Inference only, without fine-tuning, on VQA

#88 SKBL5694 opened 2 years ago
0
About generating VQA test result files

#87 yang178908 opened 2 years ago
2
About answer_list modification

#86 Xjmengnieer opened 2 years ago
1
Questions about the epochs of pre-training?

#85 Fly2flies closed 2 years ago
2
Questions about visual grounding

#84 jaeseokbyun opened 2 years ago
2
Whole sentence visualization in Fig. 4 and Fig. 10

#83 haoshuai714 closed 2 years ago
4
Pre-training Image-Text Matching

#82 xiezhiweihk closed 2 years ago
1
Pretraining epochs

#81 haoshuai714 closed 2 years ago
0
About license

#80 WangWenhao0716 closed 2 years ago
2
IncompatibleKeys

#79 mzolfaghari opened 2 years ago
2
Finetune for image-text retrieval task on COCO

#78 yxoh opened 2 years ago
1
Pre-training on custom datasets

#77 xiezhiweihk closed 2 years ago
2
Question about initialising Bert while pretraining

#76 MiraclesinWang closed 2 years ago
3
Why does get_special_tokens_mask append a [1] at the end while build_inputs_with_special_tokens does not append a [SEP] at the end for a single input sequence?

#75 zhihuacc opened 2 years ago
3
Image-Text Retrieval Task, ITC score for ranking

#74 yxoh closed 2 years ago
6
RefCOCO+ Result

#73 haoshuai714 closed 2 years ago
2
Caption Dataset Class Bug

#72 nehad-procogia opened 2 years ago
0
Results worse than in the paper

#71 liuuzexiang closed 2 years ago
10
VQA dataset keys don't match code

#70 shankyemcee opened 2 years ago
4
Test On Visual Grounding on RefCOCO+ Task

#69 haoshuai714 closed 2 years ago
2
segmentation layer

#68 chaochen99 opened 2 years ago
1
Question about nlvr checkpoint

#67 g25h1 closed 2 years ago
2
GPU memory rises continuously

#66 yuanrr opened 2 years ago
1
Apex

#65 dw-dengwei closed 2 years ago
1
About some JSON files

#64 TungWg closed 2 years ago
4
Add Docker environment & web demo

#63 chenxwh closed 2 years ago
1
Retrieval result varies on multi-gpu distributed training

#62 averyma opened 2 years ago
1
Which layers of BERT are used for MLM Loss?

#61 rakeshchada opened 2 years ago
3
How much memory is needed for pre-training

#60 2292384454 opened 2 years ago
2
grounding results

#59 ziyanyang closed 2 years ago
2
about MLM softlabel implementation

#58 zhezh opened 2 years ago
3
How to get VQA result?

#57 haoshuai714 closed 2 years ago
3
Visual Entailment dataset?

#56 haoshuai714 closed 2 years ago
1
4M Pre-trained checkpoint

#55 haoshuai714 closed 2 years ago
2
grad-cam code

#54 ziyanyang closed 2 years ago
2
question about ITM loss

#53 Qiulin-W closed 2 years ago
7
what is the reasonable magnitude of mlm_loss after pretraining?

#52 Qiulin-W closed 2 years ago
2
Visual Grounding, Whole sentence visualization in Fig. 4

#51 zzzzzigzag closed 2 years ago
2
Questions about Visual Grounding checkpoint and visualization

#50 zzzzzigzag closed 2 years ago
5
pretrain dataset raw images

#49 lireagan closed 2 years ago
1
test in VQA dataset

#48 haoshuai714 closed 2 years ago
7
Pre-trained checkpoint

#47 haoshuai714 closed 2 years ago
2
NCCL problems of pretrain

#46 Junjie-Ye closed 2 years ago
3
