guilk/VLC
Research code for "Training Vision-Language Transformers from Captions Alone"
33 stars, 4 forks
issues
Request for Pre-trained Model Weights
#8
AravindSMysore
opened
1 week ago
0
VQA trained model weight available?
#7
LiuJoffrey
closed
1 year ago
1
What is the image resolution for VQA finetuning? Is it 384 x 384, as in pretraining?
#6
sanyalsunny111
closed
2 years ago
3
Thank you for your code! Is the pre-trained checkpoint of the BERT embedding available?
#5
senmaoy
closed
2 years ago
1
imagenet finetuned
#4
tankche1
closed
2 years ago
23
The code of lexical-patch alignment visualization
#3
jiyt17
closed
2 years ago
2
The accuracy of downstream tasks
#2
jiyt17
closed
2 years ago
7
I am looking forward to it!
#1
pqh-zjut
closed
2 years ago
1