X-PLUG/mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Apache License 2.0
212 stars, 17 forks
issues
Is fusion_encoder getting used for video captioning?
#23
Astuary
opened
1 week ago
0
Different results from paper
#22
lbx73737373
opened
3 weeks ago
0
Download link of ViT-L-14.tar
#21
zhiweibi
opened
1 month ago
1
pretrained_mplug_2.pth
#20
Anshalupmanyu
opened
2 months ago
0
Video_id
#19
Anshalupmanyu
closed
2 months ago
0
will the pretrain code be released?
#18
acdart
opened
4 months ago
0
same, chaos captions
#17
ZhangXiangYunfs
opened
6 months ago
0
Can't find "language_evaluation"
#16
wangyangchen
opened
6 months ago
2
Double usage of local temporal modeling
#15
andreadps
opened
6 months ago
0
Where can I find the universal layer module?
#14
Ta-Gu
closed
7 months ago
1
Localizing positions of objects in a scene
#13
rose-jinyang
opened
8 months ago
0
Evaluation Code?
#12
minuenergy
opened
10 months ago
0
Finetuned weights for MSRVTT text-video retrieval
#11
nguyenquangtan
opened
10 months ago
0
how to get train, valid, test
#10
minuenergy
opened
10 months ago
4
Does the inference need to run on 8 A100?
#9
simanw304
opened
11 months ago
1
mPLUG-2-base pre-training model weights
#8
LZ-CH
opened
11 months ago
0
Is there a way to adapt the model to cartoon video clips?
#7
aartykov
opened
11 months ago
0
Is the zero-shot performance of VideoQA in Table 20 from the model finetuned on ImageQA dataset?
#6
tgyy1995
opened
11 months ago
1
Code for Video Text Retrieval
#5
Hritikbansal
closed
11 months ago
1
How to perform CIDEr optimization?
#4
KaiGod0730
closed
11 months ago
3
Could you provide the JSON file in the video captioning task?
#3
myccver
closed
11 months ago
4
When are you gonna share the weights?
#2
aartykov
closed
11 months ago
4
Code request
#1
GFZShiwai
closed
12 months ago
4