dvlab-research
/
LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0
622
stars
39
forks
issues
why delay_load in build_vision_tower(config, delay_load=True)?
#47
dragen1860
closed
5 months ago
1
why different architectures in stage2 and stage3?
#46
dragen1860
closed
5 months ago
1
A question in stage3
#45
liziming5353
closed
5 months ago
2
stage 2: freezing the visual encoder?
#44
dragen1860
closed
5 months ago
1
two types of tokenizer?
#43
dragen1860
closed
5 months ago
3
you build `build_vision_tower` twice?
#42
dragen1860
closed
5 months ago
1
what does `lazy_preprocess` mean?
#41
dragen1860
closed
5 months ago
1
When can the Customized Long Video Gradio Web UI be released?
#40
QiSu77
closed
5 months ago
4
multiple json for training?
#39
dragen1860
closed
5 months ago
1
Is it possible to run inference on long videos without using subtitles?
#38
Deaddawn
closed
5 months ago
1
When running inference code on MSVD-QA, some error occurred...
#37
CUCldyyyyy
closed
5 months ago
3
LORA SUPPORTING
#36
Deaddawn
closed
5 months ago
2
killing process when trying to train long video
#35
Deaddawn
closed
6 months ago
4
For long-video inference, should the input video file be in the format produced by feature extraction? Why do I always get errors, with the video information not recognized?
#34
kunkunsheng
closed
5 months ago
3
Questions about checkpoint loading
#33
zhangbw17
closed
6 months ago
2
A code error
#32
liziming5353
closed
6 months ago
4
I want to run long-video inference without training. What data format does long-video inference expect?
#31
kunkunsheng
closed
5 months ago
3
Pretrain & Finetune Dataset like Webvid, COCO, GQA, etc.
#30
Hambaobao
closed
4 months ago
10
OSError: It looks like the config file at './model_zoo/LAVIS/eva_vit_g.pth' is not a valid JSON file
#29
dragen1860
closed
6 months ago
2
AttributeError: 'Conversation' object has no attribute 'get_videos'
#28
zhangningboo
closed
5 months ago
3
inference error
#27
liziming5353
closed
5 months ago
3
how to visualize the high response areas?
#26
erjiaxiao
closed
6 months ago
10
Demo Crash
#25
QiSu77
closed
6 months ago
6
gradio wrong
#24
QiSu77
closed
6 months ago
3
Failed to run finetuning stage2 with ActivityNet videos
#23
XenonLamb
closed
6 months ago
1
About the stage2 json file
#22
liziming5353
closed
6 months ago
6
Hi Yanwei, this is my contact request
#21
MathewWuZJ
closed
6 months ago
2
Is only the vicuna LLM supported at present? Could Chinese LLMs such as baichuan2 be supported?
#20
starkgao
closed
6 months ago
1
Question about error "mmco: unref short failure" in finetune stage.
#19
L4zyy
closed
6 months ago
2
What is shot detection file for?
#18
Deaddawn
closed
6 months ago
1
Can't Download shot detection results
#17
Deaddawn
closed
6 months ago
3
There is no get_videos method in conversation.py
#16
zyghome
closed
6 months ago
3
Question about Model's Understanding
#15
Journey7331
closed
6 months ago
4
Where to download MSVD-QA dataset for evaluation?
#14
Felix0805
closed
6 months ago
2
Question about Vision Encoder EVA-G
#13
FuryMartin
closed
6 months ago
2
Training cost
#12
ds22058
closed
6 months ago
2
ValueError: Not find vision tower: ./model_zoo/LAVIS/eva_vit_g.pth
#11
MasonLi
closed
6 months ago
1
Demo problem
#10
2022yingjie
closed
6 months ago
1
About preprocessing movienet
#9
wangkunyu241
closed
6 months ago
2
how to find this file?
#8
TXH-mercury
closed
6 months ago
0
where is the gradio python file?
#7
Steven0706
closed
6 months ago
3
Cannot reproduce Zero-shot Video-QA (MSVD)
#6
dcahn12
closed
6 months ago
5
explain how the context attention is "optimized"
#5
iranroman
closed
6 months ago
2
'variable' and 'tensor' have incompatible tensor type
#4
yiling-chen
closed
6 months ago
4
Update README.md
#3
eltociear
closed
7 months ago
0
Got it, Thank you
#2
gordonhu608
closed
7 months ago
0
./llamavid/processor/clip-patch14-224 not found
#1
wh0x
closed
6 months ago
2