dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

Apache License 2.0

622 stars 39 forks

issues

why delay_load in build_vision_tower(config, delay_load=True)?

#47 dragen1860 closed 5 months ago
1
why different architectures in stage2 and stage3?

#46 dragen1860 closed 5 months ago
1
A question in stage3

#45 liziming5353 closed 5 months ago
2
stage 2: freezing the visual encoder?

#44 dragen1860 closed 5 months ago
1
two types of tokenizer?

#43 dragen1860 closed 5 months ago
3
you build `build_vision_tower` twice?

#42 dragen1860 closed 5 months ago
1
what does `lazy_preprocess` mean?

#41 dragen1860 closed 5 months ago
1
When can the Customized Long Video Gradio Web UI be released?

#40 QiSu77 closed 5 months ago
4
multiple json for training?

#39 dragen1860 closed 5 months ago
1
Is it possible to run inference on long videos without using subtitles?

#38 Deaddawn closed 5 months ago
1
When running inference code on MSVD-QA, some error occurred...

#37 CUCldyyyyy closed 5 months ago
3
LORA SUPPORTING

#36 Deaddawn closed 5 months ago
2
killing process when trying to train long video

#35 Deaddawn closed 6 months ago
4
For long-video inference, should the input video file be in the format produced by feature extraction? Why do I always get errors, with the video information not being recognized?

#34 kunkunsheng closed 5 months ago
3
Questions about checkpoint loading

#33 zhangbw17 closed 6 months ago
2
A code error

#32 liziming5353 closed 6 months ago
4
I want to run long-video inference without training. What data format is required for inference on long data?

#31 kunkunsheng closed 5 months ago
3
Pretrain & Finetune Dataset like Webvid, COCO, GQA, etc.

#30 Hambaobao closed 4 months ago
10
OSError: It looks like the config file at './model_zoo/LAVIS/eva_vit_g.pth' is not a valid JSON file

#29 dragen1860 closed 6 months ago
2
AttributeError: 'Conversation' object has no attribute 'get_videos'

#28 zhangningboo closed 5 months ago
3
inference error

#27 liziming5353 closed 5 months ago
3
how to visualize the high response areas?

#26 erjiaxiao closed 6 months ago
10
Demo Crash

#25 QiSu77 closed 6 months ago
6
gradio wrong

#24 QiSu77 closed 6 months ago
3
Failed to run finetuning stage2 with ActivityNet videos

#23 XenonLamb closed 6 months ago
1
About the stage2 JSON file

#22 liziming5353 closed 6 months ago
6
Hi Yanwei, this is my contact request

#21 MathewWuZJ closed 6 months ago
2
Is only the Vicuna LLM currently supported? Could Chinese LLMs such as baichuan2 be supported?

#20 starkgao closed 6 months ago
1
Question about error "mmco: unref short failure" in finetune stage.

#19 L4zyy closed 6 months ago
2
What is shot detection file for?

#18 Deaddawn closed 6 months ago
1
Can't Download shot detection results

#17 Deaddawn closed 6 months ago
3
There is no get_videos method in conversation.py

#16 zyghome closed 6 months ago
3
Question about Model's Understanding

#15 Journey7331 closed 6 months ago
4
Where to download the MSVD-QA dataset for evaluation?

#14 Felix0805 closed 6 months ago
2
Question about Vision Encoder EVA-G

#13 FuryMartin closed 6 months ago
2
Training cost

#12 ds22058 closed 6 months ago
2
ValueError: Not find vision tower: ./model_zoo/LAVIS/eva_vit_g.pth

#11 MasonLi closed 6 months ago
1
Demo problem

#10 2022yingjie closed 6 months ago
1
About preprocessing movienet

#9 wangkunyu241 closed 6 months ago
2
how to find this file?

#8 TXH-mercury closed 6 months ago
0
where is the gradio python file?

#7 Steven0706 closed 6 months ago
3
Cannot reproduce Zero-shot Video-QA (MSVD)

#6 dcahn12 closed 6 months ago
5
explain how the context attention is "optimized"

#5 iranroman closed 6 months ago
2
'variable' and 'tensor' have incompatible tensor type

#4 yiling-chen closed 6 months ago
4
Update README.md

#3 eltociear closed 7 months ago
0
Got it, Thank you

#2 gordonhu608 closed 7 months ago
0
./llamavid/processor/clip-patch14-224 not found

#1 wh0x closed 6 months ago
2