huangb23/VTimeLLM
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
https://arxiv.org/pdf/2311.18445.pdf
Other
226 stars, 11 forks
issues
Repeated Outputs Issue with Shot2Story Dataset Stage3 Fine-tuning
#43
katie312
opened
4 days ago
0
Question about pinning the transformers version to 4.31.0
#42
yourssmile
opened
1 week ago
1
Regarding the Second Phase of Training for the VTimeLLM Model
#41
jiyi-zyh
opened
3 weeks ago
1
KeyError: 'VTimeLLMConfig'
#40
zipMunk
closed
1 month ago
0
Is the LLM a base model or a chat model?
#39
1SingleFeng
closed
2 months ago
1
About second training stage
#38
jjy961228
opened
2 months ago
0
About the DiDeMo dataset
#37
lixuefenfen
opened
2 months ago
1
Link to the training data not available
#36
williamium3000
opened
2 months ago
0
About Video Feature Project
#35
xiaokj37
opened
3 months ago
1
About training.
#34
EdenGabriel
closed
2 months ago
3
RuntimeError: The size of tensor a (147) must match the size of tensor b (293) at non-singleton dimension 3
#33
Vijaysivadas
opened
4 months ago
1
Code to generate Stage 3 dataset from ActivityNet or DiDeMo
#32
anilbatra2185
opened
4 months ago
2
stage2 features
#31
simplewhite9
closed
5 months ago
2
Feature extraction code requirement
#30
L4zyy
closed
2 months ago
1
About Activitynet eval process
#29
lixuefenfen
closed
2 months ago
3
About evaluation
#28
EdenGabriel
opened
6 months ago
0
ID correspondence
#27
wayne3771
closed
6 months ago
2
Low accuracy rate
#26
wayne3771
closed
6 months ago
5
How much time does it take to extract features in stage 2 and what is the hardware used?
#25
Maulog
closed
5 months ago
4
NVIDIA Driver error - could not parse ModelProto
#24
simran0112
closed
6 months ago
2
Why did you use only a subset?
#23
MSungK
closed
6 months ago
1
Missing intern_clip_feat
#22
Tanveer81
opened
7 months ago
2
Linking id to DiDeMo video path
#21
ZhangYuanhan-AI
closed
7 months ago
3
Training Warning
#20
Tanveer81
closed
7 months ago
1
About lora duplication
#19
yeppp27
opened
7 months ago
6
You are using a model of type llama to instantiate a model of type VTimeLLM. This is not supported for all configurations of models and can yield errors. ?
#18
dragen1860
closed
7 months ago
1
can I simply query the model to locate the `highlight moment or the best moment` in the video?
#17
dragen1860
closed
7 months ago
2
Moment Localization Evaluation
#16
Tanveer81
closed
8 months ago
0
Are you working on exposing an inference endpoint on huggingface or replicate?
#15
nwaughachukwuma
closed
10 months ago
1
Running VTimeLLM inference Offline
#14
dengandong
closed
10 months ago
0
Main differences between VTimeLLM and LLaVA
#13
itruonghai
closed
10 months ago
1
How good is chatglm3's Chinese comprehension?
#12
lucasjinreal
closed
10 months ago
3
RuntimeError: cu_seqlens_q must have shape (batch_size + 1)
#11
KlayMa527
closed
10 months ago
1
Tokenization mismatch
#10
weiyuan-c
closed
10 months ago
6
Will the test data and code in the paper be released?
#9
hlz0606
closed
10 months ago
1
13B model ?
#8
vhzy
closed
10 months ago
1
Add gradio demo
#7
Rishubi
closed
11 months ago
0
Add gradio demo
#6
Rishubi
closed
11 months ago
0
Update builder.py
#5
Rishubi
closed
11 months ago
0
InternVID training dataset
#4
LengSicong
closed
11 months ago
2
question about missing features?
#3
vhzy
closed
11 months ago
1
Update data.md
#2
Rishubi
closed
11 months ago
0
When will training be available?
#1
vhzy
closed
11 months ago
2