huangb23 VTimeLLM issues

huangb23 / VTimeLLM

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

https://arxiv.org/pdf/2311.18445.pdf

Other

226 stars, 11 forks

issues

Repeated Outputs Issue with Shot2Story Dataset Stage3 Fine-tuning

#43 katie312 opened 4 days ago
0
Question about pinning the transformers version to 4.31.0

#42 yourssmile opened 1 week ago
1
Regarding the Second Phase of Training for the VTimeLLM Model

#41 jiyi-zyh opened 3 weeks ago
1
KeyError: 'VTimeLLMConfig'

#40 zipMunk closed 1 month ago
0
Does the LLM use the base model or the chat model?

#39 1SingleFeng closed 2 months ago
1
About second training stage

#38 jjy961228 opened 2 months ago
0
About the DiDeMo dataset

#37 lixuefenfen opened 2 months ago
1
Link to the training data not available

#36 williamium3000 opened 2 months ago
0
About Video Feature Project

#35 xiaokj37 opened 3 months ago
1
About training.

#34 EdenGabriel closed 2 months ago
3
RuntimeError: The size of tensor a (147) must match the size of tensor b (293) at non-singleton dimension 3

#33 Vijaysivadas opened 4 months ago
1
Code to generate Stage 3 dataset from ActivityNet or DiDeMo

#32 anilbatra2185 opened 4 months ago
2
stage2 features

#31 simplewhite9 closed 5 months ago
2
Feature extraction code requirement

#30 L4zyy closed 2 months ago
1
About Activitynet eval process

#29 lixuefenfen closed 2 months ago
3
About evaluation

#28 EdenGabriel opened 6 months ago
0
ID correspondence

#27 wayne3771 closed 6 months ago
2
Low accuracy rate

#26 wayne3771 closed 6 months ago
5
How much time does it take to extract features in stage 2 and what is the hardware used?

#25 Maulog closed 5 months ago
4
NVIDIA Driver error - could not parse ModelProto

#24 simran0112 closed 6 months ago
2
Why did you use only the subset?

#23 MSungK closed 6 months ago
1
Missing intern_clip_feat

#22 Tanveer81 opened 7 months ago
2
Linking id to DiDeMo video path

#21 ZhangYuanhan-AI closed 7 months ago
3
Training Warning

#20 Tanveer81 closed 7 months ago
1
About lora duplication

#19 yeppp27 opened 7 months ago
6
You are using a model of type llama to instantiate a model of type VTimeLLM. This is not supported for all configurations of models and can yield errors. ?

#18 dragen1860 closed 7 months ago
1
can I simply query the model to locate the `highlight moment or the best moment` in the video?

#17 dragen1860 closed 7 months ago
2
Moment Localization Evaluation

#16 Tanveer81 closed 8 months ago
0
Are you working on exposing an inference endpoint on huggingface or replicate?

#15 nwaughachukwuma closed 10 months ago
1
Running VTimeLLM inference Offline

#14 dengandong closed 10 months ago
0
Main differences between VTimeLLM and LLaVA

#13 itruonghai closed 10 months ago
1
How good is chatglm3's Chinese comprehension?

#12 lucasjinreal closed 10 months ago
3
RuntimeError: cu_seqlens_q must have shape (batch_size + 1)

#11 KlayMa527 closed 10 months ago
1
Tokenization mismatch

#10 weiyuan-c closed 10 months ago
6
Will the test data and code in the paper be released?

#9 hlz0606 closed 10 months ago
1
13B model?

#8 vhzy closed 10 months ago
1
Add gradio demo

#7 Rishubi closed 11 months ago
0
Add gradio demo

#6 Rishubi closed 11 months ago
0
Update builder.py

#5 Rishubi closed 11 months ago
0
InternVID training dataset

#4 LengSicong closed 11 months ago
2
question about missing features?

#3 vhzy closed 11 months ago
1
Update data.md

#2 Rishubi closed 11 months ago
0
When will training be available?

#1 vhzy closed 11 months ago
2