RenShuhuai-Andy/TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
https://arxiv.org/abs/2312.02051
BSD 3-Clause "New" or "Revised" License
250 stars
21 forks
issues
Question about the Text Input to the LLM
#42
ShramanPramanick
closed
3 days ago
2
Cannot answer in Chinese
#41
wublubdubdaxml
closed
3 weeks ago
1
Ask for reproducing
#40
HYOJINPARK
opened
3 weeks ago
5
Data type not aligned
#39
KKKLeon
opened
1 month ago
1
Long video test results did not meet expectations
#38
ffiioonnaa
closed
3 weeks ago
2
Discussion : Steps for swapping Llama 2 with Llama 3
#37
rahulkrprajapati
closed
3 weeks ago
1
Weight for QA benchmarks
#36
NIneeeeeem
closed
1 month ago
3
Asking for the Fine-tuned Checkpoint
#35
minjoong507
closed
1 month ago
2
Could you test TimeChat on the EgoSchema dataset?
#34
EricLina
closed
1 month ago
2
Why is the result of temporal video grounding always a multiple of 5?
#33
zhengrongz
closed
2 months ago
5
When fine-tuning on my own dataset, how do I run eval during training?
#32
changqinyao
opened
2 months ago
1
Base transformers version needed for modifying models/modeling_llama.py
#31
yeahjack
opened
2 months ago
3
Questions about the provided fine-tuning model parameters
#30
LanXingXuan
closed
2 months ago
1
Inference with audio
#29
lakshya-frontera
closed
3 weeks ago
2
Question about the output of the time-aware frame encoder
#28
Mingxiao-Li
closed
2 months ago
2
When conducting SFT experiments, setting batch_size_train to 1 or 2 has the same memory usage.
#27
tiesanguaixia
opened
3 months ago
0
Can this model do QA tasks?
#26
leexinhao
closed
1 month ago
2
Question about fine-tuning
#25
zhengxingmao
closed
2 months ago
7
Subset of YT-Temporal
#24
patrick-tssn
closed
3 months ago
1
Question about batch size
#23
gyxxyg
closed
3 months ago
1
The performance is very low on my own dataset.
#22
onlyonewater
closed
3 months ago
5
Experiment-related question
#21
zhaodongliang678
closed
3 months ago
3
Question about prompt
#20
Ironieser
closed
3 months ago
5
Question about the tokenizer
#19
gyxxyg
closed
3 months ago
5
Question about prompts.
#18
gyxxyg
closed
4 months ago
2
What is the relationship between segment and timetoken?
#17
sunwhw
closed
3 months ago
3
Inquiry on training cost
#16
HenryHZY
closed
4 months ago
2
Demo can't show the same result
#15
xiaoxiaoli666
closed
4 months ago
1
Bad performance of Charades
#14
soyeonhong
closed
4 months ago
1
RAM and VRAM requirement
#13
Coronal-Halo
closed
4 months ago
2
Seeking Clarification about Fine-tuning Datasets
#12
ShramanPramanick
closed
5 months ago
2
Details of sliding qformer operation
#11
jihwanp
closed
5 months ago
1
Do we need to crop the HiREST videos?
#10
yeliudev
closed
5 months ago
14
torch.load raises TypeError: 'strict' is an invalid keyword argument for Unpickler()
#9
wwq66
closed
6 months ago
4
The generalization performance is bad when testing on custom videos.
#8
dragen1860
closed
5 months ago
1
Error in loading Video-LLaMA-2-7b_Finetuned
#7
dragen1860
closed
6 months ago
1
How to evaluate on ActivityNet-DVC?
#6
TXH-mercury
closed
6 months ago
3
When will the checkpoint and demo scripts be released?
#5
Hugh0120
closed
6 months ago
2
UnsatisfiableError
#4
LarryLeeee
closed
6 months ago
4
Checkpoints to run demo and dataset
#3
fazlicodes
closed
6 months ago
1
For different video datasets, is the frame density always drawn at intervals of 1 second?
#2
DuoLong
closed
6 months ago
5
Very good video-related work. Would it be possible to open-source the dataset?
#1
Xujianzhong
closed
6 months ago
1