RenShuhuai-Andy TimeChat issues

RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

https://arxiv.org/abs/2312.02051

BSD 3-Clause "New" or "Revised" License

250 stars, 21 forks

issues

Question about the Text Input to the LLM

#42 ShramanPramanick closed 3 days ago
2
Cannot answer in Chinese

#41 wublubdubdaxml closed 3 weeks ago
1
Ask for reproducing

#40 HYOJINPARK opened 3 weeks ago
5
Data type not aligned

#39 KKKLeon opened 1 month ago
1
Long video test results did not meet expectations

#38 ffiioonnaa closed 3 weeks ago
2
Discussion: Steps for swapping Llama 2 with Llama 3

#37 rahulkrprajapati closed 3 weeks ago
1
Weight for QA benchmarks

#36 NIneeeeeem closed 1 month ago
3
Asking for the Fine-tuned Checkpoint

#35 minjoong507 closed 1 month ago
2
Could you test TimeChat on the EgoSchema dataset?

#34 EricLina closed 1 month ago
2
Why is the result of temporal video grounding always a multiple of 5?

#33 zhengrongz closed 2 months ago
5
When fine-tuning on my own dataset, how can I run eval during training?

#32 changqinyao opened 2 months ago
1
Base transformers version needed for modifying models/modeling_llama.py

#31 yeahjack opened 2 months ago
3
Questions about the provided fine-tuning model parameters

#30 LanXingXuan closed 2 months ago
1
Inference with audio

#29 lakshya-frontera closed 3 weeks ago
2
Question about the output of the time-aware frame encoder

#28 Mingxiao-Li closed 2 months ago
2
When conducting SFT experiments, setting batch_size_train to 1 or 2 has the same memory usage.

#27 tiesanguaixia opened 3 months ago
0
Can this model do QA tasks?

#26 leexinhao closed 1 month ago
2
Question about fine-tuning

#25 zhengxingmao closed 2 months ago
7
Subset of YT-Temporal

#24 patrick-tssn closed 3 months ago
1
Question about batch size

#23 gyxxyg closed 3 months ago
1
The performance is very low on my own dataset.

#22 onlyonewater closed 3 months ago
5
Experiment-related question

#21 zhaodongliang678 closed 3 months ago
3
Question about prompt

#20 Ironieser closed 3 months ago
5
Question about the tokenizer

#19 gyxxyg closed 3 months ago
5
Question about prompts.

#18 gyxxyg closed 4 months ago
2
What is the relationship between segment and timetoken?

#17 sunwhw closed 3 months ago
3
Inquiry on training cost

#16 HenryHZY closed 4 months ago
2
Demo can't show the same result

#15 xiaoxiaoli666 closed 4 months ago
1
Bad performance of Charades

#14 soyeonhong closed 4 months ago
1
RAM and VRAM requirement

#13 Coronal-Halo closed 4 months ago
2
Seeking Clarification about Fine-tuning Datasets

#12 ShramanPramanick closed 5 months ago
2
Details of sliding qformer operation

#11 jihwanp closed 5 months ago
1
Do we need to crop the HiREST videos?

#10 yeliudev closed 5 months ago
14
torch.load raises TypeError: 'strict' is an invalid keyword argument for Unpickler()

#9 wwq66 closed 6 months ago
4
The generalization performance is bad when testing on custom videos.

#8 dragen1860 closed 5 months ago
1
Error in loading Video-LLaMA-2-7b_Finetuned

#7 dragen1860 closed 6 months ago
1
How to evaluate on ActivityNet-DVC?

#6 TXH-mercury closed 6 months ago
3
When will the checkpoint and demo scripts be released?

#5 Hugh0120 closed 6 months ago
2
UnsatisfiableError

#4 LarryLeeee closed 6 months ago
4
Checkpoints to run demo and dataset

#3 fazlicodes closed 6 months ago
1
For different video datasets, is the frame density always drawn at intervals of 1 second?

#2 DuoLong closed 6 months ago
5
Very good video-related work. Would it be possible to open-source the dataset?

#1 Xujianzhong closed 6 months ago
1