-
Hi, guys! Thanks a lot for the project. I have an issue with downloading the pretrained models using download_models.sh. I've tried different networks, but it fails every time. Do you have another…
-
Thanks in advance!
-
Hi, I am trying to reproduce the reported performance on the MSVD dataset (CIDEr of 120.6), but there is a gap. In my experiment, the first evaluation after initialization is poor, the initializ…
-
Hi,
Thanks for the nice library. I found DALI while looking for a video loader for action recognition. I found that DALI cannot yet handle varying resolutions, as noted in issue #725, which is necessary f…
-
bash eval.sh
The launch script is as follows:
```
#!/bin/bash
DIR="VTG-LLM"
MODEL_DIR="/home1/lw/fyy/VTG-LLM/vtgllm.pth"
# TASK='dvc'
# ANNO_DIR='data/VTG-IT/dense_video_caption/Youcook2'
# VIDEO_DIR='data/youco…
```
-
Hi, how do you get the videos for the YouCook2 dataset, since they only provide annotations? Would I need to download each video from YouTube? Or do you provide embeddings for the videos?
Thanks
-
Hi, thanks for your great work on VideoChat2!
I tried to organize the Ego4D dataset used in the paper, but I found that there are several splits for each video, and the split information is unavail…
-
In the new v1.5 version of https://github.com/Efficient-Large-Model/VILA/blob/main/data_prepare/README.md
there are links to new dataset annotation files, such as
`huggingface-cli download mit-han-lab/vi…
-
Hello author, when I tested the performance of the provided timechat_7b.pth, I found that the measured metrics were lower than the results reported in the paper. I fine-tuned TimeChat according to …
-
Any chance of releasing the weights? I currently lack the compute to train this myself. Thanks!