-
I read in the README that PaliGemma can caption a short video. Can anyone guide me on how to do that?
Does it extract every frame of the video? Or does the PaliGemma tokenizer directly support video…
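
For context on the "every frame" question: the following is a minimal sketch of uniform frame sampling, a common alternative to feeding every frame to an image-based model. It does not describe what PaliGemma does internally; the function name `sample_frame_indices` and its parameters are hypothetical and chosen only for illustration.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Return up to num_samples frame indices spread evenly over a video.

    Hypothetical helper for illustration; not part of any PaliGemma API.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Fewer frames than requested samples: keep every frame.
        return list(range(total_frames))
    # Split the video into num_samples equal-width bins and take each midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]
```

The selected indices could then be used to decode only those frames with whatever video reader is available, so that a short clip is reduced to a handful of still images.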
-
Hey @Ino-Ichan
Thanks so much for your work!
Does GIT-LLM support video as input, as the original GIT2 does?
-
Thank you for your meaningful work.
I would like to ask how the events are defined in the video data. In other words, how is a video segmented into multiple event segments? Thanks.
-
The HowTo100M + VidChapters-7M + ViTT model is performing poorly on dense video captioning.
Reproduction:
Run
```
yt-dlp -P $TRANSFORMERS_CACHE -o video.mp4 https://www.youtube.com/watch?v=WJ…
```
-
Thanks for the awesome work!
I'm interested in video captioning. Could you share the captioning checkpoint?
Thanks a lot.
-
The Lightning talks usually consist of multiple talks by different speakers about different topics. I think we should split up the lightning talks video into multiple individual "talks" on RubyVideo. …
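
A minimal sketch of the proposed split, assuming each talk's start time within the recording is known. The `TalkSegment` type, its fields, and `split_by_start_times` are hypothetical names for illustration and do not reflect RubyVideo's actual data model.

```python
from dataclasses import dataclass


@dataclass
class TalkSegment:
    """One individual talk inside a longer lightning-talks recording."""
    title: str
    speaker: str
    start_seconds: int
    end_seconds: int


def split_by_start_times(entries, video_duration_seconds):
    """Turn (title, speaker, start_seconds) tuples into bounded segments.

    Each talk is assumed to end where the next one starts; the last talk
    ends at the end of the video.
    """
    ordered = sorted(entries, key=lambda entry: entry[2])
    segments = []
    for i, (title, speaker, start) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][2]
        else:
            end = video_duration_seconds
        segments.append(TalkSegment(title, speaker, start, end))
    return segments
```

Each resulting segment could then be listed as its own talk while still pointing at the same source video with a start offset.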
-
Thanks for your awesome work and for generously open-sourcing it!
Could you please provide the code for batch inference on simple tasks like video captioning? It would be very useful for testing.
Sincerely hope f…
-
@iory could you add a video captioning node?
FYI: @a-ichikura
-
Hello! Could you please add SALMONN series models?
Title | Venue | Date | Code | Demo
-- | -- | -- | -- | --
[SALMONN: Towards Generic Hearing Abilities for Large Language Models](https://arxiv.o…
-
Hi, when the input is video clips, how is the video captioning score calculated?
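
The metric used by the project in question is not stated here. As general background, captioning is often scored by comparing a generated caption against reference captions with n-gram overlap metrics such as BLEU, METEOR, or CIDEr. The sketch below computes only clipped unigram precision, the simplest building block of BLEU, and is an illustration rather than that project's scoring code; the function name is hypothetical.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, references: list[str]) -> float:
    """Fraction of candidate words that also appear in the references.

    Each word is credited at most as many times as it occurs in the
    single reference where it is most frequent (BLEU-style clipping).
    """
    cand_counts = Counter(candidate.lower().split())
    if not cand_counts:
        return 0.0
    # For every word, record its highest count in any one reference.
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref.lower().split()).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)
    matched = sum(
        min(count, max_ref_counts[token]) for token, count in cand_counts.items()
    )
    return matched / sum(cand_counts.values())
```

For video clips, one caption per clip would typically be scored against that clip's reference captions and the results averaged over the dataset; whether this matches the project's evaluation is an assumption.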