-
The HowTo100M + VidChapters-7M + ViTT model is performing poorly on dense video captioning.
Reproduction:
Run:
```
yt-dlp -P $TRANSFORMERS_CACHE -o video.mp4 https://www.youtube.com/watch?v=WJ…
```
-
Hello! Could you please add SALMONN series models?
Title | Venue | Date | Code | Demo
-- | -- | -- | -- | --
[SALMONN: Towards Generic Hearing Abilities for Large Language Models](https://arxiv.o…
-
Great work!
How can I perform the task of generating video captions?
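In case it's useful while waiting for an answer: most video-captioning pipelines start from a uniform frame-sampling step before calling the model. The helper below is a generic sketch, not code from this repo, and the model call itself would come from the repo's own inference scripts.

```python
# A generic helper for the frame-sampling step that most video-captioning
# pipelines start from. Nothing here is specific to this repo; the model
# call itself would come from the repo's own inference scripts.

def uniform_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` frame indices spread evenly across the video."""
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    # Take the midpoint of each of the `num_samples` equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]

# e.g. for a 100-frame clip, sample 8 frames, decode them with a video
# reader of choice, and pass them to the captioning model's forward pass.
indices = uniform_frame_indices(100, 8)
```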
-
Thank you for sharing this amazing work. We are interested in trying the semantics your trained models extract from videos related to action classification. However, we are not sure if this is po…
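For context on what this usually looks like: a common way to use extracted semantics for action classification is a linear probe on top of pooled per-frame features. The sketch below is generic, with random arrays standing in for the model's actual outputs; the 512-dim feature size and 10-class setup are assumptions.

```python
import numpy as np

# Random arrays stand in for the features the trained model would extract;
# the 512-dim feature size and 10 action classes are assumptions.

def pool_and_classify(frame_features, weights, bias):
    """Mean-pool per-frame features over time, then apply a linear probe."""
    clip_feature = frame_features.mean(axis=0)   # (D,) clip-level feature
    return clip_feature @ weights + bias         # (num_classes,) logits

rng = np.random.default_rng(0)
features = rng.standard_normal((16, 512))  # 16 frames x 512-dim features
W = rng.standard_normal((512, 10))         # linear probe for 10 classes
b = np.zeros(10)
logits = pool_and_classify(features, W, b)
```

The probe's weights would normally be trained on labeled clips; the point of the sketch is just that frozen extracted features plus a small classifier is often enough for this kind of transfer.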
-
I am getting an empty string ("") as output and can't figure out the issue.
```
import torch
from videollava.conversation import conv_templates, SeparatorStyle
from videollava.model.builder import load_pre…
```
-
### My actions before raising this issue
- [x] Read/searched [the docs](https://github.com/opencv/cvat/tree/master#documentation)
- [x] Searched [past issues](/issues)
Feature request: recent…
-
Hi, thanks for your great work!
I'm checking out the newly released InternVideo2 model; it's interesting!
I saw the demo.ipynb file in the multi_modality folder; it can calculate text probabilities.
I'm wondering if …
-
I am currently working on improving video transcriptions using the OpenAI API and have successfully integrated a solution that enhances transcription accuracy. However, I believe that extending the fu…
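As one possible direction for such an extension: long transcripts usually need to be split into chunks before being sent to a correction prompt. The helper below is a generic sketch of sentence-boundary chunking; the 2000-character limit is an arbitrary assumption, and the downstream correction call (e.g. sending each chunk to a chat model with a cleanup prompt) is deliberately left out.

```python
# A generic sentence-boundary chunker; the 2000-character limit is an
# arbitrary assumption, and the downstream correction call (e.g. sending
# each chunk to a chat model with a cleanup prompt) is left out.

def chunk_transcript(text: str, max_chars: int = 2000) -> list[str]:
    """Split a transcript into chunks on sentence boundaries, each under max_chars."""
    sentences = text.replace("\n", " ").split(". ")
    chunks, current = [], ""
    for s in sentences:
        piece = s if s.endswith(".") else s + "."
        # Start a new chunk if adding this sentence would exceed the limit.
        if current and len(current) + len(piece) + 1 > max_chars:
            chunks.append(current.strip())
            current = ""
        current += piece + " "
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

A single sentence longer than the limit will still form an oversized chunk; a production version would need a fallback split for that case.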
-
I ran the captioning script:
`python inference.py --video-list inputs/video_list.txt --prompt-list inputs/prompt_list.txt`
and encountered the following issue:
`/root/anaconda3/envs/panda70m_captio…
-
### Proposal summary
## Feature Request
Enable Opik to display additional media formats, including audio, PDF, and video.
## Background
Opik currently supports only image display, which li…