-
Hi, thanks for your work.
I would like to ask what the "time" field in breakpoint mode means in your JSON file.
And when you evaluate LLM-based video understanding models like VideoChat, …
-
Hello team,
Thank you for this great work on video evaluation. Could you add my new benchmark to the evaluation benchmarks?
[[Project Page](https://vision-cair.github.io/InfiniBench/)] [[Code](htt…
-
## 0. Paper
### Title
Long-Term Feature Banks for Detailed Video Understanding
### Link
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_Long-Term_Feature_Banks_for_Detailed_Video_Underst…
-
Hello, in the evaluation leaderboard, the value in the 'Input' column is usually 'n frm', for example '16 frm'. How should I interpret this value? Are 16 frames sampled from the entire video as input for the e…
-
Hi, thanks for the great work and the quick release of the code!
I have a question regarding the memory module used in Spann3r. I have noticed that you use a similar approach to XMem originally desi…
-
### Use Cases
I would like to request a feature. Since Patreon began storing videos on the platform itself, it is not possible to download them. I am a patron of many producers and I wish…
-
### CI Number
5389
### Duplicates
- [X] I have searched the existing issues
### Latest version
- [X] The issue is in the develop branch
- [X] The issue is in the latest released 4.1.x
### Descri…
-
*Sent by Google Scholar Alerts (scholaralerts-noreply@google.com). Created by [fire](https://fire.fundersclub.com/).*
---
### [PDF] [Attention Prompting on Image for Large Vision-Language…
-
https://github.com/OpenBMB/MiniCPM-V/blob/a209258d851f404485e5ae25864417dff3bb74ca/eval_mm/vlmevalkit/vlmeval/dataset/videomme.py The code says 8 frames are used per video, but the leaderboard says (htt…
-
### Model description
MovieChat proposes a Vision Foundation model + LLM + Long short-term memory-based solution to long-range video understanding addressing computation, memory, and long-range tempo…