-
The HowTo100M + VidChapters-7M + ViTT model is performing poorly on dense video captioning.
Reproduction:
Run
```
yt-dlp -P $TRANSFORMERS_CACHE -o video.mp4 https://www.youtube.com/watch?v=WJ…
```
-
How do I add region descriptions to an image (grounded image descriptions)? I would like to annotate my images using bounding boxes for the different regions and two or more text descriptions for each…
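For context, here is the kind of annotation format I have in mind: a minimal, Visual Genome-style sketch in which the field names (`image_id`, `regions`, `bbox`, `descriptions`) are illustrative choices of mine, not a schema this repo defines:

```python
import json

# One region = one bounding box plus two or more free-text descriptions.
# All field names below are illustrative, not a required schema.
annotation = {
    "image_id": "img_0001.jpg",
    "regions": [
        {
            "bbox": [120, 80, 240, 160],  # [x, y, width, height] in pixels
            "descriptions": [
                "a brown dog lying on the grass",
                "a sleeping dog in the backyard",
            ],
        },
        {
            "bbox": [300, 40, 90, 210],
            "descriptions": [
                "a wooden fence post",
                "a weathered post at the garden edge",
            ],
        },
    ],
}

with open("img_0001.json", "w") as f:
    json.dump(annotation, f, indent=2)
```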
-
Thank you for the nice work!
Is it possible to use a larger ViT backbone for dense captioning?
Is there a reason that only a ViT-B backbone is available for dense captioning?
Thank you.
-
This seems like nice work. I wanted to test it on custom input videos. It would be very helpful if you could provide a script for generating video captions for a raw input video.
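For reference, this is roughly the script I have in mind, a minimal sketch that samples frames at about 1 FPS with OpenCV; the `load_model` / `generate` calls at the end are hypothetical placeholders, not this repo's actual API:

```python
import cv2  # pip install opencv-python

def sample_frames(video_path, fps=1.0):
    """Decode a raw video and keep roughly `fps` frames per second."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreadable
    step = max(int(round(native_fps / fps)), 1)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        idx += 1
    cap.release()
    return frames

# Hypothetical interface: replace with the repo's real loading and
# generation calls once an official inference script is available.
# model = load_model("checkpoint.pth")
# print(model.generate(sample_frames("my_video.mp4")))
```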
-
Thanks for the great work! I noticed that in the paper you mentioned that
_"We observe that the major limitation of the BLIP-CLIP evaluation is that the BLIP captioning models
do not always descr…
-
Hello, thank you for your work. I would like to ask why you think the task of synchronized subtitles is important. How can it help in action generation and action understanding?
-
## TL;DR
A study that achieves end-to-end video captioning with a Transformer-based model. On the encoder side, the model extracts the events (time ranges) to be captioned from the video; on the decoder side, it generates sentences with a mask applied over each event.
![image](https://user-images.githubusercontent.com/544269/5370…
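As a rough illustration of the masking idea (a conceptual sketch, not the paper's code), the decoder only conditions on encoder features inside the proposed time range:

```python
import torch

def event_mask(num_frames, start, end):
    """Binary temporal mask that keeps only the proposed event span [start, end)."""
    mask = torch.zeros(num_frames)
    mask[start:end] = 1.0
    return mask

# Example: a 100-frame clip with a proposed event covering frames 20..45.
frame_features = torch.randn(100, 512)            # encoder outputs, one per frame
mask = event_mask(100, 20, 45)
masked_features = frame_features * mask[:, None]  # zero out frames outside the event
# The decoder then generates the event's sentence from `masked_features` only.
```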
-
How might EasyAnimate slice a 1080p video? More specifically, at what frame interval does the slicing happen? I assume this relates to the memory requirements for resolutions lower than 1080p.
E…
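For illustration, this is the kind of fixed-interval slicing I am imagining (a generic sketch; `clip_len` and `stride` are made-up parameters, not EasyAnimate's actual configuration):

```python
def slice_video(num_frames, clip_len=16, stride=16):
    """Split a video of `num_frames` frames into fixed-length clips.

    Generic illustration of frame-interval slicing; the real clip length
    and stride used by EasyAnimate are config-dependent and unknown to me.
    """
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]

print(slice_video(100))  # [(0, 16), (16, 32), (32, 48), (48, 64), (64, 80), (80, 96)]
```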
-
When will your group release the code and dataset for dense video object captioning?
-
Hello! Thank you so much for contributing this repo.
I'm very interested in this work, and I'm surveying papers with keywords like "captioning anything" or "instance-level captioning" or "per pi…