visual-captioning Search Results

illuin-tech/vidore-benchmark #51

Reproducing the Results in Table 2

The provided datasets have four variants, each serving a specific purpose, and contain a `text_description` as described below. E.g., gov: 1. **syntheticDocQA_government_reports_test** – **No text_des…

roipony updated 4 weeks ago

BradyFU/Awesome-Multimodal-Large-Language-Models #184

Add SALMONN, video-SALMONN, video-SALMONN 2

TCL606 updated 1 month ago

tychen-SJTU/MECD-Benchmark #30

Question about the multi-event dataset

Thank you for your meaningful work. I would like to ask how events are defined in the video data. In other words, how is a video segmented into multi-event segments? Thanks

lzc2017 updated 6 days ago

CAMMA-public/SurgVLP #4

How to perform the task of generating video captions

Great work! How can I perform the task of generating video captions?

cascat0 updated 2 weeks ago

Chocobozzz/PeerTube #4505

Support captioning of Lives during broadcast for hugely impr…

**Describe the problem to be solved** Peertube's Live system is a really powerful and useful feature. Peertube's subtitle support for stored videos (and the ability to add them after the fact) is…

shibacomputer updated 2 months ago

huggingface/transformers #34169

Image-Text-to-Text Support in Transformers Pipeline

### Feature request Implement the new feature to support a pipeline that can take both an image and text as inputs, and produce a text output. This would be particularly useful for multi-modal tasks …

chakravarthik27 updated 1 month ago

aimagelab/meshed-memory-transformer #45

Error on testing the network on Windows 10

I'm trying to test the network on my Windows 10 notebook. I configured all the packages, but when the test starts it gives me the following error: Traceback (most recent call last): File "", line 1, in…

marcomameli1992 updated 3 years ago

huggingface/transformers #34704

AttributeError when accessing .logits from BLIP-2 model outp…

### System Info - `transformers` version: 4.47.0.dev0 - Platform: Linux-5.15.0-94-generic-x86_64-with-glibc2.35 - Python version: 3.10.15 - Huggingface_hub version: 0.26.2 - Safetensors version: …

thisisiron updated 2 weeks ago

j-min/CLIP-Caption-Reward #6

Phase 1 validation throws: shape '[4, -1, 512]' is invalid f…

Hello, I have successfully generated all features (both text and visual) for the COCO dataset. However, when running MLE training, the code throws the following error at the moment it starts valida…

vbursztyn updated 2 years ago

adithya-s-k/World-of-AI #27

[ML category based PROJECT PROPOSAL]

## Project Request Image Captioning with Deep Learning This project aims to develop a model that automatically generates descriptive captions for images. --- | Field | Description …

aman-kumar29 updated 1 year ago

472 results for visual-captioning
