-
I appreciate your work; it helped me a lot to understand the flow of image captioning.
I got stuck in the decoder part. Kindly help me understand and debug it.
You have created the encode…
-
## Adding a Dataset
- **Name:** COCO
- **Description:** COCO is a large-scale object detection, segmentation, and captioning dataset.
- **Paper + website:** https://cocodataset.org/#home
- **Data:…
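As a minimal sketch of what the COCO captioning annotations look like (field names follow the standard `captions_*.json` schema; the image and caption values below are made up for illustration):

```python
import json
from collections import defaultdict

# Minimal sketch of the COCO captions annotation schema.
# Field names match the official captions_*.json layout;
# the concrete ids/filenames/captions here are illustrative only.
annotations = {
    "images": [
        {"id": 42, "file_name": "000000000042.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 1, "image_id": 42, "caption": "A person riding a bicycle."},
        {"id": 2, "image_id": 42, "caption": "A cyclist on a city street."},
    ],
}

# Round-trip through JSON, as you would when reading a real annotation file.
data = json.loads(json.dumps(annotations))

# Group captions by image id -- the basic lookup captioning code needs.
captions_by_image = defaultdict(list)
for ann in data["annotations"]:
    captions_by_image[ann["image_id"]].append(ann["caption"])

print(captions_by_image[42])
```

In practice you would load the real annotation file with `pycocotools`, but the grouping step is the same.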
-
Hi,
I tried to use coco-caption to evaluate my neuraltalk2 results, but this error occurred:
`Loading and preparing results...
Traceback (most recent call last):
File "myeval.py", line 29, in
…
-
Could you please share the requirements.txt or the average_frame_features.pickle file?
Which version of TensorFlow was used?
tensorflow-valueerror-failed-to-convert-a-numpy-array-to-a-tensor-unsupported(n…
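For what it's worth, this `ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type ...)` typically means the features ended up in a ragged `dtype=object` array, e.g. when clips have different frame counts. A hedged sketch of the usual fix (the feature dimension and lengths below are made-up illustration values, not from this repo):

```python
import numpy as np

# Hypothetical ragged frame features: two clips with 3 and 5 frames.
# np.array(ragged) would produce a dtype=object array, which
# tf.convert_to_tensor rejects with exactly this ValueError.
feature_dim = 4
ragged = [np.ones((3, feature_dim)), np.ones((5, feature_dim))]

# Zero-pad every clip to the same number of frames so the result
# stacks into one dense float array that TensorFlow can convert.
max_len = max(f.shape[0] for f in ragged)
padded = np.stack([
    np.pad(f, ((0, max_len - f.shape[0]), (0, 0))) for f in ragged
])

print(padded.shape)  # (2, 5, 4)
```

Whether padding (versus truncating or bucketing by length) is appropriate depends on how the model consumes the features.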
-
Built-in support for accessible content creation.
Web authors need the ability to add captions to videos and transcripts to audio and video content. As many users experience difficulties when readi…
-
Hey BLIP-2 team,
Thanks for your great work! I've been trying to reproduce the BLIP2 COCO ITM fine-tuning using the resources in your repo:
1. [train.py](https://github.com/salesforce/LAVIS/blob…
-
## Title: Look, Compare, Decide: Mitigating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
## Link: https://arxiv.org/abs/2408.17150
## Abstract:
Recently, large vision-language models (LVLMs) have demonstrated remarkable capabilities in multimodal context understanding. However, they suffer from the hallucination problem of generating outputs that contradict the image content. This hallucination…
-
Hi,
Thanks for the great work and publicly available code.
For the TVC dataset, 3 FPS video frames are provided officially due to copyright issues. According to your code, it seems that you use…
-
To do prior to the webinar:
- [x] determine date and time
- [x] determine title
- [x] determine presenter/s
- [x] get abstract from presenter/s
- [x] collect relevant links and reading mate…
-
Make a video demoing the [Geographic Health Survey app](https://docs.odk-x.org/episample-intro/)! There is a good guided tour documented in these docs that you could work through on video. Do not need…