-
Hello! Thank you so much for contributing this repo.
I'm very interested in this work, and I'm surveying papers with keywords like "captioning anything" or "instance-level captioning" or "per pi…
-
Hi Oscar Team,
Thanks for the interesting paper and for open-sourcing your model.
On your [download](https://github.com/microsoft/Oscar/blob/master/DOWNLOAD.md) page, you mention that images are fe…
-
## 0. Paper
[Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning](https://arxiv.org/abs/1712.02051v2)
Hongge Chen, Huan Zhang, Pin-Yu Chen, Jinfeng Yi…
-
When running the command `python visual_chatgpt.py --load ImageCaptioning_cpu,Text2Image_cpu`,
the following happens:
![image](https://github.com/microsoft/TaskMatrix/assets/135227066/c8875c6b-c295-40c9-8ee5-c425f009f1c9)
and it finally fails with the error: SSLErr…
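Since the traceback ends in an SSLError, the failure most likely happens while the script downloads model weights from huggingface.co rather than in the captioning code itself. A minimal sketch, assuming that cause, for checking TLS connectivity from the same environment (the URL is only a connectivity probe, not part of the script):
```python
import requests

try:
    requests.get("https://huggingface.co", timeout=10)
    print("TLS handshake with huggingface.co succeeded")
except requests.exceptions.SSLError as err:
    # Typical culprits: a corporate proxy intercepting TLS,
    # or missing/outdated CA certificates in the environment.
    print("SSL problem:", err)
```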
-
Find a suitable encoder-decoder model and start training it on suitable datasets.
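As one concrete starting point (an illustrative assumption, not a prescribed choice), Hugging Face's `VisionEncoderDecoderModel` can stitch a pretrained vision encoder to a pretrained text decoder:
```python
from transformers import AutoTokenizer, VisionEncoderDecoderModel

# Stitch a pretrained ViT encoder to a pretrained GPT-2 decoder
# (both checkpoint choices here are assumptions for illustration).
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k",  # vision encoder
    "gpt2",                               # text decoder
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Bookkeeping the wrapper needs before training/generation.
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Training then passes pixel_values and tokenized caption labels to
# model(...), which returns a cross-entropy loss to optimize.
```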
-
- [ ] [Neural Baby Talk](http://openaccess.thecvf.com/content_cvpr_2018/papers/Lu_Neural_Baby_Talk_CVPR_2018_paper.pdf)
Keywords:
Image captioning
predict template-like sentences
Reference: [Hy…
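A toy illustration of the template idea above (not the authors' code): the language model emits a caption template with visual-word slots, and each slot is filled with the class label of a grounded detection region; all names and values below are made up:
```python
# Toy slot filling in the spirit of Neural Baby Talk.
template = ["A", "<region-1>", "sitting", "at", "a", "<region-2>",
            "with", "a", "<region-3>"]
# Pretend an object detector grounded these regions to these categories.
detected = {"<region-1>": "puppy", "<region-2>": "table", "<region-3>": "cake"}

caption = " ".join(detected.get(token, token) for token in template)
print(caption)  # -> A puppy sitting at a table with a cake
```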
-
I would like to request support for converting the BLIP-2 model to ONNX.
I have tried to convert the model using the torch.onnx.export method, but there are issues, as the input to the forward me…
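For reference, a minimal sketch of one common workaround: export only the vision tower behind a single-tensor wrapper, so `torch.onnx.export` does not have to trace the full model's multi-input, kwargs-based forward. The checkpoint name, input size, and opset here are assumptions:
```python
import torch
from transformers import Blip2ForConditionalGeneration

model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float32
)
model.eval()

class VisionWrapper(torch.nn.Module):
    """Expose the vision tower with a single-tensor forward()."""
    def __init__(self, vision_model):
        super().__init__()
        self.vision_model = vision_model

    def forward(self, pixel_values):
        # Return a plain tensor; dict-style outputs break ONNX tracing.
        return self.vision_model(pixel_values).last_hidden_state

dummy = torch.randn(1, 3, 224, 224)  # BLIP-2's ViT expects 224x224 images
torch.onnx.export(
    VisionWrapper(model.vision_model), dummy, "blip2_vision.onnx",
    input_names=["pixel_values"], output_names=["image_embeds"],
    dynamic_axes={"pixel_values": {0: "batch"}},
    opset_version=17,
)
```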
-
The PaliGemma paper indicates that it supports tasks such as Image Captioning, Visual Question Answering, Detection, and Referring Expression Segmentation.
Can Llama-Factory suppor…
-
Dear coauthors,
- In the pretraining/finetuning stage, for vision-language tasks (especially visual_grounding and caption), can I set the length of the generated tokens? I want a longer generated …
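For context, in Hugging Face-style `generate()` APIs the output length is capped by `max_new_tokens` (or `max_length`), so raising it, optionally with a `length_penalty` above 1 for beam search, yields longer captions. A sketch using a BLIP captioner as a stand-in; the checkpoint, image URL, and values are illustrative:
```python
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
inputs = processor(images=image, return_tensors="pt")

out = model.generate(
    **inputs,
    max_new_tokens=128,   # raise the token budget for longer captions
    num_beams=5,
    length_penalty=1.5,   # >1 biases beam search toward longer sequences
)
print(processor.decode(out[0], skip_special_tokens=True))
```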
-
[paper](https://arxiv.org/pdf/2205.01917.pdf)
## TL;DR
**problem:** building a good vision backbone. Candidates include image pretraining on classification labels; a dual-encoder model that takes image-text pairs and is trained with a contrastive loss; and one with an image encoder and …
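A minimal sketch of the CLIP/CoCa-style contrastive objective mentioned above: for a batch of B image-text pairs, the matched pairs form the diagonal positives of a B×B similarity matrix and every other entry is a negative; the embedding dimension and temperature below are illustrative assumptions:
```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(logits.size(0))           # i-th image <-> i-th text
    # Symmetric cross-entropy: image-to-text and text-to-image directions.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```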