-
Hi, how can I add the visual7w dataset for the VQA task? The documentation on adding datasets covers the AVSD task, and I'm not sure how to follow similar steps for a different task. My data has images, que…
-
# URL
- https://arxiv.org/abs/2408.02272
# Affiliations
- Koki Maeda, N/A
- Tosho Hirasawa, N/A
- Atsushi Hashimoto, N/A
- Jun Harashima, N/A
- Leszek Rybicki, N/A
- Yusuke Fukasawa, N/A
…
-
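## Title: To See or to Guess: Counterfactually Regularized Image Captioning
## Link: https://arxiv.org/abs/2408.16809
## Summary:
Image captioning, which describes the content of an image in natural language, is an important task in vision-and-language research. Conventional models have approached this task by bringing machine generation ability closer to human intelligence through statistical fitting to existing datasets…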
-
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
### What would your feature do ?
It will use blip2 models for text desc of i…
-
from config import TF_MODELS_PATH
ImportError: cannot import name 'TF_MODELS_PATH' from 'config' (D:\python3.7\lib\site-packages\config\__init__.py)
What does TF_MODELS_PATH mean? I can't find…
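The traceback above shows `config` resolving to a package under `site-packages`, which suggests (though the post does not confirm it) that an installed package is shadowing a project-local `config.py`. A minimal sketch for checking where a module name resolves; the helper name `where_is` is hypothetical and not part of any project mentioned here:

```python
import importlib.util


def where_is(module_name):
    """Return the file a top-level module name resolves to, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Example with a standard-library package; swap in "config" to inspect
# which file the failing import actually picks up.
json_origin = where_is("json")
missing_origin = where_is("surely_not_a_module_xyz")
print(json_origin, missing_origin)
```

If the reported path is not the project's own file, the name `TF_MODELS_PATH` would have to be defined in the local `config.py` that the repository expects, and that file would need to come first on the import path.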
-
Hello! Thanks for your wonderful work. May I know how to decode GQA pretrained feature files? Specifically, how to convert the base64 encoded features (data in features.tsv) to floating points? Thanks…
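A minimal sketch of the general technique, assuming the encoded bytes are little-endian float32 values laid out row by row. The dtype, the byte order, the per-row dimension, and the helper name `decode_features` are all assumptions; the actual layout of `features.tsv` is not stated in the post and should be checked against the repository.

```python
import base64
import struct


def decode_features(b64_text, dim):
    """Decode a base64 string into rows of `dim` float32 values (assumed layout)."""
    raw = base64.b64decode(b64_text)
    if len(raw) % (4 * dim) != 0:
        raise ValueError("byte length is not a multiple of 4 * dim")
    count = len(raw) // 4  # number of 4-byte floats in the buffer
    flat = struct.unpack(f"<{count}f", raw)  # "<" = little-endian (assumption)
    return [list(flat[i:i + dim]) for i in range(0, count, dim)]


# Round trip on synthetic data: encode four floats, then decode them as 2 rows of 2.
values = [0.5, 1.0, -2.0, 3.25]
encoded = base64.b64encode(struct.pack("<4f", *values)).decode("ascii")
rows = decode_features(encoded, dim=2)
print(rows)
```

If the decoded numbers look implausible (huge magnitudes or NaNs), the dtype or byte order assumption is probably wrong for the real file.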
-
I want to reproduce your results and run the code on the Visual Genome 1.4 dataset. However, I cannot find the corresponding TXT files for train, val, and test when loading the dataset. Can you put these thr…
-
Hello, I ran inference.py with the code and weights you provided, but the results look very different from what is shown. Do you know what might be causing this?
![ima…
-
## Background
We should also make sure that our documentation is kept up to date.
A scan of the open issues in this repo and also on [StackOverflow](https://stackoverflow.com/questions/tag…