-
I appreciate your work; it helped me a lot to understand the flow of image captioning.
I got stuck in the decoder part. Kindly help me understand and debug it.
You have created the encode…
-
## Adding a Dataset
- **Name:** COCO
- **Description:** COCO is a large-scale object detection, segmentation, and captioning dataset.
- **Paper + website:** https://cocodataset.org/#home
- **Data:…
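As a minimal sketch of what the COCO captioning annotations look like (field names follow the standard `captions_*.json` schema; the image and caption values below are made up for illustration):

```python
import json
from collections import defaultdict

# Minimal sketch of the COCO captions annotation schema.
# Field names match the official captions_*.json layout;
# the concrete ids/filenames/captions here are illustrative only.
annotations = {
    "images": [
        {"id": 42, "file_name": "000000000042.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 1, "image_id": 42, "caption": "A person riding a bicycle."},
        {"id": 2, "image_id": 42, "caption": "A cyclist on a city street."},
    ],
}

# Round-trip through JSON, as you would when reading a real annotation file.
data = json.loads(json.dumps(annotations))

# Group captions by image id -- the basic lookup captioning code needs.
captions_by_image = defaultdict(list)
for ann in data["annotations"]:
    captions_by_image[ann["image_id"]].append(ann["caption"])

print(captions_by_image[42])
```

In practice you would load the real annotation file with `pycocotools`, but the grouping step is the same.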
-
Hi,
I tried to use coco-caption to evaluate my neuraltalk2 results, but this error occurred:
`Loading and preparing results...
Traceback (most recent call last):
File "myeval.py", line 29, in
…
-
Could you please share the requirements.txt or the average_frame_features.pickle file?
Which version of TensorFlow was used?
tensorflow-valueerror-failed-to-convert-a-numpy-array-to-a-tensor-unsupported(n…
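For what it's worth, this `ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type ...)` typically means the features ended up in a ragged `dtype=object` array, e.g. when clips have different frame counts. A hedged sketch of the usual fix (the feature dimension and lengths below are made-up illustration values, not from this repo):

```python
import numpy as np

# Hypothetical ragged frame features: two clips with 3 and 5 frames.
# np.array(ragged) would produce a dtype=object array, which
# tf.convert_to_tensor rejects with exactly this ValueError.
feature_dim = 4
ragged = [np.ones((3, feature_dim)), np.ones((5, feature_dim))]

# Zero-pad every clip to the same number of frames so the result
# stacks into one dense float array that TensorFlow can convert.
max_len = max(f.shape[0] for f in ragged)
padded = np.stack([
    np.pad(f, ((0, max_len - f.shape[0]), (0, 0))) for f in ragged
])

print(padded.shape)  # (2, 5, 4)
```

Whether padding (versus truncating or bucketing by length) is appropriate depends on how the model consumes the features.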
-
Built-in support for accessible content creation.
Web authors need the ability to add captions to videos and transcripts to audio and video content. As many users experience difficulties when readi…
-
Hey BLIP-2 team,
Thanks for your great work! I've been trying to reproduce the BLIP2 COCO ITM fine-tuning using the resources in your repo:
1. [train.py](https://github.com/salesforce/LAVIS/blob…
-
## Title: Look, Compare, Decide: Mitigating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
## Link: https://arxiv.org/abs/2408.17150
## Abstract:
Recently, large vision-language models (LVLMs) have demonstrated remarkable capabilities in multimodal context understanding. However, they suffer from the hallucination problem of generating outputs that contradict the image content. This hallucination…
-
Hi,
Thanks for the great work and publicly available code.
For the TVC dataset, 3 FPS video frames are provided officially due to copyright issues. According to your code, it seems that you use…
-
To do prior to the webinar:
- [x] determine date and time
- [x] determine title
- [x] determine presenter/s
- [x] get abstract from presenter/s
- [x] collect relevant links and reading mate…
-
Make a video demoing the [Geographic Health Survey app](https://docs.odk-x.org/episample-intro/)! There is a good guided tour documented in these docs that you could work through on video. Do not need…