-
CogView has these fine-tuning abilities:
* Image to text
* Image text score
* Super-resolution
I think they are all pretty cool and seem simple enough in the paper.
I wonder if we could implemen…
-
In SpringRoll v2, we've been pulling out a lot of core things in favor of making the library smaller and more flexible for end developers. However, a lot of devs do rely on some of the built-in function…
-
# Feature request
I couldn't figure out what you use for captioning, but it often fails to recognize simple words. \
Is it possible to use OpenAI's Whisper? It's a lightweight, free and open-sourc…
-
Hi all,
I am currently doing a small project on image captioning. I came across QRNN and thought of replacing LSTM with QRNN. Everything was working fine with LSTM with longer training times but as…
-
[Region-Focused Network for Dense Captioning](https://github.com/VILAN-Lab/DesCap) is no longer accessible. I hope you can share the code for this paper.
-
First of all, thank you for making ShellCaster! It is very simple and intuitive, I like it.
Here are three suggestions to improve it:
- The details pane should be scrollable. Some podcasts come with…
-
## 🚀 Feature
Add `BaryScore`
### Sources:
- Paper: [Automatic Text Evaluation through the Lens of Wasserstein Barycenters](https://aclanthology.org/2021.emnlp-main.817.pdf) (EMNLP '21)
- [Rep…
-
Could this work caption everything without any interaction, like SAM?
-
#1. Title
Comment: **Can you clarify? Would users title their post like they do on Reddit?
This needs a bit more insight from journaling tools to see if title is a likely attribute to be used or if …
-
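## In a nutshell
Proposes Grad-CAM, a visualization method for the outputs of CNN models. It visualizes the regions of the input image that the model attends to. It can also be applied to image captioning and visual question answering (VQA) models.
The operations performed are as follows.
(1) Create a saliency map obtained via guided backpropagation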
(2) target c…