visual-captioning Search Results

474 results
for visual-captioning


huggingface/transformers #5095

Addition of VisualBERT

# 🌟 New model addition ## Model description The VisualBERT model is used for multi-modal processing when the modes of images and text are present. It takes in object detection features from images, …

gchhablani updated 3 years ago
4
e4exp/paper_manager_abstract #261

Oscar: Object-Semantics Aligned Pre-training for Vision-Lang…

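* https://arxiv.org/abs/2004.06165v5 * 2020 Large-scale pre-training methods that learn cross-modal representations from image-text pairs are becoming popular for vision-language tasks. Existing methods simply concatenate image region features and text features as input to the pre-trained model and rely on self-attention to learn the semantic alignment between image and text in a brute-force manner; in this paper, from images…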

e4exp updated 3 years ago
1
boost-devs/peer-session #101

[DAY 35] Multimodal captioning and speaking & 3D understandi…

## What I learned today - **Multimodal captioning and speaking** - Methods that learn from multiple modalities (text, audio, etc.), not only visual images - **3D understanding** - Understanding computer vision models that can handle 3D space - 3D task,…

bsm8734 updated 3 years ago
1
Gitsamshi/WeakVRD-Captioning #8

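Hello, I would like to ask about constructing the visual relation graph

As described in the paper, passing an image through an object detection network yields n regions, and pairing these n regions gives n(n-1) possible pairs, each of which goes through predicate classification. If a predicate is predicted for every pair of regions, the resulting visual relation graph will be very cluttered. Is a relation between a pair of regions considered to exist only when the probability of the predicted predicate exceeds some threshold?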

n9705 updated 3 years ago
11
e4exp/paper_manager_abstract #262

Transform and Tell: Entity-Aware News Image Captioning

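- https://arxiv.org/abs/2004.08070v2 - CVPR 2020 In this work, we propose an end-to-end model that generates captions for images embedded in news articles. News images pose two key challenges: they rely on real-world knowledge, especially knowledge about named entities. Using a multimodal, multi-head attention mechanism, we…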

e4exp updated 3 years ago
1
w3c/wcag #1658

Carousels fail 2.2.1: Timing Adjustable ?!

2.2.1 is pretty clear about time limits, focussing on carousels which start playing on page load I read: > Turn off > The user is allowed to turn off the time limit before encountering it The…

jake-abma updated 3 years ago
57
dataplat/DataSaturdays #27

Accessibility & Inclusivity

This is some slight overlap with #12 , but I think starting from scratch is a huge opportunity to make things as accessible and inclusive as possible (especially digital content). Some areas to th…

lowlydba updated 3 years ago
9
DirtyHarryLYL/HAKE-Action #72

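Is zero-shot possible?

Is it possible to do zero-shot learning (ZSL), that is, to infer new actions outside these 600 classes directly from the part states? For example, suppose I have new data that does not belong to these 600 classes and there is no labeled training set.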

whqwill updated 4 years ago
5
w3c/wai-presentations2all #23

Terminology

@slhenry It might be good to extend the terminology section, currently only 4 terms. It might be good also to include terminology from different regions such as captions (US) and Subtitles for the Dea…

eoncins updated 4 years ago
4
jwyang/faster-rcnn.pytorch #27

Getting features for each image region

Hello, is it possible to use this implementation to get features for each image region like they do in the bottom-up attention model of [Bottom-Up and Top-Down Attention for Image Captioning and Vi…

claudiogreco updated 4 years ago
9

