-
- https://arxiv.org/abs/2109.12178
- 2021
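Vision-and-language pre-training (VLP) improves model performance on downstream tasks that require image and text inputs.
Current VLP approaches differ in
(i) model architecture (especially the image embedder),
(ii) loss functions, and
(iii) masking policy.
As for image embedders, ResNet…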
e4exp updated
2 years ago
-
Thanks for the awesome Grounding-DINO, I share our recent work 🦖OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.
* OV-DINO is a novel unified open vocabulary detecti…
-
- https://arxiv.org/abs/2103.06561
- 2021
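In recent years, multimodal pre-training models that aim to bridge vision and language have been studied intensively.
However, most of these models explicitly model the cross-modal interaction between image-text pairs by assuming a strong correlation between the text and the image.
This strong assumption does not hold in real-world scenarios…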
e4exp updated
3 years ago
-
Thanks for the awesome GLIP, I share our recent work 🦖OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.
* OV-DINO is a novel unified open vocabulary detection approac…
-
Thanks for the awesome YOLO-World, I share our recent work 🦖OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.
* OV-DINO is a novel unified open vocabulary detection …
-
You will see the problem in the text below. This is with gpt-4o and version 0.5 of agent zero, but I have similar issues with other models.
User message ('e' to leave):
> Write a college level …
-
[https://arxiv.org/pdf/2404.06512.pdf](https://arxiv.org/pdf/2404.06512.pdf)
[https://github.com/InternLM/InternLM-XComposer](https://github.com/InternLM/InternLM-XComposer)
### preview
- Health checkup …
-
As mentioned in the paper, you use 20% of the training data (around 16M*0.2 = 3.2M) to train the model. I have some questions about it.
Previously, the baseline model ABINet consists of three stages: vision …
-
Thank you very much for your excellent work! We have already run the model using the demo and found that the Theia model's feature extraction visualization was not as good as individ…
-
Thanks for your awesome work in model merging! I'm excited about the improvements you achieved compared to other merging methods. However, I saw that the individually fine-tuned models still outperform WEM…