-
I am using bge-m3,
candi_emb_1 = model.encode(text="The Mid-Hudson Bridge, spanning the Hudson River between Poughkeepsie and Highland.", image="./imgs/wiki_candi_1.jpg")
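A minimal sketch of how a candidate embedding like the one above might be scored against a query, assuming the embeddings have been converted to plain lists of floats. The vectors below are hypothetical placeholders, not real model outputs, and `cosine_similarity` and `rank_candidates` are illustrative helpers, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(query_emb, candidate_embs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_emb, c))
              for i, c in enumerate(candidate_embs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder vectors standing in for real embeddings.
query_emb = [1.0, 0.0, 1.0]
candi_emb_1 = [0.9, 0.1, 0.8]
candi_emb_2 = [0.0, 1.0, 0.0]

ranking = rank_candidates(query_emb, [candi_emb_1, candi_emb_2])
for index, score in ranking:
    print(f"candidate {index + 1}: {score:.4f}")
```

If the embeddings are already normalized, the dot product alone gives the same ordering.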
-
- https://arxiv.org/abs/2103.04037
- 2021
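The Transformer architecture has brought a fundamental change to computational linguistics, a field that had long been dominated by recurrent neural networks.
Its success has also dramatically changed cross-modal tasks in language and vision, and many researchers are already working on this problem.
In this paper, the field's most important mile…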
e4exp updated
3 years ago
-
## FIRE (Fine-grained Image-text Retrieval with Explicit focus on semantic objects)
- Existing baseline methods use image/text representations that focus on semantic objects only implicitly
- Adds an extra module on top of the image/text to E…
-
Selected Weibo content
-
A place to collect papers that I have mostly only skimmed.
I was organizing them in Notion, but adding links there was difficult, so I moved them here.
-
1. With the parameters provided in the README, training does not converge. I observed the following in the log:
1. 05/20/2022 00:18:31 - INFO - Weight doesn't exsits. xxx/modules/cross-base/cross_pytorch_model.bin
2. 05/20/2022 00:18:42 - INFO - Weights from pretrained model…
-
Hi Lin, I managed to write a fine-tuning script. Could you help me check it? I am also confused about some details, listed below (and marked with NOTE in the code comments). Could you clarify them? …
-
Thanks for sharing this great work and dataset!
I have two questions about the paper.
First of all, I think the authors mainly follow the losses and architecture of ALBEF. However, CTP does not use the ITM loss…
-
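Table detection
> Which regions are tables and which are not (text, charts)
Table structure recognition
> Which parts are the table name, title, header, rows and columns, and the cell grid structure
Table data semantic extraction
> table interpretation: rediscovering the meaning of the tabular structure. This includes: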
(a) functional analysis: deter…