long8v / PTIR

Paper Today I Read

[162] CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention #181

Open long8v opened 2 weeks ago

long8v commented 2 weeks ago
image

paper

TL;DR

Details

motivation

image

architecture

image

Attention is computed directly on the features, without any projection, and the result is then multiplied with the features.
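A minimal NumPy sketch of this step, as I read it. The names `f_s` (spatial visual features), `f_t` (text features), and the temperatures `alpha_s` / `alpha_t` are assumptions for illustration and are not taken from this note. The point is that the attention map is just a dot product of the raw features, with no learned Q/K/V projections.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def parameter_free_attention(f_s, f_t, alpha_s=1.0, alpha_t=1.0):
    """Cross-modal attention without learnable projections (sketch).

    f_s: (HW, C) spatial visual features (assumed shape)
    f_t: (K, C)  text features, one per class (assumed shape)
    """
    # Attention map from the un-projected features: (HW, K)
    attn = f_s @ f_t.T
    # Multiply the normalized attention back onto the other modality's features.
    f_s_a = softmax(attn / alpha_s, axis=-1) @ f_t    # (HW, C) text-aware visual
    f_t_a = softmax(attn.T / alpha_t, axis=-1) @ f_s  # (K, C)  visual-aware text
    return f_s_a, f_t_a

rng = np.random.default_rng(0)
f_s = rng.normal(size=(49, 8))  # e.g. a 7x7 feature map with 8 channels
f_t = rng.normal(size=(5, 8))   # 5 class prompts
f_s_a, f_t_a = parameter_free_attention(f_s, f_t)
print(f_s_a.shape, f_t_a.shape)
```

Each updated row is a convex combination of the other modality's features, so no new parameters are introduced.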

image image

The final prediction is a weighted sum over the outputs obtained by aggregating the two modalities in this way.
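A hedged sketch of the weighted sum. The three-term form and the weights `betas` are my assumption about how the aggregation looks (original logits, logits with attention-updated text features, logits with attention-updated visual features); the note itself only says "weighted sum". All names are hypothetical.

```python
import numpy as np

def weighted_sum_logits(f_v, f_t, f_v_a, f_t_a, betas=(1.0, 1.0, 1.0)):
    """Combine original and attention-updated features into class logits (sketch).

    f_v:   (C,)   global visual feature
    f_t:   (K, C) text features
    f_v_a: (C,)   pooled, attention-updated visual feature (assumed)
    f_t_a: (K, C) attention-updated text features (assumed)
    """
    b1, b2, b3 = betas
    return (
        b1 * (f_v @ f_t.T)      # original zero-shot logits
        + b2 * (f_v @ f_t_a.T)  # visual vs. updated text
        + b3 * (f_v_a @ f_t.T)  # updated visual vs. text
    )

f_v = np.array([1.0, 0.0])
f_t = np.array([[1.0, 0.0], [0.0, 1.0]])
f_v_a = np.array([0.0, 1.0])
f_t_a = np.array([[0.0, 1.0], [1.0, 0.0]])
logits = weighted_sum_logits(f_v, f_t, f_v_a, f_t_a, betas=(1.0, 2.0, 3.0))
print(logits)  # [1. 5.]
```

Setting `betas=(1, 0, 0)` recovers plain zero-shot similarity, which makes the contribution of the attention terms easy to ablate.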

image image

Result

image