long8v / PTIR

Paper Today I Read

[32] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision #37

Open long8v opened 2 years ago

long8v commented 2 years ago

paper, code

TL;DR

Details

image