paperswithlove / papers-we-read


TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document #2

Open soohwan-hyun opened 3 months ago


Link: arXiv 2403.04473

Code: https://github.com/Yuliang-Liu/Monkey

Summary from GPTs

[Screenshot, 2024-03-12 11:39:57 AM]

Methodology

Token Resampler

[3 images]
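The figures for this section did not survive, so here is a rough sketch of the token-filtering idea usually associated with a token resampler: score each visual token by how similar it is to its nearest neighbour and keep the least redundant ones. This is an assumption-laden illustration, not the paper's code. The function names `cosine` and `filter_tokens`, the use of max cosine similarity as the redundancy score, and the tie-breaking rule are all hypothetical, and the cross-attention step that would aggregate the remaining tokens is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_tokens(tokens, keep):
    """Return indices of the `keep` least redundant tokens, in original order.

    Assumption: a token's redundancy is its highest cosine similarity
    to any other token. Ties are broken by the lower index.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    n = len(tokens)
    if keep >= n:
        return list(range(n))
    redundancy = [
        max(
            (cosine(tokens[i], tokens[j]) for j in range(n) if j != i),
            default=0.0,
        )
        for i in range(n)
    ]
    # Lowest redundancy first; index as a deterministic tie-breaker.
    ranked = sorted(range(n), key=lambda i: (redundancy[i], i))
    return sorted(ranked[:keep])


if __name__ == "__main__":
    # Toy 2-D "tokens": the first two are near-duplicates.
    toy = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [-1.0, 0.0]]
    print(filter_tokens(toy, 3))
```

In the toy input the first two vectors are almost identical, so only one of them is kept when three tokens are requested. A real model would run this on high-dimensional image features and then let the kept tokens attend over the full set.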

Experiments

[8 images]