yuyq96 / TextHawk

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
44 stars, 2 forks