allenai / vila

Incorporating VIsual LAyout Structures for Scientific Text Classification
Apache License 2.0
167 stars 17 forks

In fact, the VILA model only uses 'token' and 'box' to predict; the block is useless #38

Open xsank opened 2 weeks ago

xsank commented 2 weeks ago

As the title says.