X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Apache License 2.0
1.12k stars, 68 forks

DocOwl 1.5 position tokens #84

Closed j-min closed 2 weeks ago

j-min commented 2 weeks ago

Hi, it seems like the position indicators such as <global_img>, <|image|>, and <crop_img_row0_col0> are not part of the mPLUG-DocOwl 1.5 tokenizer's vocab. Is this expected?
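
For anyone who wants to reproduce this kind of check, here is a minimal sketch. It assumes only that the tokenizer exposes a `get_vocab()` method returning a token-to-id mapping, as Hugging Face tokenizers do. `ToyTokenizer` and its tiny vocabulary are hypothetical stand-ins so the snippet runs offline; they are not the DocOwl 1.5 vocab, and `missing_from_vocab` is an illustrative helper name, not something from the repo.

```python
from typing import Dict, Iterable, List, Protocol


class HasVocab(Protocol):
    """Anything with a Hugging Face style get_vocab() method."""

    def get_vocab(self) -> Dict[str, int]: ...


def missing_from_vocab(tokenizer: HasVocab, tokens: Iterable[str]) -> List[str]:
    """Return the tokens that are not single entries in the tokenizer's vocab."""
    vocab = tokenizer.get_vocab()
    return [t for t in tokens if t not in vocab]


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer (assumption, for offline use)."""

    def __init__(self, vocab: Dict[str, int]) -> None:
        self._vocab = dict(vocab)

    def get_vocab(self) -> Dict[str, int]:
        return dict(self._vocab)


# The position indicators quoted in this issue.
POSITION_INDICATORS = ["<global_img>", "<|image|>", "<crop_img_row0_col0>"]

# Made-up vocab that only contains subword pieces, not the full indicators.
toy = ToyTokenizer({"<s>": 0, "</s>": 1, "<": 2, "global": 3, "_": 4, "img": 5, ">": 6})

missing = missing_from_vocab(toy, POSITION_INDICATORS)
print(missing)
```

With a real tokenizer loaded through `transformers`, the same helper could be called on it directly. An indicator reported as missing would then be encoded as several ordinary text pieces rather than one dedicated token.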

j-min commented 2 weeks ago

Oh, I just read your Table 4, where you found textual tokens more effective. Interesting! Closing the issue.