X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Apache License 2.0

Why is the inference output garbled after loading the pretrained weights with the parameters given on Hugging Face? #14

Open yaboojia opened 8 months ago

yaboojia commented 8 months ago

I loaded the pretrained model from Hugging Face following the uread approach and gave it a simple image as input: (image: 424777-PE9BDR-101)

The output is unintelligible: (image: iwEdAqNwbmcDAQTRA7oF0QJvBrD8kNF7ARjUZAWLq7Xt_WIAB9MAAAAA8ugD3QgACaJpbQoAC9IAAecr png_720x720q90)

I can't tell what went wrong. Have the parameters changed?

HAWLYQ commented 5 months ago

Hi @yaboojia, we have released an updated version of DocOwl. You can try the new inference code in DocOwl1.5, or test it with the demos on Hugging Face or ModelScope: (image: 538E06FE-FF8D-44D9-B105-209907A69C8F)