large-ocr-model / large-ocr-model.github.io


How can this be used together with qwen_vl? #7

Open zhangjiekui opened 3 months ago

large-ocr-model commented 3 months ago

We run the images from our evaluation dataset through our OCR model to extract the text they contain. The extracted text is then passed to the qwen_vl model together with the image as supplementary input, giving the model a hint about what the image says.
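A minimal sketch of that flow, for illustration only. The names `build_ocr_augmented_prompt`, `answer_with_ocr_hint`, `run_ocr`, and `ask_vlm` are hypothetical and not part of this repository or of qwen_vl; the two callables stand in for whatever OCR model and qwen_vl inference code you actually use, and the prompt wording is an assumption.

```python
from typing import Callable


def build_ocr_augmented_prompt(question: str, ocr_text: str) -> str:
    """Combine the OCR output and the user's question into one text prompt."""
    ocr_text = ocr_text.strip()
    if not ocr_text:
        # Nothing was recognized, so fall back to the plain question.
        return question
    return (
        "OCR text extracted from the image (supplementary hint):\n"
        f"{ocr_text}\n\n"
        f"Question: {question}"
    )


def answer_with_ocr_hint(
    image_path: str,
    question: str,
    run_ocr: Callable[[str], str],       # hypothetical: image path -> recognized text
    ask_vlm: Callable[[str, str], str],  # hypothetical: (image path, prompt) -> answer
) -> str:
    """Run OCR first, then query the vision-language model with image + hint."""
    ocr_text = run_ocr(image_path)
    prompt = build_ocr_augmented_prompt(question, ocr_text)
    return ask_vlm(image_path, prompt)


if __name__ == "__main__":
    # Stub callables so the sketch runs without any model installed.
    fake_ocr = lambda path: "TOTAL 42.00"
    echo_vlm = lambda path, prompt: prompt
    print(answer_with_ocr_hint("receipt.png", "What is the total?", fake_ocr, echo_vlm))
```

Passing the OCR and model calls in as functions keeps the sketch independent of any particular inference API; replace the stubs with real calls in practice.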

Cqjhj commented 3 months ago

> We run the images from our evaluation dataset through our OCR model to extract the text they contain. The extracted text is then passed to the qwen_vl model together with the image as supplementary input, giving the model a hint about what the image says.

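A question: I found that when used this way, the large model's OCR-related answers are constrained by the OCR text in the prompt. How can the large model be made to improve OCR accuracy, for example by correcting the result when the OCR output is clearly wrong? To achieve that, should we take your dataset and do full-parameter fine-tuning on open-vl? Have you tried anything along these lines?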