Is there a good local model for recognizing image content?

labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.

https://tryfastgpt.ai


Is there a good local model for recognizing image content? #2792

Open SDAIer opened 2 months ago

SDAIer commented 2 months ago

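I'm using FastGPT with One API to call local Ollama models or Xinference models. I've downloaded several multimodal models, but none of them recognize images accurately. Is there a better model for recognizing image content?

I've heard that qwen2-vl works well, but Ollama doesn't support it yet, and downloading it through Xinference keeps failing with errors.

I've tested all of the following, and none performed well: minicpm-v:8b, llava:13b, bakllava, blackened/llama-3-8b-gpt-4o-ru1.0:latest, gemma2:27b, llava-llama3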
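For anyone who wants to compare these models on the same image outside of FastGPT, here is a minimal sketch that sends one image to Ollama's `/api/generate` endpoint. It assumes Ollama is running at its default local address and that the model has already been pulled. The helper names `build_payload` and `describe_image`, the prompt text, and the timeout are illustrative choices, not anything taken from FastGPT or One API.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, image_bytes: bytes, prompt: str) -> dict:
    """Build a non-streaming generate request carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(model: str, image_path: str,
                   prompt: str = "Describe this image in detail.",
                   url: str = OLLAMA_URL) -> str:
    """Send the image to a local multimodal model and return its text answer."""
    with open(image_path, "rb") as f:
        payload = build_payload(model, f.read(), prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Large models can be slow on first load, so the timeout is generous.
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image("minicpm-v:8b", "test.png")` and then `describe_image("llava:13b", "test.png")` (with `test.png` being any local image of your own) gives a side-by-side comparison that rules out the FastGPT and One API layers as the cause of poor results.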

sevenclockseven commented 2 months ago

I'm using minicpm-v:8b-2.6-fp16 and it seems decent. At any rate, it's much better than llava:34b.

SDAIer commented 2 months ago

Thanks, I'll find time to test it.


sunk926 commented 2 months ago

Isn't qwen2.5 7b currently widely considered to have the best recognition quality among models of comparable size?