Open ChristianWeyer opened 1 month ago
What is the issue?
I created an Ollama model (for the fp16 GGUF) based on this: https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5
When testing one of my sample forms images, I get bad/wrong results when running the model locally via Ollama.
./ollama run minicpm-v2.5
>>> How is the Ending Balance? ./credit-card-statement.jpg
Added image './credit-card-statement.jpg'
The Ending Balance is 8,010.
I get the perfect and correct answers when using the same forms image in the online demo: https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5
What can we do to get the same quality here locally?
OS
macOS
GPU
Apple
CPU
Apple
Ollama version
Latest git commit (367ec3f)
I will test it and reply to you as soon as possible.
Thanks for releasing an Ollama-compatible version so quickly! After testing on macOS I ran into the same problem: across multiple test images, the local model hallucinates so badly that it is almost unusable. I'm not sure what has gone wrong.
If we can get this model to run locally with the same quality as the online demo, this will be killer!
There was an image-encoding bug in the previous code; the result above is from the latest model run, and it looks okay.
Nice. Do I need to download the updated GGUF @tc-mb ?
No. I haven't changed gguf.
OK, what exactly do I have to do now to test the changes? Thx!
I think you should re-follow the README: pull the code, rebuild, update the Modelfile, and run ollama. Remember to modify the Modelfile, since the model's input order has changed. If you have any questions, feel free to ask me. I will reply ASAP.
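The steps above can be sketched roughly as follows. The directory layout, branch name, and Modelfile path are assumptions based on the linked fork, not verified commands:

```shell
# Sketch of the update steps: pull, rebuild, recreate the model.
# "your_ollama_dir" and the Modelfile path are placeholders/assumptions.
cd your_ollama_dir                # checkout of the OpenBMB ollama fork
git checkout minicpm-v2.5
git pull                          # pick up the image-encoding fix
go generate ./...                 # regenerate the bundled llama.cpp bindings
go build .                        # rebuild the ollama binary
# Recreate the model so the updated Modelfile (changed input order) takes effect
./ollama create minicpm-v2.5 -f examples/minicpm-v2.5/Modelfile
```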
@tc-mb OK, I pulled the latest commit. Built, tested...
It is much better, but still not correct. Hmm...
I also tested with other forms images; it was better, but it always got something wrong in the end.
Sorry, the previous code had a bug in image encoding, which I have now fixed. Below is the result of testing again on a screenshot of your image; it doesn't look too bad, but there are still flaws.
Also, we had previously only uploaded a single quantized version, ggml-model-Q4_K_M.gguf; we are still uploading more GGUF precision versions. Evaluating the accuracy of quantized models exported via llama.cpp requires a C++ rewrite, so unlike the Python version we have not yet evaluated every dataset. We will continue to validate the performance of the quantized models exported from llama.cpp and will publish the results later, so the community can pick a suitable GGUF version.
@tc-mb So, I think, I do have the latest code now.
Are you saying that I need a new F16 GGUF from https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf/tree/main - and just have to wait until it has been uploaded...?
You can continue to use the previous gguf. We haven't updated the previously released precision versions.
Because llama.cpp supports 20-30 different precision versions, we had only exported 2 before. Someone asked for more versions, and we are still uploading them one after another.
OK, cool.
So, any idea what might still be wrong, given that I do not get the correct results?
I think we can look at it from two points first.
1. git branch
If it's convenient, could you check the branches and versions of the ollama and llama.cpp checkouts you're using?
ollama and llama.cpp are interdependent; with a mismatched version everything still runs, but accuracy is seriously degraded.
You can use this command to view the git history of the current folder and see where the latest HEAD is:
git log --oneline
Here are the results on my side, which you can compare against.
ollama: it should be in your_ollama_dir
llama.cpp: it should be in your_ollama_dir/llm/llama.cpp
2. Original image
Maybe it's a difference in the image. You could send the original image, and I'll check whether it differs from the screenshot I used.
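The two version checks above can be run as follows; `your_ollama_dir` is the placeholder from the reply, and `-1` just limits the output to the latest commit:

```shell
# Print the commit each checkout is on, to compare hashes between machines.
# Directory names follow the reply above and are placeholders.
cd your_ollama_dir
git log --oneline -1              # latest commit of the ollama fork
cd llm/llama.cpp
git log --oneline -1              # latest commit of the bundled llama.cpp fork
```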
git log --oneline
Same for me:
All the images I am testing give wrong results. I can of course send you the images. Where and how?
Something is still quite wrong @tc-mb:
The demo instance (https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) got this all right.
I am happy to help make MiniCPM-Llama3-V-2_5 the best local VLM on earth @tc-mb 🙂. What further kinds of tests could we do to improve the quality of the results?
Thanks for the quick response and the fixes! I just re-pulled all the code and rebuilt and reran it as instructed. Since the replies above said the GGUF files were not updated, I did not re-download them; I used the Q8_0 GGUF I quantized from f16 myself two days ago, together with the previous mmproj file. With those files the results have improved, but there is still a gap compared with the online version. Could it be due to the Q8 quantization?
I have indeed found the same: running the minicpm-v model locally with ollama performs much worse than the demo. I'm not sure whether the GGUF conversion lost something or whether it's an image-encoding problem; I'm going to run the native model locally to check.
Update: running the demo locally with the native model works well. Running it with either llama.cpp or ollama produces fairly severe hallucinations or wrong results.
Could you share the branches of the two forked repos and the Modelfile you are using? I'd like to double-check.
The Modelfile is as follows:
FROM ../MiniCPM-V-2_5/model/ggml-model-Q4_K_M.gguf
FROM ../MiniCPM-V-2_5/mmproj-model-f16.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER num_keep 4
PARAMETER num_ctx 2048
One more note: on the minicpm-v2.5 branch of ollama I rebuilt the Docker image. The build itself went fine, but after the container starts, running a model created from the Modelfile hangs: GPU memory is allocated, yet nvitop shows the GPU is not actually being used. Running the llama3 model works normally.
(venv) root@DESKTOP-MEGRI2B:/mnt/h/ollama# git branch
* minicpm-v2.5
Because one place in the previous code was misaligned, the image (vision) part had a bug; I have since fixed the code. Could you run the command below: git log --oneline. Or pull the latest code. If you have confirmed you were already running the latest code, you can post some of the images in the issue and I will try them locally.
These are the results from running the demo and from running ollama, respectively.
Maybe I'm not following; where is the problem? The four words matcha, coconut, granola, crisp correspond exactly to 抹茶, 椰子, 燕麦, 脆. The answer seems correct? It's just different from the demo's answer?
Sorry, I was sorting out the code yesterday; the PR was submitted to the llama.cpp maintainers a few hours ago. You can post the picture in the issue, or send it to my email "caitinachi@modelbest.cn".
I'm not sure whether ollama's streaming feature conflicts with the way I wrote the code; it will take longer to find out.
You can try to ask questions separately in the following ways to see if there is a problem with each answer.
./ollama create minicpmv -f openbmb/Modelfile
./ollama run minicpmv "{your question}" {image_path}
For example:
./ollama run minicpmv "How is the Ending Balance?" /Users/a0/Pictures/20240528-012828.jpeg
Here we go :-).
Ah, OK. Where does the model name go in the command...?
OK, I have edited my reply above. A command to create the model has been added, which should make it easier to follow.
Uh, this is completely nuts...
./ollama run minicpm-v2.5:latest "How is the Ending Balance" "./credit-card-statement.jpg"
Added image './credit-card-statement.jpg'
The Ending value is 0.
The image path is not enclosed in double quotes. This is the format defined by ollama.
OK.
The result is completely wrong :-).
./ollama run minicpm-v2.5:latest "How is the Ending Balance" ./credit-card-statement.jpg
Added image './credit-card-statement.jpg'
3,448.10
... do we have any idea where to go from here @tc-mb ? Thanks!