OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0

[Ollama] Ollama model has significantly lower answer quality than the online demo #375

Closed — ChristianWeyer closed this issue 1 month ago

ChristianWeyer commented 1 month ago

Referring to https://github.com/OpenBMB/ollama/issues/3#issuecomment-2260209553

:-)

Thanks @tc-mb!

hvico commented 1 month ago

Hi! Same here: I tried to reproduce the online demo's inference using the llama.cpp fork, and the quality is much worse (with the same parameters). Since Ollama is based on llama.cpp, I would think these two problems are the same. I tried both the Q4 and FP16 GGUF versions. Thanks!

tc-mb commented 1 month ago

@ChristianWeyer @hvico Sorry, we have found the problem. There were some inconsistencies between our Python code and our original design. Our C++ code was written according to that design, but because the Python implementation differs slightly, we could only reach the same accuracy level by modifying the C++ code (in this commit). I have restored the accuracy in the official branch. You can use https://github.com/OpenBMB/llama.cpp/tree/prepare-PR-of-minicpm-v2.5 — this branch gives the C++ accuracy closest to the Python version. Regarding Ollama, I promise to resolve it tomorrow.
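For readers wanting to try the fixed branch, a minimal sketch of the build-and-run steps follows. The binary name `minicpmv-cli`, the GGUF filenames, and the sampling setup are assumptions modeled on llama.cpp's llava-style examples, not confirmed by this thread:

```shell
# Clone only the accuracy-fixed branch of the OpenBMB llama.cpp fork
git clone -b prepare-PR-of-minicpm-v2.5 --depth 1 https://github.com/OpenBMB/llama.cpp
cd llama.cpp
make   # builds the example binaries; enable GPU backends as needed

# Run the multimodal CLI against a local image.
# All paths and model filenames below are placeholders.
./minicpmv-cli \
  -m ./models/ggml-model-Q4_K_M.gguf \
  --mmproj ./models/mmproj-model-f16.gguf \
  --image ./test.jpg \
  -p "What is in this image?"
```

If quality still lags the online demo, comparing against the FP16 GGUF (as hvico did above) helps rule out quantization as the cause.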

ChristianWeyer commented 1 month ago

Where can we find the updated stuff @tc-mb @Cuiunbo ?

tc-mb commented 1 month ago

> Where can we find the updated stuff @tc-mb @Cuiunbo ?

I have modified it; with MiniCPM-V 2.5 you should now get good enough results. I am sorry that I did not let you know sooner after making the change.

ChristianWeyer commented 1 month ago

> Where can we find the updated stuff @tc-mb @Cuiunbo ?

> I have modified it; with MiniCPM-V 2.5 you should now get good enough results. I am sorry that I did not let you know sooner after making the change.

What has changed? What does the setup look like now? 🙂 Which tools and which model file?

I'd still like to use Ollama.