OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0

Is ollama supported? #97

Closed catzqaz closed 3 months ago

duanshuaimin commented 3 months ago

ollama support would be very useful for beginners, thanks

hexf00 commented 3 months ago

I'm also in need of this, and I'm confident that integrating with ollama will lead to widespread distribution and usage.

yunzhichen commented 3 months ago

How do I load the int4 build of version 2.5 in ollama? I currently see two .safetensors files. Do they need to be merged, and if so, how? Thank you very much.

yuanjie-ai commented 3 months ago

Please support mainstream deployment options like ollama.

tyzero commented 3 months ago

ollama only supports importing models in GGUF format. PyTorch and safetensors models need to be converted to GGUF before they can be imported. For the steps, see: https://github.com/ollama/ollama/blob/main/docs/import.md

Looking into it, the safetensors files have to be converted manually.
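
For reference, the generic safetensors-to-GGUF flow with llama.cpp looks roughly like the sketch below (paths are illustrative, script and binary names are as of mid-2024, and as the next comment shows, the stock converter does not recognize the MiniCPMV architecture):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
make
# convert the HF checkpoint (safetensors) to an f16 gguf
python convert-hf-to-gguf.py /path/to/MiniCPM-Llama3-V-2_5 --outfile model-f16.gguf
# optionally quantize to Q4_K_M
./quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M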

tyzero commented 3 months ago

raise NotImplementedError(f'Architecture {arch!r} not supported!') from None

NotImplementedError: Architecture 'MiniCPMV' not supported!

It seems the conversion does not work...

leeaction commented 3 months ago

Having the gguf file alone is not enough; you also need to build Ollama's backend, llama.cpp.

Cuiunbo commented 3 months ago

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more detail.

and here is our model in gguf format. https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf @duanshuaimin @leeaction @tyzero @hexf00
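
For convenience, the gguf files can be fetched with the Hugging Face CLI; a minimal sketch, assuming huggingface_hub is installed and using the filenames listed in that repo:

pip install -U huggingface_hub
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5-gguf ggml-model-Q4_K_M.gguf --local-dir .
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5-gguf mmproj-model-f16.gguf --local-dir .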

duanshuaimin commented 3 months ago

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more detail.

and here is our model in gguf format. https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf @duanshuaimin @leeaction @tyzero @hexf00

thanks for your hard work

yuanjie-ai commented 3 months ago

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more detail.

and here is our model in gguf format. https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf @duanshuaimin @leeaction @tyzero @hexf00

Is the mmproj-model-f16.gguf file needed?

kotaxyz commented 3 months ago

why does it hallucinate like that

https://github.com/OpenBMB/MiniCPM-V/assets/105466290/adb052b0-d1e7-4b08-bf7d-298c8f662ff6

tc-mb commented 3 months ago

Ollama does not directly support every model. I will keep working on an ollama fork this week, so that the community can use ollama run MiniCPMV2.5.

tc-mb commented 3 months ago

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more detail. and here is our model in gguf format. https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf @duanshuaimin @leeaction @tyzero @hexf00

Is the mmproj-model-f16.gguf file needed?

Yes, it is needed; this file describes the image (visual) part of the multimodal model.
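
For illustration, with the forked llama.cpp the projector file is passed separately from the language gguf; a sketch based on the minicpmv-cli command shown later in this thread (image path and sampling options are illustrative):

./minicpmv-cli -m ./ggml-model-Q4_K_M.gguf \
    --mmproj ./mmproj-model-f16.gguf \
    -c 4096 --temp 0.7 \
    --image ./example.png -p "What is in the image?"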

tc-mb commented 3 months ago

why does it hallucinate like that

Video_2024-05-24_044143.mp4

It seems to be because mmproj-model-f16.gguf is not used, which makes the model lose its visual input.

I will make a version of ollama that supports MiniCPMV with instructions for use ASAP.

duanshuaimin commented 3 months ago

create an ollama Modelfile like:

FROM ./ggml-model-Q4_K_M.gguf
TEMPLATE "{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>

and execute the command:

ollama create xxxx(model name) -f modelfile

so you can run ollama locally

doc: https://github.com/ollama/ollama/blob/main/docs/modelfile.md https://github.com/ollama/ollama/blob/main/docs/api.md
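
A usage sketch of the two commands above (the model name is illustrative); note that this Modelfile only references the language gguf, and later comments in this thread add a second FROM line for mmproj-model-f16.gguf to enable the visual part:

ollama create minicpm-v2.5 -f Modelfile
ollama run minicpm-v2.5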

duanshuaimin commented 3 months ago

why does it hallucinate like that Video_2024-05-24_044143.mp4

It seems to be because mmproj-model-f16.gguf is not used, which makes the model lose its visual input.

I will make a version of ollama that supports MiniCPMV with instructions for use ASAP.

Please prioritize ollama support when releasing new versions.

seasoncool commented 3 months ago

Could you release an F16 gguf as well? The Q4 quantization does not fully use the GPU memory.

We tried converting it ourselves following the readme, but the results were not very good; please advise.

(screenshots attached)

leeaction commented 3 months ago

create an ollama Modelfile like:

FROM ./ggml-model-Q4_K_M.gguf
TEMPLATE "{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>

and execute the command: ollama create xxxx(model name) -f modelfile so you can run ollama locally

doc: https://github.com/ollama/ollama/blob/main/docs/modelfile.md https://github.com/ollama/ollama/blob/main/docs/api.md

Thank you for your awesome work. I tried this on ollama and ran into an issue.

I used the ollama chat API with image data.

The request is:

curl http://127.0.0.1:11434/api/chat -d '{
  "model": "mini-cpm-v2",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "what is in this image?",
      "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2u
uuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
    }
  ]
}'

but I get the response like this:

{"model":"mini-cpm-v2","created_at":"2024-05-24T06:35:43.745508672Z","message":{"role":"assistant","content":"As an AI language model, I cannot see images directly. However, if you provide me with a description of the image or its content, I can try to assist you with your query."},"done_reason":"stop","done":true,"total_duration":12474661490,"load_duration":8495227,"prompt_eval_duration":329466000,"eval_count":39,"eval_duration":12132736000}

MiniCPM-Llama3-V-2_5-gguf is a multimodal model, but it does not seem to be working correctly here.

Cuiunbo commented 3 months ago

Could you release an F16 gguf as well? The Q4 quantization does not fully use the GPU memory.

We tried converting it ourselves following the readme, but the results were not very good; please advise.

It looks like you're using ollama rather than llama.cpp directly. The current ollama path cannot accept the image features yet; you can wait for our modification, or you are welcome to implement it yourself!

seasoncool commented 3 months ago

@Cuiunbo thanks for your response, we will test with llama.cpp and report the results later.

Cuiunbo commented 3 months ago

@seasoncool https://github.com/OpenBMB/llama.cpp Please use our fork; the official llama.cpp has not merged our PR yet.

yuanjie-ai commented 3 months ago

Is there a tutorial?


Cuiunbo commented 3 months ago

https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5 @yuanjie-ai

chrischjh commented 3 months ago

@Cuiunbo Following this guide, I deployed successfully but got an error when trying to interact with it. Intel-chip Mac. CleanShot 2024-05-25 at 20 37 28@2x

seasoncool commented 3 months ago

@Cuiunbo thanks for your response, we will test with llama.cpp and report the results later.

I have tested two scenarios multiple times, and the text recognition rate is not very good.

iShot_2024-05-26_10 08 40 iShot_2024-05-26_10 02 13

Cuiunbo commented 3 months ago

We have noticed some reported issues caused by replacing MiniCPM-Llama3-V 2.5's adaptive visual encoding with Ollama & llama.cpp's vanilla fixed-encoding implementation. We are reimplementing this part for Ollama & llama.cpp to fully support MiniCPM-Llama3-V 2.5's features and fix the issue. The update should be available within a day. Please stay tuned! @seasoncool

We-IOT commented 3 months ago

Loading minicpm-v2.5 in ollama fails with: Error: llama runner process has terminated: signal: aborted (core dumped)

Screenshot 2024-05-26 at 17 49 42

I re-downloaded and rebuilt OpenBMB's forks of ollama and llama.cpp, and it works now, but this ollama cannot use my previous models, and I also cannot pull them; it always fails at 100%.

pxz2016 commented 3 months ago

go build . fails with: build github.com/ollama/ollama: cannot load cmp: malformed module path "cmp": missing dot in first path element

pxz2016 commented 3 months ago

It was a Go version issue; the build now succeeds. But running the model fails: ./ollama run minicpm-v2.5

你好 Error: an unknown error was encountered while running the model

pxz2016 commented 3 months ago

@Cuiunbo Following this guide, I deployed successfully but got an error when trying to interact with it. Intel-chip Mac. CleanShot 2024-05-25 at 20 37 28@2x

@chrischjh I ran into the same problem. Did you solve it? Ubuntu 20.04.6

chrischjh commented 3 months ago

@pxz2016 Not solved. So it seems it is not a problem with my Mac; it really is an issue. I did not get any errors during installation and compilation.

Cuiunbo commented 3 months ago

We've noticed that some people are having trouble using our fork of ollama or llama.cpp. We'll be releasing an FAQ and a multi-image tutorial this week~

Hazard-Mount commented 3 months ago

My MiniCPM-V2.5 runs with the Modelfile example https://github.com/OpenBMB/ollama/blob/minicpm-v2.5/examples/minicpm-v2.5/Modelfile. It answers the question initially, but then keeps asking and answering itself and does not automatically stop and end the answer.

yuanjie-ai commented 3 months ago

Is it a stop token issue?
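
For reference, the stop tokens already appear as PARAMETER lines in the Modelfile examples in this thread; a sketch of the relevant lines:

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"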


LiHtCocoa commented 3 months ago

(screenshot of the error attached)

The above error occurred. Process: I tried installing on Windows 11 using Docker, with the image that bundles open-webui and ollama:

docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

I then directly ran the model installation as shown in the Running section here.

tc-mb commented 3 months ago

@Cuiunbo Following this guide, I deployed successfully but got an error when trying to interact with it. Intel-chip Mac. CleanShot 2024-05-25 at 20 37 28@2x

I noticed that you are still using ollama here. Did you start serve with the newly built ollama? The problem seems to be that what is actually running is still the original official ollama on the machine, not our fork. This difference will cause the model to fail.

tc-mb commented 3 months ago

It was a Go version issue; the build now succeeds. But running the model fails: ./ollama run minicpm-v2.5

你好 Error: an unknown error was encountered while running the model

Hi, could you confirm that you started our version of the ollama server when running this command? It is the command from the readme; you usually need to open a separate terminal and run it as the ollama server: ./ollama serve. If you are already running it this way and still get the error, feel free to report back in this issue. Also, please pull the latest code and remember to update the Modelfile; I changed the original order.
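
A minimal sketch of the intended workflow (run both commands from the directory of the freshly built fork):

# Terminal 1: start the forked ollama build as the server
./ollama serve

# Terminal 2: create/run the model against that server
./ollama run minicpm-v2.5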

leeaction commented 3 months ago

We've noticed that some people are having trouble using our fork of ollama or llama.cpp. We'll be releasing an FAQ and a multi-image tutorial this week~

Hi, I saw in the readme that it is already fully supported on Ollama. Do you have a milestone for when the PR can be merged into the Ollama main tree?

LiHtCocoa commented 3 months ago

It was a Go version issue; the build now succeeds. But running the model fails: ./ollama run minicpm-v2.5

你好 Error: an unknown error was encountered while running the model

Hi, could you confirm that you started our version of the ollama server when running this command? It is the command from the readme; you usually need to open a separate terminal and run it as the ollama server: ./ollama serve. If you are already running it this way and still get the error, feel free to report back in this issue. Also, please pull the latest code and remember to update the Modelfile; I changed the original order.

Here is the complete console output:

# ollama rm minicpm-v2.5
deleted 'minicpm-v2.5'
# ollama list
NAME            ID              SIZE    MODIFIED       
qwen:latest     d53d04290064    2.3 GB  12 minutes ago
qwen:4b-chat    d53d04290064    2.3 GB  23 hours ago  
# more Modelfile
FROM /home/llm/models/ggml-model-Q4_K_M.gguf
FROM /home/llm/models/mmproj-model-f16.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER num_keep 4
PARAMETER num_ctx 2048
# ollama list
NAME            ID              SIZE    MODIFIED       
qwen:latest     d53d04290064    2.3 GB  12 minutes ago
qwen:4b-chat    d53d04290064    2.3 GB  23 hours ago  
# ls -l $PWD/*
-rwxr-xr-x 1 root root        502 May 27 13:43 /home/llm/models/Modelfile
-rwxr-xr-x 1 root root 4921246752 May 27 10:06 /home/llm/models/ggml-model-Q4_K_M.gguf
-rwxr-xr-x 1 root root 1032132992 May 27 13:09 /home/llm/models/mmproj-model-f16.gguf
# ollama create minicpm-v2.5 -f Modelfile
transferring model data 
using existing layer sha256:010ec3ba94cb5ad2d9c8f95f46f01c6d80f83deab9df0a0831334ea45afff3e2 
using existing layer sha256:391d11736c3cd24a90417c47b0c88975e86918fcddb1b00494c4d715b08af13e 
creating new layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f 
creating new layer sha256:2c527a8fcba5865389ee84c9f4d34ed4bb2370d40da78e7fe27dfb3046793997 
creating new layer sha256:d269859f94d0a3ebb6f5bc1d56a3d7130e4f30585cdbb4b475863ddd26bc72c3 
writing manifest 
success 
# ollama run minicpm-v2.5
Error: llama runner process has terminated: signal: aborted 

I installed the open-webui:ollama image and confirmed that ollama could run the qwen model, then built the minicpm-v2.5 model from the gguf files as shown above, with no other steps. I hope this provides enough information to help resolve the issue.


The above seems to be the same problem as in the previous reply: I am not using the forked version of ollama, but that means I cannot deploy this project with the Docker image. My build on Windows also failed; if possible, I hope a Docker-based deployment can be provided. The Windows build error is as follows:

PS D:\AI\LLM\ollama> go generate ./...
Already on 'minicpm-v2.5'
Your branch is up to date with 'origin/minicpm-v2.5'.
Submodule path '../llama.cpp': checked out 'd8974b8ea61e1268a4cad27f4f6e2cde3c5d1370'
Checking for MinGW...

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     gcc.exe                                            0.0.0.0    D:\Tools\mingw64\bin\gcc.exe
Application     mingw32-make.exe                                   0.0.0.0    D:\Tools\mingw64\bin\mingw32-make.exe
Building static library
generating config with: cmake -S ../llama.cpp -B ../build/windows/amd64_static -G MinGW Makefiles -DCMAKE_C_COMPILER=gcc.exe -DCMAKE_CXX_COMPILER=g++.exe -DBUILD_SHARED_LIBS=off -DLLAMA_NATIVE=off -DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_AVX512=off -DLLAMA_F16C=off -DLLAMA_FMA=off
cmake version 3.29.3

CMake suite maintained and supported by Kitware (kitware.com/cmake).
-- ccache found, compilation results will be cached. Disable with LLAMA_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- x86 detected
-- Configuring done (0.2s)
-- Generating done (1.5s)
-- Build files have been written to: D:/AI/LLM/ollama/llm/build/windows/amd64_static
building with: cmake --build ../build/windows/amd64_static --config Release --target llama --target ggml
[ 50%] Built target ggml
[100%] Built target llama
[100%] Built target ggml
Building LCD CPU
generating config with: cmake -S ../llama.cpp -B ../build/windows/amd64/cpu -DCMAKE_POSITION_INDEPENDENT_CODE=on -A x64 -DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_AVX512=off -DLLAMA_FMA=off -DLLAMA_F16C=off -DBUILD_SHARED_LIBS=on -DLLAMA_NATIVE=off -DLLAMA_SERVER_VERBOSE=off -DCMAKE_BUILD_TYPE=Release
cmake version 3.29.3

CMake suite maintained and supported by Kitware (kitware.com/cmake).
CMake Error at CMakeLists.txt:2 (project):
  Generator

    Ninja

  does not support platform specification, but platform

    x64

  was specified.

CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
llm\generate\generate_windows.go:3: running "powershell": exit status 1

The mingw64 version I am using is x86_64-8.1.0-release-posix-seh-rt_v6-rev0.7z. I tried the following, and none of them worked:

  1. Using the CMake GUI to change the generator to MinGW Makefiles
  2. Deleting the build directory and re-running go generate ./...
  3. Updating gcc to 14.1.0

chrischjh commented 3 months ago

@tc-mb Still the same, using ./ollama serve and ./ollama run minicpm-v2.5

CleanShot 2024-05-28 at 19 13 54@2x

chrischjh commented 3 months ago

I built it following https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md and then ran a test:

./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/ggml-model-f16.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image ../resources/cert.png -p "What is in the image?"

It errors out saying Metal is not supported, but the official llama.cpp documentation says Metal is supported.

CleanShot 2024-05-28 at 22 40 48@2x

pxz2016 commented 3 months ago

@pxz2016 Not solved. So it seems it is not a problem with my Mac; it really is an issue. I did not get any errors during installation and compilation.

After pulling the latest code, it now runs normally.

pxz2016 commented 3 months ago

It was a Go version issue; the build now succeeds. But running the model fails: ./ollama run minicpm-v2.5

你好 Error: an unknown error was encountered while running the model

Hi, could you confirm that you started our version of the ollama server when running this command? It is the command from the readme; you usually need to open a separate terminal and run it as the ollama server: ./ollama serve. If you are already running it this way and still get the error, feel free to report back in this issue. Also, please pull the latest code and remember to update the Modelfile; I changed the original order.

After pulling the latest version, it now runs normally, thanks.

tc-mb commented 3 months ago

We've noticed that some people are having trouble using our fork of ollama or llama.cpp. We'll be releasing an FAQ and a multi-image tutorial this week~

Hi, I saw in the readme that it is already fully supported on Ollama. Do you have a milestone for when the PR can be merged into the Ollama main tree?

Ollama relies on llama.cpp. We submitted a PR adding minicpm-v2.5 support to llama.cpp a few hours ago, and we will submit a PR to ollama in the next few days.

tc-mb commented 3 months ago

(quoting the Go build report and full console output from the previous comments above)

Sorry, for now minicpm-v2.5 can only be used with our forked code. We submitted a PR adding minicpm-v2.5 support to llama.cpp a few hours ago, and we will also submit a PR to ollama in the next few days.

tc-mb commented 3 months ago

I built it following https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md and then ran a test:

./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/ggml-model-f16.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image ../resources/cert.png -p "What is in the image?"

It errors out saying Metal is not supported, but the official llama.cpp documentation says Metal is supported.

CleanShot 2024-05-28 at 22 40 48@2x

Thanks for the feedback. This is probably because the llama.cpp version at the time I forked did not yet support it.

If you want to keep trying, you can try the branch I prepared for the PR, which pulls in the latest official code: https://github.com/OpenBMB/llama.cpp/tree/prepare-PR-of-minicpm-v2.5

Of course, you can also wait until our PR is merged and use the official version; our PR is here: https://github.com/ggerganov/llama.cpp/pull/7599

tc-mb commented 3 months ago

It was a Go version issue; the build now succeeds. But running the model fails: ./ollama run minicpm-v2.5

你好 Error: an unknown error was encountered while running the model

Hi, could you confirm that you started our version of the ollama server when running this command? It is the command from the readme; you usually need to open a separate terminal and run it as the ollama server: ./ollama serve. If you are already running it this way and still get the error, feel free to report back in this issue. Also, please pull the latest code and remember to update the Modelfile; I changed the original order.

After pulling the latest version, it now runs normally, thanks.

Great. If you run into any other problems, feel free to keep filing issues; I will reply as soon as possible. ^_^

chrischjh commented 3 months ago

ollama works now, but it does not output the recognition result correctly. I am using ggml-model-Q4_K_M.gguf. Testing on Hugging Face works fine.

CleanShot 2024-05-29 at 16 14 31@2x

tc-mb commented 3 months ago

ollama works now, but it does not output the recognition result correctly. I am using ggml-model-Q4_K_M.gguf. Testing on Hugging Face works fine.

CleanShot 2024-05-29 at 16 14 31@2x

I took a small screenshot to test, and my local (Mac) results look normal. Even if the answer had errors or omissions, it should not result in no reply or garbled output. So it may be a device/environment difference, or an encoding/decoding problem caused by another component. If convenient, could you share your machine or environment details?

(screenshot attached)

chrischjh commented 3 months ago

@tc-mb It seems to be an open-webui problem. I am not sure where these wrapper UIs fall short; I switched to other wrappers and they also had various issues, working only intermittently, sometimes returning a result and sometimes returning nothing at all, which really hampers testing. Could you tell me the exact usage of ./ollama run test (I cannot seem to find it in the ollama docs)? I tried it and got an error:

CleanShot 2024-05-29 at 18 25 40@2x