Upon investigating it seems that ollama expects a slightly different format for the base64 image data. Ultimately I think ollama should fix this to better align with the OpenAI API format. I'll try to raise more awareness on this.
For now it should work if you manually change this line in your llmcord.py: https://github.com/jakobdylanc/discord-llm-chatbot/blob/4f6971032c9346caa332eeabfc9228b42e342138/llmcord.py#L117 to this:
"image_url": {"url": base64.b64encode(requests.get(att.url).content).decode('utf-8')},
I could make a PR to ollama to fix it, but I'm struggling to find where the OpenAI API image format is documented... Do you have any reference?
I can only find the "Create image" endpoint, but nothing about what's expected when sending images: https://platform.openai.com/docs/api-reference/images/createVariation
This is what you're looking for: https://platform.openai.com/docs/guides/vision
The issue is with the base64 data in the "url" field. OpenAI API expects, for example:
"url": f"data:image/jpeg;base64,{base64_image}"
But ollama expects JUST the base64 data without any prefix info:
"url": base64_image
Ollama's "default" API endpoints (/api/generate and /api/chat) are not OpenAI compatible: https://github.com/ollama/ollama/blob/main/docs/api.md
They're working on an OpenAI compatible endpoint (/v1/chat/completions) but it doesn't support vision yet: https://github.com/ollama/ollama/blob/main/docs/openai.md#supported-features
Once it supports vision, it should fix this problem.
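For reference, ollama's native /api/chat endpoint takes the image data as a separate `images` list on the message rather than inside the content. A hedged sketch based on the linked API docs, assuming a local ollama on the default port with a vision-capable model pulled:

```python
import base64
import requests

# Read an image and encode it as bare base64 (no data URI prefix).
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Call ollama's native chat endpoint; images go in a list on the message.
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llava",
        "messages": [
            {
                "role": "user",
                "content": "What is in this picture?",
                "images": [image_b64],  # bare base64 strings
            }
        ],
        "stream": False,
    },
)
print(response.json()["message"]["content"])
```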
Ok, so this is a hack to support the ollama format through the OpenAI interface. How could I miss it... it was literally the next chapter, sorry and thanks!
No problem! Hopefully ollama addresses this soon. For now I'll keep this issue open for awareness.
Did you get it working with the code change I suggested before?
Oh yes sure, works perfectly!
This issue should now be fixed with the latest version of litellm: pip install -U litellm
@jakobdylanc Can confirm it's working with ollama/llava-phi3.
I just had to also execute pip install Pillow
@mann1x were you getting some kind of error before doing pip install Pillow? This sounds like another litellm issue. If litellm depends on Pillow, shouldn't it install it for you when you do pip install -U litellm?
@jakobdylanc Yes, it was an error from litellm; Pillow wasn't installed by the upgrade. It's probably an optional dependency since it may only be needed to process images.
2024-05-29 16:11:37,798 ERROR: Error while streaming response
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/litellm/llms/ollama.py", line 129, in _convert_image
from PIL import Image
ModuleNotFoundError: No module named 'PIL'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/litellm/main.py", line 2213, in completion
generator = ollama.get_ollama_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/litellm/llms/ollama.py", line 186, in get_ollama_response
data["images"] = [_convert_image(image) for image in images]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/litellm/llms/ollama.py", line 186, in <listcomp>
data["images"] = [_convert_image(image) for image in images]
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/litellm/llms/ollama.py", line 131, in _convert_image
raise Exception(
Exception: ollama image conversion failed please run `pip install Pillow`
Ahh I see, thanks for sharing the error. It indeed looks like they intentionally made it optional. I found the code: https://github.com/BerriAI/litellm/blob/c44970c8134174a05b433b968ab8f46eee5a67a8/litellm/llms/ollama.py#L128-L133
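For anyone curious, the pattern behind that linked code is an import deferred into the function body so the dependency stays optional. A rough sketch of the idea (not litellm's exact code; the conversion step here is purely illustrative):

```python
import base64
import io


def _convert_image(image_b64: str) -> str:
    """Sketch of the optional-dependency pattern: Pillow is imported inside
    the function, so it is only required when an image is actually processed."""
    try:
        from PIL import Image
    except ImportError:
        raise Exception(
            "ollama image conversion failed please run `pip install Pillow`"
        )
    # Illustrative conversion: re-encode the image as JPEG and return base64.
    img = Image.open(io.BytesIO(base64.b64decode(image_b64)))
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG")
    return base64.b64encode(buf.getvalue()).decode("utf-8")
```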
Is sending images to the bot supported using ollama?
This is what I get in the logs using the llava model: