matatonic / openedai-vision

An OpenAI API-compatible server for chat with image input and questions about the images (i.e. multimodal).
GNU Affero General Public License v3.0

simple valid base64 image results in incorrect padding error #4

Closed saket424 closed 1 month ago

saket424 commented 1 month ago

The same image works fine with LM Studio but results in a 500 error in openedai-vision.

Input

[
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "what do you see in the image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
                }
            }
        ]
    }
]

Output from LM Studio

{"id":"chatcmpl-m6h6759cqvl7mby9i1gz","object":"chat.completion","created":1720801907,"model":"moondream/moondream2-gguf/moondream2-text-model-f16.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"In this image, there are four squares of varying sizes, all rendered in shades of gray and white. They appear to be placed side by side with some distance between them, creating a sense of depth and space within the design. The colors used for each square range from pure black to more muted tones, adding contrast and visual interest to the overall composition. This image seems to be a digital illustration or an artistic representation of an abstract pattern that could potentially be used in various mediums such as graphic design or interior decorating."},"finish_reason":"stop"}],"usage":{"prompt_tokens":42,"completion_tokens":105,"total_tokens":147}}

Output from openedai-vision

(venvlcpp) anand@hsti4090:~/openedai-vision$ docker compose up
[+] Running 2/1
 ✔ Network openedai-vision_default     Created                                                                                                                                           0.2s 
 ✔ Container openedai-vision-server-1  Created                                                                                                                                           0.1s 
Attaching to server-1
server-1  | 2024-07-12 16:47:58.053 | INFO     | __main__:<module>:143 - Loading VisionQnA[moondream2] with vikhyatk/moondream2
server-1  | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
server-1  | You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
server-1  | 2024-07-12 16:48:00.058 | INFO     | vision_qna:loaded_banner:92 - Loaded vikhyatk/moondream2 on device: cuda:0 with dtype: torch.bfloat16
server-1  | INFO:     Started server process [7]
server-1  | INFO:     Waiting for application startup.
server-1  | INFO:     Application startup complete.
server-1  | INFO:     Uvicorn running on http://0.0.0.0:5006 (Press CTRL+C to quit)
server-1  | INFO:     172.21.0.1:39040 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
server-1  | ERROR:    Exception in ASGI application
server-1  |   + Exception Group Traceback (most recent call last):
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 87, in collapse_excgroups
server-1  |   |     yield
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 190, in __call__
server-1  |   |     async with anyio.create_task_group() as task_group:
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
server-1  |   |     raise BaseExceptionGroup(
server-1  |   | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
server-1  |   +-+---------------- 1 ----------------
server-1  |     | Traceback (most recent call last):
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
server-1  |     |     result = await app(  # type: ignore[func-returns-value]
server-1  |     |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
server-1  |     |     return await self.app(scope, receive, send)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
server-1  |     |     await super().__call__(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
server-1  |     |     await self.middleware_stack(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
server-1  |     |     await self.app(scope, receive, _send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 189, in __call__
server-1  |     |     with collapse_excgroups():
server-1  |     |   File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
server-1  |     |     self.gen.throw(typ, value, traceback)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 191, in __call__
server-1  |     |     response = await self.dispatch_func(request, call_next)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/openedai.py", line 126, in log_requests
server-1  |     |     response = await call_next(request)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 165, in call_next
server-1  |     |     raise app_exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 151, in coro
server-1  |     |     await self.app(scope, receive_or_disconnect, send_no_error)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
server-1  |     |     await self.app(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
server-1  |     |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     |     await app(scope, receive, sender)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
server-1  |     |     await self.middleware_stack(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
server-1  |     |     await route.handle(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
server-1  |     |     await self.app(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
server-1  |     |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     |     await app(scope, receive, sender)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
server-1  |     |     response = await func(request)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
server-1  |     |     raw_response = await run_endpoint_function(
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
server-1  |     |     return await dependant.call(**values)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision.py", line 87, in vision_chat_completions
server-1  |     |     text = await vision_qna.chat_with_images(request)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 116, in chat_with_images
server-1  |     |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 116, in <listcomp>
server-1  |     |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/backend/moondream2.py", line 30, in stream_chat_with_images
server-1  |     |     images, prompt = await prompt_from_messages(request.messages, self.format)
server-1  |     |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 704, in prompt_from_messages
server-1  |     |     return await known_formats[format](messages)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 229, in phi15_prompt_from_messages
server-1  |     |     img_data = await url_handler(c.image_url.url)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 178, in url_to_image
server-1  |     |     img_data = DataURI(img_url).data
server-1  |     |                ^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 85, in __new__
server-1  |     |     uri._parse  # Trigger any ValueErrors on instantiation.
server-1  |     |     ^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 148, in _parse
server-1  |     |     data = decode64(_data)
server-1  |     |            ^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/base64.py", line 88, in b64decode
server-1  |     |     return binascii.a2b_base64(s, strict_mode=validate)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     | binascii.Error: Incorrect padding
server-1  |     +------------------------------------
server-1  | 
server-1  | During handling of the above exception, another exception occurred:
server-1  | 
server-1  | Traceback (most recent call last):
server-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
server-1  |     result = await app(  # type: ignore[func-returns-value]
server-1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
server-1  |     return await self.app(scope, receive, send)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
server-1  |     await super().__call__(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
server-1  |     await self.middleware_stack(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
server-1  |     await self.app(scope, receive, _send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 189, in __call__
server-1  |     with collapse_excgroups():
server-1  |   File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
server-1  |     self.gen.throw(typ, value, traceback)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 191, in __call__
server-1  |     response = await self.dispatch_func(request, call_next)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/openedai.py", line 126, in log_requests
server-1  |     response = await call_next(request)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 165, in call_next
server-1  |     raise app_exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 151, in coro
server-1  |     await self.app(scope, receive_or_disconnect, send_no_error)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
server-1  |     await self.app(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
server-1  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     await app(scope, receive, sender)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
server-1  |     await self.middleware_stack(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
server-1  |     await route.handle(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
server-1  |     await self.app(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
server-1  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     await app(scope, receive, sender)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
server-1  |     response = await func(request)
server-1  |                ^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
server-1  |     raw_response = await run_endpoint_function(
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
server-1  |     return await dependant.call(**values)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision.py", line 87, in vision_chat_completions
server-1  |     text = await vision_qna.chat_with_images(request)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 116, in chat_with_images
server-1  |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 116, in <listcomp>
server-1  |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/backend/moondream2.py", line 30, in stream_chat_with_images
server-1  |     images, prompt = await prompt_from_messages(request.messages, self.format)
server-1  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 704, in prompt_from_messages
server-1  |     return await known_formats[format](messages)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 229, in phi15_prompt_from_messages
server-1  |     img_data = await url_handler(c.image_url.url)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 178, in url_to_image
server-1  |     img_data = DataURI(img_url).data
server-1  |                ^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 85, in __new__
server-1  |     uri._parse  # Trigger any ValueErrors on instantiation.
server-1  |     ^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 148, in _parse
server-1  |     data = decode64(_data)
server-1  |            ^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/base64.py", line 88, in b64decode
server-1  |     return binascii.a2b_base64(s, strict_mode=validate)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  | binascii.Error: Incorrect padding
server-1  | INFO:     172.21.0.1:39048 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
server-1  | ERROR:    Exception in ASGI application

matatonic commented 1 month ago

It does seem to have incorrect padding, as the error indicates. The base64 payload's length is not a multiple of 4, and base64.b64decode() succeeds if you add an '=' to the end, which supplies the missing padding.

import base64
base64.b64decode('iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII=')

If you're sure the original encoding is valid, please share the original image so we can compare the encoding method.
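
For anyone who hits the same error, here is a minimal sketch of how the padding problem can be reproduced and worked around on the client side. The helper name `pad_base64` is hypothetical and is not part of openedai-vision or the `datauri` package; the payload is the one from the request above. It assumes the only defect is missing trailing '=' characters.

```python
import base64
import binascii

# Base64 payload copied from the request in this issue (no trailing '=').
PAYLOAD = (
    "iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5"
    "AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
)


def pad_base64(data: str) -> str:
    """Hypothetical helper: append '=' until the length is a multiple of 4."""
    return data + "=" * (-len(data) % 4)


# Decoding the unpadded payload fails, as in the traceback above.
try:
    base64.b64decode(PAYLOAD)
    strict_ok = True
except binascii.Error as exc:
    strict_ok = False
    print("unpadded decode failed:", exc)

# After padding, the same payload decodes to PNG bytes.
image_bytes = base64.b64decode(pad_base64(PAYLOAD))
print(image_bytes[:8])
```

Padding this way only helps when the length is 2 or 3 modulo 4. A length of 1 modulo 4 cannot be valid base64, so decoding would still raise an error.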

saket424 commented 1 month ago

Thanks, the trailing = fixed the issue. openedai-vision now works with this small base64-encoded image in a Node-RED flow. Thank you.

{"id":"chatcmpl-1721141417","object":"chat.completion","created":1721141417,"model":"moondream2","system_fingerprint":"fp_111111111","choices":[{"index":0,"message":{"role":"assistant","content":" The image features a white background with four distinct gray squares. The squares are arranged in a square grid pattern, creating a visually appealing and symmetrical design. The gray squares are slightly darker than the white background, adding depth and contrast to the overall composition. The image does not contain any text or additional objects, and the relative positions of the squares remain constant throughout the image."},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}