lmstudio-ai / lmstudio-bug-tracker

Bug tracking for the LM Studio desktop application

Issue with multi-modal requests: inconsistent behavior #143

Open nijavidi opened 2 weeks ago

nijavidi commented 2 weeks ago

I hope you're well. I'd really appreciate your insights on this.

I have a Python function that sends images to LM Studio, and it works fine when it is called from a small tester script. But the same function fails when I call it from a larger application (the images are almost the same in both cases). In the lms logs, the working and non-working requests look the same. Please see the detailed logs attached: image_analyzer_agent-NOT-WORKING.log and image_analyzer_agent-WORKING.log
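For context, here is a minimal sketch of the kind of function involved (the function name, server URL, prompt text, and image path are illustrative assumptions on my part; only the endpoint, model identifier, and payload shape come from the logs below):

```python
import base64
import requests

def analyze_image(image_path: str, base_url: str = "http://localhost:1234/v1") -> str:
    """Send a local image to LM Studio's OpenAI-compatible chat endpoint."""
    # Encode the image file as a base64 data URL, as expected by the API.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    payload = {
        "model": "lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What text is in this image? Summarize it."},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
        "temperature": 0.7,
        "max_tokens": 500,
        "stream": False,
    }
    response = requests.post(f"{base_url}/chat/completions", json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```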

The only difference I can see in the not-working scenario is this line:

2024-10-06 09:31:54 [DEBUG] Failed to find image for token at index 22

Do you know what could cause this?
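One way I could try to narrow this down (a debugging sketch of my own, not taken from the logs; the helper name is hypothetical) is to fingerprint the outgoing base64 payload at both call sites and confirm that the tester script and the larger application really send identical bytes:

```python
import hashlib

def fingerprint_payload(image_b64: str) -> str:
    """Print a short fingerprint of an outgoing base64 image payload so that
    requests from the tester script and the larger application can be compared."""
    digest = hashlib.sha256(image_b64.encode("ascii")).hexdigest()[:16]
    print(f"payload length={len(image_b64)} sha256={digest}")
    return digest
```

If the fingerprints differ, the larger application is altering the image data before it reaches LM Studio (for example via re-encoding, truncation, or stray whitespace in the base64 string).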

NOT WORKING:

```
2024-10-06 09:31:51 [INFO] Received POST request to /v1/chat/completions with body: {
  "model": "lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What text is in this image? Summarize it and Do NOT include information about advertisement(s)"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABN0AAASWCAIAAACRr01zAAAAAXNSR0IArs4c6QAAIABJREFUeJzs3X... (truncated in these logs)"
          }
        }
      ]
    }
  ],
  "temperature": 0.7,
  "max_tokens": 500,
  "stream": false
}
2024-10-06 09:31:51 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 1 messages.
2024-10-06 09:31:51 [DEBUG] sampling: repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
  top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.700
  mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
  sampling order: CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 32768, n_batch = 512, n_predict = 500, n_keep = 45
2024-10-06 09:31:54 [DEBUG] Failed to find image for token at index 22
```

LMS Logs:

WORKING:

```
timestamp: 10/6/2024, 9:52:20 AM
type: llm.prediction.input
modelIdentifier: lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf
modelPath: lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf
input: "<|im_start|>user
What text is in this image? Summarize it and Do NOT include information about advertisement(s)[img-1]<|im_end|>
<|im_start|>assistant
```

NOT WORKING:

```
timestamp: 10/6/2024, 9:49:09 AM
type: llm.prediction.input
modelIdentifier: lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf
modelPath: lmstudio-community/MiniCPM-V-2_6-GGUF/MiniCPM-V-2_6-Q8_0.gguf
input: "<|im_start|>user
What text is in this image? Summarize it and Do NOT include information about advertisement(s)[img-1]<|im_end|>
<|im_start|>assistant
```

yagil commented 2 weeks ago

Thanks @nijavidi. Are you able to share the image that causes the issue, assuming it's not sensitive and is SFW?

nijavidi commented 2 weeks ago

Thank you for reviewing this.

I don't think this is a problem with LM Studio after all; I think I found a workaround.

Background:

Workaround: