microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] Can't embed images in prompts with LLMs like gpt-4o #3668

Closed hiroki0525 closed 1 month ago

hiroki0525 commented 3 months ago

Describe the bug

Images can't be embedded in prompts for LLMs such as gpt-4o, so the model can't answer questions about them.

    system:
    You are a helpful assistant.

    user:
    what is this?
    ![image]({{image}})
[Screenshot 2024-08-16 16 26 23]

The model replies:

It looks like the image link you provided is broken or invalid. Therefore, I'm unable to see the image you are referring to. Could you please describe the image or try uploading it again? This way, I can assist you better!

How To Reproduce the bug

  1. Create a Jinja2 template file containing the prompt above.
  2. Reference it from flow.dag.yaml.
  3. Run the pf flow test command.

Expected behavior

The request should be sent in the format the OpenAI API expects, as shown below.

    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        }
      ]
    }
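
As a rough illustration of the message shape above, here is a minimal Python sketch that builds a user message from a local image by encoding it as a base64 data URL. The helper name `build_image_message` is hypothetical and is not part of promptflow or the OpenAI SDK; it only shows the structure the request is expected to have.

```python
import base64


def build_image_message(text, image_bytes, mime_type="image/png"):
    """Return a user message whose content is a list of text and image parts.

    Hypothetical helper for illustration only: the image is embedded as a
    base64 data URL instead of a remote https URL.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

With this shape, `content` is a list of typed parts rather than a single string, which is the difference from the request captured in the trace later in this report.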

Screenshots

Running Information (please complete the following information):

Additional context

    $schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
    inputs:
      image:
        type: object
        default:
          data:image/png;path: ./test.png
    outputs:
      answer:
        type: string
        reference: ${question_image.output}
    nodes:
    - name: question_image
      type: llm
      source:
        type: code
        path: question_image.jinja2
      inputs:
        model: gpt-4
        image: ${inputs.image}
      connection: open_ai_connection
      api: chat
[Screenshot 2024-08-16 21 41 07]
Raw JSON

```json
{"name":"openai_chat","context":{"trace_id":"0xaaca83413b61ded145cfe06c9ed440bb","span_id":"0xdbece0ef4cbd203c","trace_state":""},"kind":"1","parent_id":"0xd7e01c0042b03e4e","start_time":"2024-08-16T02:15:55.274258","end_time":"2024-08-16T02:15:56.906260","status":{"status_code":"Ok","description":""},"attributes":{"framework":"promptflow","span_type":"LLM","function":"openai.resources.chat.completions.Completions.create","node_name":"question_image","execution_target":"dag","line_run_id":"edecd614-7bc0-4c0c-ab19-49e5e7358320","inputs":"{\n \"model\": \"gpt-4o\",\n \"messages\": [\n {\n \"role\": \"system\",\n \"content\": \"You are a helpful assistant.\"\n },\n {\n \"role\": \"user\",\n \"content\": \"what is this?\\n![image](Image(fb3d2be5))\"\n }\n ],\n \"temperature\": 1.0,\n \"top_p\": 1.0,\n \"n\": 1,\n \"stream\": false,\n \"max_tokens\": null,\n \"presence_penalty\": 0.0,\n \"frequency_penalty\": 0.0,\n \"user\": \"\"\n}","llm.response.model":"gpt-4o-2024-05-13","llm.generated_message":"{\n \"content\": \"It looks like the image link you provided is broken or invalid. Therefore, I'm unable to see the image you are referring to. Could you please describe the image or try uploading it again? This way, I can assist you better!\",\n \"refusal\": null,\n \"role\": \"assistant\",\n \"function_call\": null,\n \"tool_calls\": null\n}","output":"{\n \"id\": \"chatcmpl-9wgnLZh6QXUufosCrMUm49MjQB8yd\",\n \"choices\": [\n {\n \"finish_reason\": \"stop\",\n \"index\": 0,\n \"logprobs\": null,\n \"message\": {\n \"content\": \"It looks like the image link you provided is broken or invalid. Therefore, I'm unable to see the image you are referring to. Could you please describe the image or try uploading it again? This way, I can assist you better!\",\n \"refusal\": null,\n \"role\": \"assistant\",\n \"function_call\": null,\n \"tool_calls\": null\n }\n }\n ],\n \"created\": 1723774555,\n \"model\": \"gpt-4o-2024-05-13\",\n \"object\": \"chat.completion\",\n \"service_tier\": null,\n \"system_fingerprint\": \"fp_3aa7262c27\",\n \"usage\": {\n \"completion_tokens\": 47,\n \"prompt_tokens\": 33,\n \"total_tokens\": 80\n }\n}","__computed__.cumulative_token_count.completion":"47","__computed__.cumulative_token_count.prompt":"33","__computed__.cumulative_token_count.total":"80","llm.usage.completion_tokens":"47","llm.usage.prompt_tokens":"33","llm.usage.total_tokens":"80"},"events":[{"name":"promptflow.function.inputs","timestamp":"2024-08-16T02:15:55.274591","attributes":{"payload":"{\n \"model\": \"gpt-4o\",\n \"messages\": [\n {\n \"role\": \"system\",\n \"content\": \"You are a helpful assistant.\"\n },\n {\n \"role\": \"user\",\n \"content\": \"what is this?\\n![image](Image(fb3d2be5))\"\n }\n ],\n \"temperature\": 1.0,\n \"top_p\": 1.0,\n \"n\": 1,\n \"stream\": false,\n \"max_tokens\": null,\n \"presence_penalty\": 0.0,\n \"frequency_penalty\": 0.0,\n \"user\": \"\"\n}"}},{"name":"promptflow.llm.generated_message","timestamp":"2024-08-16T02:15:56.897321","attributes":{"payload":"{\n \"content\": \"It looks like the image link you provided is broken or invalid. Therefore, I'm unable to see the image you are referring to. Could you please describe the image or try uploading it again? This way, I can assist you better!\",\n \"refusal\": null,\n \"role\": \"assistant\",\n \"function_call\": null,\n \"tool_calls\": null\n}"}},{"name":"promptflow.function.output","timestamp":"2024-08-16T02:15:56.906064","attributes":{"payload":"{\n \"id\": \"chatcmpl-9wgnLZh6QXUufosCrMUm49MjQB8yd\",\n \"choices\": [\n {\n \"finish_reason\": \"stop\",\n \"index\": 0,\n \"logprobs\": null,\n \"message\": {\n \"content\": \"It looks like the image link you provided is broken or invalid. Therefore, I'm unable to see the image you are referring to. Could you please describe the image or try uploading it again? This way, I can assist you better!\",\n \"refusal\": null,\n \"role\": \"assistant\",\n \"function_call\": null,\n \"tool_calls\": null\n }\n }\n ],\n \"created\": 1723774555,\n \"model\": \"gpt-4o-2024-05-13\",\n \"object\": \"chat.completion\",\n \"service_tier\": null,\n \"system_fingerprint\": \"fp_3aa7262c27\",\n \"usage\": {\n \"completion_tokens\": 47,\n \"prompt_tokens\": 33,\n \"total_tokens\": 80\n }\n}"}}],"links":[],"resource":{"attributes":{"service.name":"promptflow","collection":"read_foods_from_pdf_flow"},"schema_url":""},"external_event_data_uris":["9f57ddb1-a6aa-4fd1-b1d7-88f7a50771a2","4c2f11d8-5aa7-4e3f-a116-5e9708981732","9d9133f7-9518-47da-9c69-e2ccbdf52c17"]}
```
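
In the trace above, the user message content is a plain string that contains the placeholder `Image(fb3d2be5)` rather than a list with an `image_url` part. The following Python sketch shows one way to check an exported span for this symptom. It assumes the span has the same layout as the raw JSON above (a JSON-encoded request under `attributes.inputs`); the function name `find_unexpanded_images` and the placeholder pattern are hypothetical and only inferred from this single trace.

```python
import json
import re

# Assumed placeholder format, inferred from "Image(fb3d2be5)" in the trace.
PLACEHOLDER = re.compile(r"Image\([0-9a-f]+\)")


def find_unexpanded_images(span):
    """Return image placeholders found in plain-string message contents.

    `span` is a dict parsed from the raw trace JSON. The request payload is
    itself a JSON string stored under attributes.inputs, so it is decoded
    a second time here.
    """
    request = json.loads(span["attributes"]["inputs"])
    found = []
    for message in request.get("messages", []):
        content = message.get("content")
        # A multimodal message would carry a list of parts, not a string.
        if isinstance(content, str):
            found.extend(PLACEHOLDER.findall(content))
    return found
```

A non-empty result suggests the image was rendered into the prompt as text and never converted into an image part before the request was sent.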
github-actions[bot] commented 1 month ago

Hi, we're sending this friendly reminder because we haven't heard back from you in 30 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 7 days of this comment, the issue will be automatically closed. Thank you!