[Request] Support embedding images at arbitrary positions in a conversation

BrandonStudio commented 6 months ago

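🥰 Description of requirements

Anthropic recommends introducing each image with a label such as "Image 1:" or "Image 2:", but the current version inserts all images at the same position.

🧐 Solution

Change the input box to a rich-text editor so that images can be embedded inline. When sending the request, split the message at each image, as in the following example: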

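import anthropic

# Assumes ANTHROPIC_API_KEY is set in the environment, and that image1_data /
# image2_data hold base64-encoded images with matching media types.
client = anthropic.Anthropic()

message = client.messages.create(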
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Image 1:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Image 2:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {
                    "type": "text",
                    "text": "How are these images different?"
                }
            ],
        }
    ],
)

📝 Supplementary information

No response

lobehubbot commented 6 months ago

👀 @BrandonStudio

Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

lobehubbot commented 6 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

🥰 Description of requirements

Anthropic recommends introducing each image with a label such as "Image 1:" or "Image 2:", but the current version inserts all images at the same position.

🧐 Solution

Change the input box to rich text and allow image embedding. When sending a request, use images as delimiters. Refer to the following example:

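import anthropic

# Assumes ANTHROPIC_API_KEY is set in the environment, and that image1_data /
# image2_data hold base64-encoded images with matching media types.
client = anthropic.Anthropic()

message = client.messages.create(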
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Image 1:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Image 2:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {
                    "type": "text",
                    "text": "How are these images different?"
                }
            ],
        }
    ],
)
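
To make the "split at each image" idea concrete, here is a minimal sketch of how an editor's interleaved text and image segments might be turned into the content list shown above. The names `TextSegment`, `ImageSegment`, and `build_content` are hypothetical and not part of lobe-chat or the Anthropic SDK; the sketch only assumes the block shapes used in the example request.

```python
import base64
from dataclasses import dataclass
from typing import Union


@dataclass
class TextSegment:
    """A run of plain text from the (hypothetical) rich-text input."""
    text: str


@dataclass
class ImageSegment:
    """An inline image from the (hypothetical) rich-text input."""
    data: bytes       # raw image bytes
    media_type: str   # e.g. "image/png"


Segment = Union[TextSegment, ImageSegment]


def build_content(segments: list[Segment]) -> list[dict]:
    """Convert interleaved segments into content blocks, labelling each image."""
    content: list[dict] = []
    image_count = 0
    for seg in segments:
        if isinstance(seg, TextSegment):
            # Skip whitespace-only runs between adjacent images.
            if seg.text.strip():
                content.append({"type": "text", "text": seg.text})
        else:
            image_count += 1
            # Label the image first, then emit the image block itself.
            content.append({"type": "text", "text": f"Image {image_count}:"})
            content.append({
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": seg.media_type,
                    "data": base64.b64encode(seg.data).decode("ascii"),
                },
            })
    return content
```

The returned list could then be passed as the `content` of a single user message. Numbering the images in document order keeps each label adjacent to the image it describes, which is the behaviour the request asks for.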

📝 Supplementary information

No response

arvinxx commented 6 months ago

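This will have to wait until we rework the input into a rich-text solution, which is a long way down the road. We looked into it before and found the cost rather high, so it was shelved for now: https://github.com/lobehub/lobe-chat/discussions/427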
