[Bug] Gemini and Claude models return an error when using vision in client-side request mode

Kevin7Qi commented 2 months ago

📦 Environment

Docker

📌 Version

1.15.31 (lobe-chat-database)

💻 Operating System

macOS

🌐 Browser

Chrome

🐛 Bug Description

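When calling a Gemini/Claude model in client-side request mode, uploading an image and then sending a message returns an error. With client-side request mode turned off, the same request returns a normal result.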

📷 Recurrence Steps

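Call a Gemini/Claude model in client-side request mode.
Include a task in the request that needs the model's vision capability, such as image description or image question answering.
The model returns an error message.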

🚦 Expected Behavior

Gemini/Claude models can use their vision capability in client-side request mode.

📝 Additional Information

None

lobehubbot commented 2 months ago

👀 @Kevin7Qi

Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

arvinxx commented 2 months ago

Switch to server mode for now.

lobehubbot commented 2 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

Let’s switch to server mode first.

dosubot[bot] commented 2 months ago

Hey @Kevin7Qi! I'm here to help you with any bugs, questions, or contributions. Let's work together to tackle this issue!

I found a similar unsolved discussion regarding the Gemini model not being able to receive images ^[1].

For the issue you're experiencing with the visual functionality in client request mode for the Gemini and Claude models, here are some key points and configurations that might help:

Error Handling Mechanisms:
- The LobeAnthropicAI class in the Lobe Chat framework has comprehensive error handling mechanisms. It throws specific errors like InvalidAnthropicAPIKey for authentication errors and ProviderBizError for service overloads. It also handles other HTTP status codes appropriately, such as throwing a LocationNotSupportError for 403 errors and desensitizing custom base URLs in error messages ^[2].
Dependencies and Configurations:
- To enable the visual functionality in the client request mode, you need to configure the "Support Visual Recognition" setting under the "Model Configuration" section. This setting will enable the configuration of image uploads in the application, but the actual support for visual recognition depends on the model itself.
- Additionally, you need to enable the "Use Client-Side Fetching Mode" setting to initiate the session request directly from the browser, which can improve response speed ^[3]^[4]^[5].
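The error handling described in the first point can be pictured as a mapping from HTTP status codes to error types. The following is a minimal sketch and not the actual LobeAnthropicAI code: the function name `classifyAnthropicError` is hypothetical, the use of 401 for authentication failures is an assumption, and treating every other status as ProviderBizError is a simplification.

```typescript
// Hypothetical sketch: the error type names come from the description above,
// but this function is NOT taken from the lobe-chat source.
type AnthropicErrorType =
  | 'InvalidAnthropicAPIKey'
  | 'LocationNotSupportError'
  | 'ProviderBizError';

function classifyAnthropicError(status: number): AnthropicErrorType {
  switch (status) {
    case 401:
      // Assumption: authentication failures arrive as HTTP 401.
      return 'InvalidAnthropicAPIKey';
    case 403:
      // 403 is described above as a location-not-supported error.
      return 'LocationNotSupportError';
    default:
      // Simplification: overloads and all other failures share one type.
      return 'ProviderBizError';
  }
}
```

Checking which of these types appears in the failing client-side response may help narrow down whether the request is rejected by the provider or fails earlier in the browser.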

Here are the relevant configurations:

{
  "customModelCards": {
    "modelConfig": {
      "vision": {
        "extra": "This configuration will only enable image upload capabilities within the application. Whether recognition is supported depends entirely on the model itself; please test the model's visual recognition capabilities on your own.",
        "title": "Support Visual Recognition"
      }
    }
  }
}

{
  "fetchOnClient": {
    "desc": "Client-side fetching mode initiates session requests directly from the browser, improving response speed.",
    "title": "Use Client-Side Fetching Mode"
  }
}

Please ensure these configurations are correctly set up in your environment. If the issue persists, it might be related to specific limitations or bugs in the models' visual functionality when using client request mode.
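For reference when inspecting the failing request in the browser's network panel, this is roughly the shape an inline image takes in each provider's request body. It is a sketch under assumptions: it assumes the uploaded image is available as a base64 data URL, the helper names are hypothetical and not part of lobe-chat, and nothing here establishes that the payload shape is the cause of the error in this issue.

```typescript
// Hypothetical helpers for illustration only; not lobe-chat code.
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Split a base64 data URL into its MIME type and payload.
// Returns null for anything that is not a base64 data URL (e.g. an https URL).
function parseDataUrl(url: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(url);
  if (!match) return null;
  return { mimeType: match[1], base64: match[2] };
}

// Image content block in the shape used by the Anthropic Messages API.
function toAnthropicImageBlock(img: ParsedDataUrl) {
  return {
    type: 'image' as const,
    source: {
      type: 'base64' as const,
      media_type: img.mimeType,
      data: img.base64,
    },
  };
}

// Inline image part in the shape used by the Google Generative AI JS SDK.
function toGeminiImagePart(img: ParsedDataUrl) {
  return { inlineData: { mimeType: img.mimeType, data: img.base64 } };
}
```

If the client-side request carries a plain image URL where the server-side request carries base64 data, that difference would be worth reporting in this issue.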

To continue talking to Dosu, mention @dosu.
