lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), a knowledge base (file upload / knowledge management / RAG), multi-modal features (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT/Claude application.
https://chat-preview.lobehub.com

[Request] Switch Between URL and Base64 Modes for Image Input in Server Mode #3881

Closed fashioncj closed 1 week ago

fashioncj commented 1 week ago

🥰 Feature Description

In server mode, I need a configuration option that switches image input between URL and Base64 modes. This is crucial in private deployments where the S3 storage (MinIO) cannot be reached from the external network.

Use Case

In private deployments, particularly when using S3-compatible storage solutions like minio, images stored in the bucket are often not accessible from the external network. This limitation hinders the ability to use URL-based image inputs in language models such as OpenAI's API. By providing a configuration option to use Base64 encoded images instead of URLs, we can ensure seamless operation in both public and private deployment environments.

Benefits

🧐 Proposed Solution

  1. Introduce a configuration option in the server settings to toggle between URL and Base64 modes for image input.
  2. When the Base64 mode is enabled, the server should accept Base64 encoded images and decode them appropriately before processing.
  3. Ensure that the switch between modes is seamless and does not affect the existing functionality for URL-based inputs.
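The toggle described in the steps above could be sketched roughly as follows, in Python to match the other snippets in this thread. This is an illustration under stated assumptions, not LobeChat's implementation: `build_image_part` and its `fetch` callback are hypothetical names, and the output shape follows the `image_url` content part used by OpenAI-style chat APIs, where the URL may also be a `data:` URL carrying Base64 content.

```python
import base64
import mimetypes
from typing import Callable


def build_image_part(url: str, use_base64: bool, fetch: Callable[[str], bytes]) -> dict:
    """Return an OpenAI-style image content part for one image.

    `fetch` is a hypothetical callback that reads the image bytes from
    storage the server can reach (for example, an internal MinIO endpoint).
    """
    if not use_base64:
        # URL mode: the model provider downloads the image itself,
        # so the URL must be reachable from the public internet.
        return {"type": "image_url", "image_url": {"url": url}}

    # Base64 mode: the server reads the bytes and inlines them as a data URL.
    raw = fetch(url)
    mime, _ = mimetypes.guess_type(url)
    mime = mime or "application/octet-stream"  # fallback when the extension is unknown
    encoded = base64.b64encode(raw).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}
```

In URL mode nothing changes for existing deployments; in Base64 mode only the server needs network access to the bucket, which is the situation the private MinIO setup is in.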

📝 Additional Information

This feature would greatly enhance the usability of the server mode in restricted network environments and facilitate easier integration with private storage solutions.

For reference, please see the related PR: PR #3370.

Thank you for considering this feature request. I look forward to your feedback and am available for any further clarification or discussion.

lobehubbot commented 1 week ago

👀 @fashioncj

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

dosubot[bot] commented 1 week ago

Hey @fashioncj, I'm here to help you with bug fixes, questions, and becoming a contributor! Let's tackle this together. 🤖

You can add a configuration option in server mode to switch between URL and Base64 modes for image input to address accessibility issues with S3-compatible storage solutions like Minio in private deployments.

Here is an example of how you might implement this configuration option:

  1. Add a new environment variable to control the mode:

```shell
# .env file
# Set to URL (default) or BASE64
IMAGE_INPUT_MODE=URL
```

  2. Modify the server code to handle the new configuration:

```python
import base64
import os
from io import BytesIO

import requests  # third-party
from PIL import Image  # third-party (Pillow)

# Load the configuration from environment variables; default to URL mode
image_input_mode = os.getenv('IMAGE_INPUT_MODE', 'URL').upper()


def handle_image_input(image_data):
    """Return a PIL image from either a Base64 string or a URL."""
    if image_input_mode == 'BASE64':
        return decode_base64_image(image_data)
    return download_image_from_url(image_data)


def decode_base64_image(base64_string):
    """Decode a Base64 string (optionally a data URL) into a PIL image."""
    # Drop a leading "data:image/...;base64," prefix if present
    if base64_string.startswith('data:'):
        base64_string = base64_string.partition(',')[2]
    image_data = base64.b64decode(base64_string)
    return Image.open(BytesIO(image_data))


def download_image_from_url(url):
    """Download an image over HTTP and open it as a PIL image."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error page
    return Image.open(BytesIO(response.content))
```

  3. Update the documentation to include the new configuration option:

### `IMAGE_INPUT_MODE`

This environment variable controls the mode of image input. It can be set to either `URL` or `BASE64`. The default value is `URL`.

- `URL`: Images are provided as URLs.
- `BASE64`: Images are provided as Base64 encoded strings.

Example:

```shell
IMAGE_INPUT_MODE=BASE64
```

Additionally, ensure that your S3 configuration is properly set up in your `.env` file, especially if you are using Minio in a private deployment. Here is an example configuration for Minio:

```example
# S3-related: environment variables required for storing unstructured data (files, images, etc.)
# MinIO is used as the example here
# Required. S3 Access Key ID. For MinIO, it is invalid until created manually in the MinIO UI
S3_ACCESS_KEY_ID=YOUR_S3_ACCESS_KEY_ID
# Required. S3 Secret Access Key. For MinIO, it is invalid until created manually in the MinIO UI
S3_SECRET_ACCESS_KEY=YOUR_S3_SECRET_ACCESS_KEY
# Required. S3 endpoint, used by the server/client to connect to the S3 API
S3_ENDPOINT=https://lobe-s3-api.example.com
# Required. S3 bucket. It is invalid until created manually in the MinIO UI
S3_BUCKET=lobe
# Required. S3 public domain, used by the client to access unstructured data over a public connection
S3_PUBLIC_DOMAIN=https://lobe-s3-api.example.com
# Optional. S3 Enable Path Style
# For mainstream S3 cloud providers, 0 is usually fine; for self-hosted MinIO, set it to 1
# See: https://lobehub.com/zh/docs/self-hosting/advanced/s3#s-3-enable-path-style
S3_ENABLE_PATH_STYLE=1
```

By adding this configuration option and ensuring your S3 settings are correctly configured, you can switch between URL and Base64 modes for image input, which can help address accessibility issues with S3-compatible storage solutions like Minio in private deployments [1][2].


arvinxx commented 1 week ago

I think this can be an environment variable rather than a UI setting? With a local deployment, none of the images are reachable from the external network.

fashioncj commented 1 week ago

> I think this can be an environment variable rather than a UI setting? With a local deployment, none of the images are reachable from the external network.

Yes, an environment variable for deployment will suffice. This state is indeed tied to the deployment environment.

lobehubbot commented 1 week ago

✅ @fashioncj

This issue is closed. If you have any questions, you can comment and reply.

arvinxx commented 1 week ago

@fashioncj try with `LLM_VISION_IMAGE_USE_BASE64=1`
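A server could read the `LLM_VISION_IMAGE_USE_BASE64` flag roughly as follows. This is only a sketch, not LobeChat's code: the thread shows just the value `1`, so treating `1` as the sole enabling value is an assumption, and the helper name `vision_images_use_base64` is hypothetical.

```python
import os
from typing import Mapping, Optional


def vision_images_use_base64(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the deployment asks for Base64 image input.

    Assumption: only the value "1" enables the mode; anything else,
    including an unset variable, keeps the default URL mode.
    """
    env = os.environ if env is None else env
    return env.get("LLM_VISION_IMAGE_USE_BASE64", "").strip() == "1"
```

Reading the flag once at startup fits the point made earlier in the thread that this setting is tied to the deployment environment rather than to a per-user UI option.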

lobehubbot commented 1 week ago

:tada: This issue has been resolved in version 1.16.3 :tada:

Your semantic-release bot :package::rocket: