langgenius / dify


The chat api supports base64 image files #9430

Open deific opened 1 week ago

deific commented 1 week ago

Self Checks

1. Is this request related to a challenge you're experiencing? Tell me about your story.

When chatting through the API, files can currently be passed only via the remote_url and local_file methods, and both have limitations in my scenario. I would like to be able to send image content in base64 format instead.
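For context, the two existing methods can be sketched as request payloads. This is a minimal sketch, not taken from this thread: the field names (`type`, `transfer_method`, `url`, `upload_file_id`) and the top-level payload shape are assumptions based on my reading of the Dify chat API and should be checked against the official API reference. The helper function names are hypothetical.

```python
import json


def remote_url_file(url: str) -> dict:
    """File entry for the remote_url method (field names are assumptions)."""
    return {"type": "image", "transfer_method": "remote_url", "url": url}


def local_file(upload_file_id: str) -> dict:
    """File entry for the local_file method; the id is assumed to come
    from a prior call to Dify's file upload service."""
    return {
        "type": "image",
        "transfer_method": "local_file",
        "upload_file_id": upload_file_id,
    }


def chat_payload(query: str, user: str, files: list) -> dict:
    """Assemble a chat request body (shape is an assumption)."""
    return {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",
        "user": user,
        "files": files,
    }


if __name__ == "__main__":
    payload = chat_payload(
        "What is in this image?",
        "user-123",
        [remote_url_file("https://example.com/cat.png")],
    )
    print(json.dumps(payload, indent=2))
```

Either way, the image has to live somewhere other than the request itself: at a URL Dify can fetch, or in Dify's own file storage.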

2. Additional context or comments

No response

3. Can you help us with this feature?

crazywoola commented 1 week ago

The two existing upload methods cover most scenarios. Opening up base64 file upload would cause some unnecessary problems; for example, you would run into variable length limits.

I would like to know more about why you need this. Can you give me an example or use case that describes this feature request?

deific commented 6 days ago

The remote_url method requires uploading the image to a publicly accessible location. In scenarios where user privacy must be protected, making the image publicly accessible may be a problem. The local_file method relies on Dify's file upload service. The image is often already part of the user's own data assets, which have to be stored and served on our side anyway, so uploading it to Dify again just to chat makes the integration unfriendly. The copy stored in Dify is also likely to be useless in most scenarios once the chat ends. The base64 method is simpler, and the image would be stored with the chat session. Size is of course an issue, so this may be best suited to small or compressed images.
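To make the request concrete, here is a rough sketch of what a base64 file entry could look like on the client side. Everything about it is hypothetical: Dify does not offer a `base64` transfer method as of this issue, and the `data` field, the data-URL encoding, and the 512 KiB limit are illustrative assumptions only. The size guard reflects the length concern raised above, since base64 inflates the payload by roughly a third.

```python
import base64

# Hypothetical client-side limit; not a Dify setting.
MAX_IMAGE_BYTES = 512 * 1024


def encoded_length(raw_length: int) -> int:
    """Length of the padded base64 encoding of raw_length bytes."""
    return 4 * ((raw_length + 2) // 3)


def base64_image_file(
    raw: bytes,
    mime_type: str = "image/png",
    max_bytes: int = MAX_IMAGE_BYTES,
) -> dict:
    """Build a HYPOTHETICAL file entry carrying the image inline.

    The "base64" transfer method and the "data" field do not exist in
    Dify; they only illustrate the proposal in this issue.
    """
    if len(raw) > max_bytes:
        raise ValueError(
            f"image is {len(raw)} bytes; limit for inline upload is {max_bytes}"
        )
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "image",
        "transfer_method": "base64",  # hypothetical value
        "data": f"data:{mime_type};base64,{encoded}",  # hypothetical field
    }
```

Rejecting oversized images before encoding keeps the request body bounded, which is why the approach fits small or compressed images best.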