langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

feat: support LLM process document file #10966

Closed hjlarry closed 5 hours ago

hjlarry commented 13 hours ago

Summary

Many LLMs (Gemini, Sonnet, ...) can now process documents directly and base the user's chat context on them. This PR aims to support that feature in Dify agent apps. For the chatflow app, maybe this PR can resolve.

ChangeList

Backend

Frontend

remaining issues

Screenshots

image

Checklist

[!IMPORTANT]
Please review the checklist below before submitting your pull request.

laipz8200 commented 9 hours ago

Thank you for this awesome contribution! Some changes in #10679 are still in progress; I'll review this PR after #10679 is merged.

laipz8200 commented 9 hours ago

The vision settings are strange: Resolution only affects image files, while Upload Method and Upload Limit affect all file types.

I think this resolution config should be removed in the future.

laipz8200 commented 7 hours ago

Hi @hjlarry! #10679 is merged, could you please sync the code with the main branch?

hjlarry commented 7 hours ago

> Hi @hjlarry! #10679 is merged, could you please sync the code with the main branch?

Done :)

laipz8200 commented 6 hours ago

https://github.com/user-attachments/assets/e4c34422-e458-4ddb-9e38-92cd72e1f350

hjlarry commented 6 hours ago

Screen.Recording.2024-11-22.at.5.56.05.PM.mov

It seems the icon was overwritten by the merge; please try again.