langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

LLM video understanding #10720

Open Dongnc1017 opened 3 hours ago

Dongnc1017 commented 3 hours ago


1. Is this request related to a challenge you're experiencing? Tell me about your story.

Version 0.11.1 adds support for LLM video understanding. According to #9828, Qwen requires users to submit a work order application and only supports the URL send mode. How do I do this, and where do I submit the work order? Also, which Qwen model with video understanding support was used for testing?

2. Additional context or comments

No response

3. Can you help us with this feature?

dosubot[bot] commented 2 hours ago

I found two similar discussions that might be relevant to your question about LLM video understanding in Dify:

  1. How to use/process video files in workflow? - This discussion is still open and might provide insights into processing video files within Dify workflows.

  2. Not support for mp4? - This is another open discussion that addresses issues related to video file support, specifically MP4 format.

Regarding your specific questions about submitting a work order application for qwen mode and the model used for testing qwen with video understanding, these discussions might not directly address them. You might need to check the official documentation or reach out to the maintainers for detailed guidance on submitting work orders and the specific models used.


hjlarry commented 1 hour ago

the doc is here