Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.
1. Is this request related to a challenge you're experiencing? Tell me about your story.
In my daily work, I often need to use larger models to synthesize a large amount of data to enrich my dataset. This dataset is then used for fine-tuning smaller models or testing RAG capabilities.
Currently, Dify provides a batch execution feature, but there does not seem to be a management page that lets me stop batch tasks at any time, view historical tasks, and manage the generated results. I currently achieve this with my own scripts, but I'm looking for a more streamlined, end-to-end experience.
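For reference, my current workaround looks roughly like the minimal sketch below. It assumes a published workflow app called through the standard `/workflows/run` endpoint in blocking mode; the base URL, API key, and input shape are placeholders, and there is no dashboard, stop control, or history, which is exactly the gap this request is about.

```python
import time
import requests

API_BASE = "https://api.dify.ai/v1"   # or a self-hosted Dify base URL
APP_API_KEY = "app-xxx"               # placeholder: workflow app API key

def run_workflow(inputs: dict, user: str = "batch-script") -> dict:
    """Run the workflow once in blocking mode and return the result payload."""
    resp = requests.post(
        f"{API_BASE}/workflows/run",
        headers={"Authorization": f"Bearer {APP_API_KEY}"},
        json={"inputs": inputs, "response_mode": "blocking", "user": user},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()

def run_batch(rows: list[dict]) -> list[dict]:
    """Naive sequential batch: no stop button, no history, no live dashboard."""
    results = []
    for i, row in enumerate(rows):
        results.append(run_workflow(row))
        time.sleep(1)  # crude pacing to stay under provider rate limits
        print(f"finished {i + 1}/{len(rows)}")
    return results
```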
2. Additional context or comments
I propose three main features:
Batch Execution Management: An entry on the workflow orchestration page that leads to a batch management page. Here, users can observe batch tasks running in the background, with real-time updates on a dashboard. Users can view and stop tasks at any time, obtain the current execution results, and view the history of executed tasks.
Knowledge-Base-Writing Node: A node whose input can be the vectors output by an embedding model. In this node, users can specify the knowledge base to save to, and after execution, the vectors and the original text are added to the knowledge base.
The Combination of Both Features: Each batch task would generate a single document in the knowledge base, and the text generated in each run (along with its corresponding vector) would correspond to a paragraph in that document.
I would like to contribute code, provide feedback, and assist with testing to ensure the feature is implemented effectively and meets user needs.
3. Can you help us with this feature?
[X] I am interested in contributing to this feature.
Dify is rebuilding the tool system. For now, you can achieve this by using the HTTP request node or the code node to call your knowledge base API.
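As a rough illustration of that workaround, a code node (or an external script) could push each generated text into a knowledge base via the Knowledge API's create-document-from-text call. This is only a sketch: the endpoint path, payload fields, dataset ID, and API key shown here are assumptions to verify against your Dify version's docs, and it only submits text, since the dataset's own embedding model produces the vectors rather than accepting pre-computed ones.

```python
import requests

API_BASE = "https://api.dify.ai/v1"   # or a self-hosted Dify base URL
DATASET_API_KEY = "dataset-xxx"       # placeholder: knowledge base API key
DATASET_ID = "your-dataset-id"        # placeholder: target knowledge base ID

def save_to_knowledge_base(name: str, text: str) -> dict:
    """Add one generated text as a document; indexing and embedding are
    handled by the knowledge base itself after the document is created."""
    resp = requests.post(
        f"{API_BASE}/datasets/{DATASET_ID}/document/create-by-text",
        headers={"Authorization": f"Bearer {DATASET_API_KEY}"},
        json={
            "name": name,
            "text": text,
            "indexing_technique": "high_quality",
            "process_rule": {"mode": "automatic"},
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()
```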
This is similar to the second point. You can currently do it on your own, but the knowledge base may take time to run its RAG processing and reranking, and you might hit the LLM provider's rate limits because this generates batched requests, so it is not recommended for now.