langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai
47.99k stars · 6.85k forks

Workflow knowledge retrieval node is very slow #8601

Closed · 18075717849 closed this issue 3 weeks ago

18075717849 commented 3 weeks ago

Self Checks

Dify version

0.8.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

[Screenshot of the workflow attached.] The workflow knowledge retrieval node is very slow: a single retrieval takes about 2 minutes. I want to understand how knowledge retrieval works, and why the recall test is very fast while the knowledge retrieval node is very slow. Even after restarting the service, it is still very slow.

✔️ Expected Behavior

I would like to understand how the workflow's knowledge retrieval node works, and what could cause it to be slow.

❌ Actual Behavior

Workflow knowledge retrieval node is very slow

dosubot[bot] commented 3 weeks ago

The workflow knowledge retrieval node in version 0.8.2 of self-hosted Dify (Docker) may be slow because of how the retrieval strategy is configured. Retrieval queries the knowledge base, and its speed depends on the indexing strategy and the configured retrieval mode. In particular, the "N-to-1 Recall" and "Multi-way Recall" strategies rely on different models and processing steps, each of which can affect performance.

For example, in the "N-to-1 Recall" mode, knowledge base queries are executed through function calling, which requires the selection of a system reasoning model. In the "Multi-way Recall" mode, a Rerank model needs to be configured for result re-ranking. These configurations can add to the retrieval time depending on the complexity and size of the knowledge base, as well as the efficiency of the models used.

Additionally, as of September 1st, the recommended retrieval mode has shifted to "Multi-way Recall" due to its improved performance and accuracy. The "N-to-1 Recall" mode is no longer recommended, which might also contribute to the slowness if the older mode is still being used.

The knowledge retrieval process searches the knowledge base for text content related to the user's query and passes it as context to downstream LLM nodes. Configuring the node involves selecting query variables, choosing the knowledge base, setting the recall mode, and wiring up downstream nodes, typically LLM nodes. The output variable `result` contains the retrieved text segments, which the LLM node then uses as context to answer the user's query.

The slowness could also be attributed to the underlying implementation details. The KnowledgeRetrievalNode class in the Dify codebase shows that the retrieval process involves several steps, including querying the database for available documents, filtering and sorting document segments, and potentially re-ranking results based on configured models. These operations can be time-consuming, especially if the dataset is large or the retrieval configuration is complex.

Therefore, the slowness could be attributed to the specific retrieval strategy and model configurations used in your setup, and it is recommended to switch to the "Multi-way Recall" mode for better performance. Additionally, optimizing the database queries and retrieval logic might help improve the retrieval speed [1][2][3][4].

To continue talking to Dosu, mention @dosu.

crazywoola commented 3 weeks ago

Closing as off-topic.