langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Workflow query did not return results, but recall test was successful. #5207

Open tzdmwwdz opened 2 weeks ago

tzdmwwdz commented 2 weeks ago

Dify version

0.6.10

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

In the workflow, a query sometimes returns no results, even though the backend knowledge base receives the query and the same query recalls results without issue in the recall test.

✔️ Expected Behavior

The query information in the workflow and the recall results from the knowledge base are consistent.

❌ Actual Behavior

[Image] [Image]

dosubot[bot] commented 2 weeks ago

@tzdmwwdz Hello there! I'm here to help you with bug solving, answering questions, or becoming a contributor. Let's work together to resolve any issues you may have.

To resolve the inconsistency between the workflow query results and the backend knowledge base recall results in your self-hosted Dify version 0.6.10 (Docker), consider which recall mode you are using. Dify supports two recall modes: N-to-1 recall mode and multi-path recall mode.

  1. N-to-1 recall mode:

    • This mode selects the most relevant single dataset based on the user's intent and dataset descriptions.
    • It is suitable for applications with fewer datasets that are distinctly different from each other.
    • If the dataset descriptions are not clear or the datasets are too similar, the recall effectiveness may be impacted.
  2. Multi-path recall mode:

    • This mode matches the user's intent with all datasets simultaneously and retrieves relevant text fragments from multiple datasets.
    • It then reranks the retrieved fragments with a Rerank model to select the best match for the user's query.
    • This mode is recommended for applications with multiple datasets as it does not rely on the model's inference ability or dataset descriptions.
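
As a rough illustration of the multi-path flow described above (query every dataset, pool the candidates, then rerank), here is a minimal sketch. It is not Dify's implementation: `Chunk`, `keyword_score`, and `multi_path_recall` are hypothetical names, and the toy keyword score stands in for a real Rerank model.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    dataset: str
    text: str


def keyword_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    words = query.lower().split()
    if not words:
        return 0.0
    text_words = set(text.lower().split())
    return sum(w in text_words for w in words) / len(words)


def multi_path_recall(query, datasets, top_k=2, per_dataset_k=2):
    """Query every dataset, pool the candidates, then rerank the pool."""
    candidates = []
    for name, texts in datasets.items():
        # Stage 1: each dataset returns its own best candidates.
        ranked = sorted(texts, key=lambda t: keyword_score(query, t), reverse=True)
        candidates.extend(Chunk(name, t) for t in ranked[:per_dataset_k])
    # Stage 2: one reranking pass over the pooled candidates.
    # A real deployment would call a Rerank model here (assumption).
    candidates.sort(key=lambda c: keyword_score(query, c.text), reverse=True)
    return candidates[:top_k]


datasets = {
    "billing": ["How to update a credit card", "Refund policy for annual plans"],
    "product": ["How to reset a password", "Export workflow results to CSV"],
}
results = multi_path_recall("reset password", datasets, top_k=1)
```

Because every dataset is queried, the outcome does not depend on dataset descriptions, which is the property the bullet points above attribute to this mode.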

To improve consistency, consider switching to multi-path recall mode. Here is a brief guide:

  1. Go to your Dify application settings.
  2. Navigate to the recall mode settings.
  3. Select multi-path recall mode.
  4. Configure the Rerank model API as required.

Additionally, ensure that you have properly configured the Knowledge Base Retrieval Node by following these steps:

  1. Selecting the Query Variable: Ensure that the query variable represents the user's input question. In the "Start" node of conversational applications, the system pre-sets "sys.query" as the user input variable.
  2. Choosing the Knowledge Base for Query: Add an existing knowledge base from Dify. For instructions on creating a knowledge base, refer to the knowledge base help documentation.
  3. Configuring the Retrieval Strategy: Modify the indexing strategy and retrieval mode for the knowledge base within the node. For detailed explanations, refer to the retrieval mode explanation in the help documentation.
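
The three checks above can be sketched as a small validation helper. This is only an illustration: the dictionary keys (`query_variable`, `dataset_ids`, `retrieval_mode`, `rerank_model`) and the function `check_retrieval_node` are hypothetical and are not taken from Dify's actual node schema.

```python
def check_retrieval_node(node: dict) -> list[str]:
    """Return a list of human-readable problems found in a node config."""
    problems = []
    # Step 1: a query variable (e.g. "sys.query") must be selected.
    if not node.get("query_variable"):
        problems.append("no query variable selected")
    # Step 2: at least one knowledge base must be attached.
    if not node.get("dataset_ids"):
        problems.append("no knowledge base attached")
    # Step 3: the retrieval strategy must be known and fully configured.
    mode = node.get("retrieval_mode")
    if mode not in ("single", "multiple"):
        problems.append(f"unknown retrieval mode: {mode!r}")
    elif mode == "multiple" and not node.get("rerank_model"):
        problems.append("multi-path recall selected but no Rerank model configured")
    return problems


good_node = {
    "query_variable": "sys.query",
    "dataset_ids": ["kb-1"],
    "retrieval_mode": "multiple",
    "rerank_model": "example-rerank-model",  # placeholder name
}
bad_node = {"query_variable": "", "dataset_ids": [], "retrieval_mode": "multiple"}
```

Running the helper on `bad_node` lists every missing piece at once, which is easier to act on than stopping at the first failure.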

Moreover, ensure that your workflow configurations are synchronized correctly. You can use the WorkflowToolConfigurationUtils class to check and validate your workflow graph variables and tool parameter configurations.
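
As a rough idea of what such a synchronization check involves, the sketch below compares the variable names declared in a workflow graph against the parameter names configured for a tool. It is not the WorkflowToolConfigurationUtils code: `find_unsynced` and its inputs are hypothetical and only illustrate the comparison.

```python
def find_unsynced(graph_variables: list[str], tool_parameters: list[str]) -> dict:
    """Report names present on one side but missing on the other."""
    graph_set, tool_set = set(graph_variables), set(tool_parameters)
    return {
        # Declared in the workflow graph but absent from the tool config.
        "missing_in_tool": sorted(graph_set - tool_set),
        # Configured on the tool but no longer declared in the graph.
        "missing_in_graph": sorted(tool_set - graph_set),
    }


report = find_unsynced(["query", "language"], ["query", "top_k"])
in_sync = not any(report.values())
```

An empty report on both sides means the two configurations agree on names; anything else points to the side that needs updating.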

By following these steps and ensuring proper synchronization of your workflow configurations, you should be able to resolve the inconsistency between the workflow query results and the backend knowledge base recall results [1][2][3][4].


tzdmwwdz commented 2 weeks ago

[Image] Multi-path recall is already in use.

tzdmwwdz commented 2 weeks ago

The configuration should not be the problem, because some queries do return correct results from the knowledge base. [Image]