langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

In the workflow, the knowledge base can be directly selected through input parameters #5187

Open lzzzzzzzzz opened 1 month ago

lzzzzzzzzz commented 1 month ago

Self Checks

1. Is this request related to a challenge you're experiencing?

I maintain the knowledge base through the API, but to select a specific knowledge base I currently have to configure multiple branches with IF nodes in the workflow. I would like to specify the knowledge base directly through input parameters.

2. Describe the feature you'd like to see

The knowledge base can be selected dynamically through input parameters instead of through IF nodes.

3. How will this feature improve your workflow or experience?

It reduces the duplicated design of near-identical knowledge base branches.

4. Additional context or comments

No response

5. Can you help us with this feature?

nathubs commented 3 weeks ago

In many situations, it would be very useful to dynamically select knowledge datasets in Knowledge Retrieval nodes.

zawilliams commented 2 weeks ago

Would love to have this functionality to be able to select a knowledge dataset based on an ID for use with embedded chat.

VincePotato commented 2 weeks ago

I understand that you want to dynamically pass in the knowledge base ID to select the knowledge base. Could you provide more background to help me understand in which scenarios this feature is needed? For instance, how do you obtain the knowledge base ID?

lzzzzzzzzz commented 2 weeks ago

At present, Dify does not support uploading documents, and workflows cannot operate on file streams. Maintaining the knowledge base through the API would get around these limitations: each time, the user uploads documents, a temporary knowledge base is created and passed into the workflow to produce the results, and the knowledge base is deleted after the workflow ends. I previously used Langchain-chatchat v0.2, which supported uploading temporary documents and chatting with them, but that feature was removed after version 0.3.
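
The lifecycle described above (create a temporary knowledge base, add a document, run the workflow, delete the knowledge base) could be sketched roughly as follows. This is only an illustration under assumptions: the base URL, the endpoint paths, the request fields, and especially the `dataset_id` workflow input are not confirmed by this thread, and a real client would also need to wait for indexing to finish before running the workflow.

```python
import json
import urllib.request
import uuid
from typing import Callable, Optional

# Assumptions: base URL and endpoint paths are illustrative; check them
# against the API reference of the Dify version you run.
BASE_URL = "http://localhost/v1"

Transport = Callable[[str, str, str, Optional[dict]], dict]


def http_json(method: str, url: str, api_key: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request and decode the JSON response (empty body -> {})."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def run_with_temporary_knowledge_base(
    text: str,
    query: str,
    dataset_api_key: str,
    app_api_key: str,
    send: Transport = http_json,
) -> dict:
    """Create a knowledge base, add one document, run the workflow, then delete it."""
    name = f"temp-{uuid.uuid4().hex}"  # unique name to avoid collisions
    dataset = send("POST", f"{BASE_URL}/datasets", dataset_api_key, {"name": name})
    dataset_id = dataset["id"]
    try:
        send(
            "POST",
            f"{BASE_URL}/datasets/{dataset_id}/document/create_by_text",
            dataset_api_key,
            {
                "name": "upload.txt",
                "text": text,
                "indexing_technique": "high_quality",
                "process_rule": {"mode": "automatic"},
            },
        )
        # Hypothetical: the workflow is assumed to accept the knowledge base
        # ID as an input, which is the feature requested in this issue.
        return send(
            "POST",
            f"{BASE_URL}/workflows/run",
            app_api_key,
            {
                "inputs": {"dataset_id": dataset_id, "query": query},
                "response_mode": "blocking",
                "user": "abc-123",
            },
        )
    finally:
        # Always delete the temporary knowledge base, even if the run fails.
        send("DELETE", f"{BASE_URL}/datasets/{dataset_id}", dataset_api_key, None)
```

The `send` parameter exists so the HTTP layer can be swapped out, for example with a stub while trying the flow without a running server.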

VincePotato commented 2 weeks ago

We are designing the file upload and file transfer between nodes in our workflow. Additionally, we will have session-level variables for temporarily storing data such as temporary text from file parsing. This feature might be able to resolve your issue?

zawilliams commented 2 weeks ago

My use-case is having an external app that users upload their documents to and then can chat with their documents. I want to avoid re-embedding every time as the documents won't change. Well, they will eventually change, but the idea is they keep documents uploaded and we aren't dealing with new documents every time they chat.

So the idea would be that the documents get embedded and vectorized using the Knowledge Base API. We'd create a Knowledge Base, upload the documents, and then somehow get the ID of the knowledge base (through the API?), and then be able to use the embeddable chat or the Chat API and pass in a Knowledge Base ID.

For example, with the embeddable chat:

<script>
 window.difyChatbotConfig = {
  token: '8YIYGK1ukZdu1W9G',
  baseUrl: 'http://localhost',
  knowledgeBaseId: '6d7d8238-5625-48a5-a3e2-673acfc99fea'
 }
</script>
<script
 src="http://localhost/embed.min.js"
 id="8YIYGK1ukZdu1W9G"
 defer>
</script>
<style>
  #dify-chatbot-bubble-button {
    background-color: #1C64F2 !important;
  }
</style>

Or:

<iframe
 src="http://localhost/chatbot/8YIYGK1ukZdu1W9G?knowledgeBaseId=6d7d8238-5625-48a5-a3e2-673acfc99fea"
 style="width: 100%; height: 100%; min-height: 700px"
 frameborder="0"
 allow="microphone">
</iframe>

Or a Chat API request:

curl -X POST 'http://localhost/v1/chat-messages' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "inputs": {},
    "query": "Can you tell me what the main topics of my documents are?",
    "response_mode": "streaming",
    "conversation_id": "",
    "knowledge_base_id": "6d7d8238-5625-48a5-a3e2-673acfc99fea",
    "user": "abc-123",
    "files": []
}'
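
For anyone calling this from code rather than curl, the same request body could be built along these lines. It is a sketch only: `knowledge_base_id` is the field proposed in this comment, not a documented Chat API parameter, and `build_chat_request` is a made-up helper name.

```python
import json
import uuid


def build_chat_request(query: str, user: str, knowledge_base_id: str) -> str:
    """Serialize a chat-messages body that carries a knowledge base ID.

    "knowledge_base_id" is the field proposed in this thread, not a
    documented Chat API parameter.
    """
    # Raises ValueError early if the ID is not a well-formed UUID.
    kb_id = str(uuid.UUID(knowledge_base_id))
    body = {
        "inputs": {},
        "query": query,
        "response_mode": "streaming",
        "conversation_id": "",
        "knowledge_base_id": kb_id,
        "user": user,
        "files": [],
    }
    # json.dumps takes care of quoting and escaping.
    return json.dumps(body)
```

Validating the ID before sending means a malformed value fails on the client instead of producing a confusing server-side error.
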
lzzzzzzzzz commented 2 weeks ago

This is consistent with the final effect I want. Because of the limit on model context length, it would be great if the temporary text could be retrieved as fragments, as in a knowledge base. The ability to control the knowledge base through parameters, mentioned above, could also be considered for future plans. If I can control the knowledge base and the prompt through parameters, one workflow can be applied to multiple scenarios without repeated adding and editing.

wangiii commented 1 week ago

I developed a dynamic knowledge base feature that allows a dynamic list of dataset IDs to be passed as a parameter. The list can be obtained through a tool in the workflow and then used in the knowledge retrieval node. I hope this feature helps.

https://github.com/langgenius/dify/pull/5928
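
To make the idea concrete, here is a rough sketch of how a retrieval node might choose between a fixed dataset list and a list supplied through a workflow variable. It is not the code from the PR: the function name and the config keys `dataset_ids` and `dataset_ids_variable` are invented for illustration.

```python
from typing import Any, Mapping


def resolve_dataset_ids(
    node_config: Mapping[str, Any], variables: Mapping[str, Any]
) -> list[str]:
    """Return the dataset IDs a retrieval node should query.

    Hypothetical config shape: "dataset_ids" holds a fixed list, and
    "dataset_ids_variable" optionally names a workflow variable that
    holds a dynamic list.
    """
    variable_name = node_config.get("dataset_ids_variable")
    if variable_name is None:
        # Static mode: use the datasets selected in the node itself.
        return list(node_config.get("dataset_ids", []))

    value = variables.get(variable_name)
    if isinstance(value, str):
        # Accept a comma-separated string as well as a list.
        value = [part.strip() for part in value.split(",")]
    if not isinstance(value, (list, tuple)):
        raise ValueError(
            f"variable {variable_name!r} does not hold a list of dataset IDs"
        )
    # Drop empty entries and duplicates while keeping the original order.
    return list(dict.fromkeys(str(item) for item in value if item))
```

With this shape, a single retrieval node can serve any number of knowledge bases, which is what replaces the per-knowledge-base IF branches described at the top of the issue.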

lzzzzzzzzz commented 1 week ago

Thank you for building this. That's exactly what I want, and it has richer functionality than I expected.

zawilliams commented 1 week ago

That PR sounds great - I tried pulling down the branch to test it but couldn't get it to work. I'm guessing I did something wrong.

wangiii commented 1 week ago

Are you using the dynamic dataset ID list configuration? Could you describe the specific problems you ran into and what you expected?

zawilliams commented 1 week ago

I'm not seeing any way to select a dynamic dataset.

In the knowledge retrieval node, select the dynamic knowledge base drop-down box and specify the parameters of the corresponding datasetid list.

I have created a Knowledge, then created a workflow and added a Knowledge Retrieval node. What am I doing wrong?

[Screenshot 2024-07-06 at 2:27:57 PM]

wangiii commented 1 week ago

This PR has not been merged into the main branch yet. For now, you can pull the PR code and run it: https://github.com/langgenius/dify/pull/5928

zawilliams commented 1 week ago

Correct. I did pull and run the PR code. Do I need to do anything with building the front-end compared to how you would normally run it in the docs?

wangiii commented 1 week ago

You need to install the front-end dependencies.

npm install
npm run build

Then check the knowledge retrieval node.