lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Vision/TTS/Plugins/Artifacts). One-click FREE deployment of your private ChatGPT/ Claude application.
https://chat-preview.lobehub.com
Other
45.62k stars 10.22k forks source link

[Request] 增加私有知识库功能 #222

Closed franklili3 closed 3 months ago

franklili3 commented 1 year ago

🥰 需求描述 | Feature Description

用于企业客服,回答企业的已有问答。

🧐 解决方案 | Proposed Solution

可以使用langchain技术建立知识库,用户上传文档,机器人有限按照知识库回答。

📝 补充信息 | Additional Information

No response

arvinxx commented 1 year ago

私有知识库技术实现方案还在构思中。

因为现有的LobeChat是一个纯本地化的方案。因此如何存储管理私有知识库、处理向量化的流程,和本地文件的集成 这些都要通盘考虑。

onlinedear commented 1 year ago

太棒了,希望能够增加一个或多个文档的问答、总结,支持pdf、doc、exl等文档格式。

lobehubbot commented 1 year ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Great, I hope to be able to add Q&A and summary of one or more documents, and support pdf, doc, exl and other document formats.

sunsky89757 commented 12 months ago

+1

arvinxx commented 11 months ago

感谢建议! 这两个产品我都有研究过,不过我们可能会尝试一些新的方案。原因是这两个产品都需要用户提供一个 db 和服务端,而lobe-chat目前是一个完全本地化的产品,没有db 服务。

所以如何结合还需要做这部分功能时再评估一下产品方案。

lobehubbot commented 11 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Thanks for the suggestion! I have researched both products, but we may try some new solutions. The reason is that both products require users to provide a db and server, and lobe-chat is currently a completely localized product without a db service.

Therefore, when you still need to perform this part of the function on how to combine it, evaluate the product solution.

ddosakura commented 11 months ago

本地化产品也可以有db

最近看到的项目 https://github.com/rubickCenter/rubick 使用了 https://github.com/pouchdb/pouchdb 作为本地数据库+webdav同步数据,感觉对知识库的存储来说是个挺好的思路

知识库需要的是向量数据库,这个可以看看 https://github.com/shravansunder/hnswlib-wasm 这类可以直接在浏览器中运行的库

其它相关资料:https://localfirstweb.dev/

lobehubbot commented 11 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Localized products can also have db

The project I saw recently https://github.com/rubickCenter/rubick uses https://github.com/pouchdb/pouchdb as a local database + webdav to synchronize data. I feel it is a good idea for the storage of knowledge base.

The knowledge base requires a vector database. For this, you can look at https://github.com/shravansunder/hnswlib-wasm, a library that can be run directly in the browser.

Other related information: https://localfirstweb.dev/

callzhang commented 9 months ago

Any updates?

fanyinghao commented 9 months ago

可以尝试开放接口的方式,第三方通过插件来实现文件处理的流程。这个项目之需要在前端提供上传文件功能即可,后续交由插件处理。

lobehubbot commented 9 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


You can try the open interface method, and third parties can implement the file processing process through plug-ins. This project only needs to provide the function of uploading files on the front end, and then it will be handled by the plug-in.

ihcbbs commented 5 months ago

可以尝试开放接口的方式,第三方通过插件来实现文件处理的流程。这个项目之需要在前端提供上传文件功能即可,后续交由插件处理。

是的,我不知道这个功能为什么到现在都还不做,像moonshot,像千问,上传了它自己就会解析,可以问答了,什么向量模型,什么数据库,都是另外的事情。

lobehubbot commented 5 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


You can try the open interface method, and third parties can implement the file processing process through plug-ins. This project only needs to provide the function of uploading files on the front end, and then it will be handled by the plug-in.

Yes, I don’t know why this function has not been implemented until now. Like moonshot or Qianwen, if you upload it, it will be parsed by itself, and you can ask questions. What vector model and what database are all other things.

arvinxx commented 5 months ago

是的,我不知道这个功能为什么到现在都还不做,像moonshot,像千问,上传了它自己就会解析,可以问答了

那要不你来 PR 个上传文件的插件?非常欢迎~

你说的这些我也知道,而且像智谱 openai 也有自己的assistant api可以传文件。但这并不是我期望的使用方案。我并不希望 LobeChat 绑定某个特定 provider ,这会导致未来扩展性存在缺陷。比较典型的就是 openai 的 assistant api一开始并不支持流式返回,这意味着如果使用他们的方案,用openai实现文件上传的会话体验就会远比普通方案差一大截。而且用文件上传的接口还和普通流式是两套,一旦这么搞维护成本就是几何倍数上增,迟早会变成💩山。

此外,这种 provider 的api 传完你就传完了,之前传的文件也没地方管理,二次复用,以及真正变成可被积累的知识库。你传给 moonshot 的文件没法直接在被千问的模型消费,用户就会非常疑惑为什么自己的文件切个 provider 就不支持了。

API 能力是能力, 但这不是简单包下就能变成足够好用的产品的。 文件上传的功能我们很快就会开始做,再耐心等待一会吧

lobehubbot commented 5 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Yes, I don’t know why this function has not been implemented until now. Like moonshot or Qianwen, upload it and it will parse it by itself, so you can ask questions.

How about you PR a plugin for uploading files? Very welcome~

I know what you are talking about, and openai, like Zhipu, also has its own assistant api to transfer files. But this is not the usage scenario I expected. I don't want LobeChat to be bound to a specific provider, which would lead to future scalability flaws. What is typical is that OpenAI's Assistant API does not support streaming returns at the beginning, which means that if their solution is used, the session experience of using OpenAI to upload files will be far worse than the ordinary solution. Moreover, the interface for uploading files is different from that of ordinary streaming. Once this is done, the maintenance cost will increase exponentially, and sooner or later it will become a mountain.

In addition, this kind of provider's api has been uploaded to you as soon as it is uploaded. There is no place to manage the previously uploaded files, reuse them, and truly turn them into a knowledge base that can be accumulated. The files you pass to moonshot cannot be directly consumed by Qianwen's model, and users will be very confused as to why cutting their files into a provider does not support it.

API capabilities are capabilities, but they cannot be simply packaged into products that are useful enough. We will start to implement the file upload function soon, please wait patiently for a while.

wdzhwsh4067 commented 5 months ago

是的,我不知道这个功能为什么到现在都还不做,像moonshot,像千问,上传了它自己就会解析,可以问答了

那要不你来 PR 个上传文件的插件?非常欢迎~

你说的这些我也知道,而且像智谱 openai 也有自己的assistant api可以传文件。但这并不是我期望的使用方案。我并不希望 LobeChat 绑定某个特定 provider ,这会导致未来扩展性存在缺陷。比较典型的就是 openai 的 assistant api一开始并不支持流式返回,这意味着如果使用他们的方案,用openai实现文件上传的会话体验就会远比普通方案差一大截。而且用文件上传的接口还和普通流式是两套,一旦这么搞维护成本就是几何倍数上增,迟早会变成💩山。

此外,这种 provider 的api 传完你就传完了,之前传的文件也没地方管理,二次复用,以及真正变成可被积累的知识库。你传给 moonshot 的文件没法直接在被千问的模型消费,用户就会非常疑惑为什么自己的文件切个 provider 就不支持了。

API 能力是能力, 但这不是简单包下就能变成足够好用的产品的。 文件上传的功能我们很快就会开始做,再耐心等待一会吧

支持作者!

lobehubbot commented 5 months ago

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Yes, I don’t know why this function has not been implemented until now. Like moonshot or Qianwen, upload it and it will parse it by itself, so you can ask questions.

How about you PR a plugin for uploading files? Very welcome~

I know what you are talking about, and openai, like Zhipu, also has its own assistant api to transfer files. But this is not the usage scenario I expected. I don't want LobeChat to be bound to a specific provider, which would lead to future scalability flaws. What is typical is that OpenAI's Assistant API does not support streaming returns at the beginning, which means that if their solution is used, the session experience of using OpenAI to upload files will be far worse than the ordinary solution. Moreover, the interface for uploading files is different from that of ordinary streaming. Once this is done, the maintenance cost will increase exponentially, and sooner or later it will become a mountain.

In addition, this kind of provider's API is finished when you finish uploading it. There is no place to manage the previously uploaded files, reuse them, and truly turn them into a knowledge base that can be accumulated. The files you pass to moonshot cannot be directly consumed by Qianwen's model, and users will be very confused as to why cutting their files into a provider does not support it.

API capabilities are capabilities, but they cannot be turned into products that are useful enough by simply wrapping them up. We will start to implement the file upload function soon, please wait patiently for a while.

Support the author!

lobehubbot commented 3 months ago

✅ @franklili3

This issue is closed, If you have any questions, you can comment and reply.\ 此问题已经关闭。如果您有任何问题,可以留言并回复。

lobehubbot commented 3 months ago

:tada: This issue has been resolved in version 1.12.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: