chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain with local LLMs such as ChatGLM, Qwen, and Llama.
Apache License 2.0

Does rerank only operate on the top-k? #3398

Closed: litongyun233 closed this issue 7 months ago

litongyun233 commented 7 months ago

Looking at the code, the docs passed to rerank appear to be just the top-k most similar results returned by the retrieval model, and rerank merely re-sorts those same top-k. For example, if top_k=3 is passed to the LLM, rerank only changes the order of those 3 documents, which sounds like it would barely improve quality. Could this be changed so that, say, the 10 most similar documents are retrieved first, and then the best 3 of those 10 after reranking are used as the final top-k?
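To make the proposal concrete, here is a minimal, generic sketch of the two-stage idea (retrieve a wide candidate pool, then rerank it down to the final top-k). It uses sentence-transformers' CrossEncoder rather than this project's code; the `retrieve` callable and the choice of bge-reranker-large are assumptions for illustration only:

```python
from sentence_transformers import CrossEncoder

def retrieve_then_rerank(query: str, retrieve, candidate_k: int = 10, final_k: int = 3):
    # Stage 1: pull a wide candidate pool from the vector store.
    # `retrieve` is any callable returning the candidate_k most similar texts
    # (a stand-in for the project's retrieval step, assumed for this sketch).
    candidates = retrieve(query, candidate_k)

    # Stage 2: score every (query, doc) pair with a cross-encoder and keep
    # only the final_k best-scoring documents as the real top-k.
    reranker = CrossEncoder("BAAI/bge-reranker-large")
    scores = reranker.predict([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:final_k]]
```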

dosubot[bot] commented 7 months ago

Hi @litongyun233! Great to see you asking questions again. 😊 I'm looking into your question and will get back to you shortly. 🔍

litongyun233 commented 7 months ago

Could I just modify the parameters myself: in `docs = await run_in_threadpool(search_docs, query=query, knowledge_base_name=knowledge_base_name, top_k=top_k, score_threshold=score_threshold)`, set top_k=10, while leaving `reranker_model = LangchainReranker(top_n=top_k, ...)` unchanged?
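If that reading of the quoted lines is right, the tweak would look roughly like the fragment below. The names `run_in_threadpool`, `search_docs`, and `LangchainReranker` come from the snippet quoted above; `RETRIEVE_K` is a made-up constant for illustration, and the reranker's remaining constructor arguments are assumed to stay as they were:

```python
# Sketch of the proposed tweak, in the context of the quoted code (not a
# self-contained script): widen the retrieval pool, keep the reranker cutoff.
RETRIEVE_K = 10  # stage 1: fetch a wider candidate pool than the final top_k

docs = await run_in_threadpool(
    search_docs,
    query=query,
    knowledge_base_name=knowledge_base_name,
    top_k=RETRIEVE_K,                 # was top_k (e.g. 3)
    score_threshold=score_threshold,
)

# Stage 2: the reranker still returns only the final top_k documents.
reranker_model = LangchainReranker(top_n=top_k)  # other args unchanged
```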

litongyun233 commented 7 months ago

Does our project make use of context compression? @dosu-bot Also, why does ChatGLM respond so quickly?

dosubot[bot] commented 7 months ago

Based on the provided context, context compression has not been explicitly implemented in the ChatGLM3Agent.py file of the Langchain-Chatchat project. However, the SummaryAdapter class in server/knowledge_base/kb_summary_api.py is responsible for context compression in the Langchain-Chatchat codebase: it builds summaries of documents or knowledge base entries, effectively compressing the context. Its form_summary method creates these summaries, using two language model instances (llm and reduce_llm) in the process.
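For readers who want to see what context compression looks like in practice, here is a hedged sketch using LangChain's generic ContextualCompressionRetriever. This is LangChain's general-purpose mechanism, not a description of what Chatchat's SummaryAdapter does internally; the OpenAI models and the toy documents are assumptions purely for illustration:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# A tiny stand-in vector store for the knowledge base.
vectorstore = FAISS.from_texts(
    ["ChatGLM is a bilingual conversational model.",
     "FAISS is a library for vector similarity search."],
    embedding=OpenAIEmbeddings(),
)

# The compressor asks an LLM to extract only the query-relevant parts of
# each retrieved document before it reaches the final prompt.
compressor = LLMChainExtractor.from_llm(ChatOpenAI(temperature=0))
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
)

docs = retriever.get_relevant_documents("What is ChatGLM?")
```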

As for ChatGLM's fast response speed, it could be attributed to several factors, such as efficient parsing and output handling, optimized conversation memory management, use of predefined templates and structured prompts, and selective tool integration. These reasons are speculative, however, since they are inferred from the code structure: specific optimization techniques like context compression, and details of the underlying language model's performance characteristics, are not explicitly mentioned in the code.

I hope this information helps! If you have any more questions or need further clarification, feel free to ask.

