Closed · litongyun233 closed this issue 7 months ago
Hi @litongyun233! Great to see you asking a question again. 😊 I'm looking into your issue and will get back to you soon. 🔍
Could I just modify the parameters myself, setting top_k=10 in docs = await run_in_threadpool(search_docs, query=query, knowledge_base_name=knowledge_base_name, top_k=top_k, score_threshold=score_threshold), while leaving reranker_model = LangchainReranker(top_n=top_k) unchanged?
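The change being asked about can be sketched as follows. Note that search_docs and LangchainReranker below are toy stand-ins written only so the snippet runs on its own; they are not the project's real implementations, and the compress_documents signature is an assumption.

```python
# Sketch of the proposed change: widen the first-stage retrieval to 10
# candidates while the reranker still hands only top_k docs to the LLM.
# Both search_docs and LangchainReranker are stubbed stand-ins here.

TOP_K = 3        # docs ultimately passed to the LLM (unchanged)
RETRIEVE_K = 10  # hard-coded wider retrieval, as the question proposes

def search_docs(query, knowledge_base_name, top_k, score_threshold):
    """Stand-in vector search: returns up to top_k dummy docs."""
    return [f"doc-{i}" for i in range(top_k)]

class LangchainReranker:
    """Stand-in reranker that keeps only the first top_n docs."""
    def __init__(self, top_n):
        self.top_n = top_n

    def compress_documents(self, documents, query):
        return documents[: self.top_n]

docs = search_docs(query="q", knowledge_base_name="kb",
                   top_k=RETRIEVE_K, score_threshold=0.5)  # 10 candidates
reranker = LangchainReranker(top_n=TOP_K)                  # still emits 3
docs = reranker.compress_documents(documents=docs, query="q")
```

With this decoupling, raising RETRIEVE_K gives the reranker more candidates to choose from without changing how many documents the LLM sees.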
Does this project use the context compression feature? @dosu-bot, also, why is ChatGLM's response speed so fast?
Based on the provided context, it appears that context compression has not been explicitly implemented in the ChatGLM3Agent.py file of the Langchain-Chatchat project. However, the SummaryAdapter class in kb_summary_api.py is responsible for context compression in the Langchain-Chatchat codebase: it forms summaries of documents or knowledge base entries, effectively compressing the context. The form_summary method of the SummaryAdapter class creates these summaries, leveraging two language model instances (llm and reduce_llm) in the process. You can find more details in the server/knowledge_base/kb_summary_api.py file.
As for the fast response speed of ChatGLM, it could be attributed to several factors such as efficient parsing and output handling, optimized conversation memory management, use of predefined templates and structured prompts, and selective tool integration. However, these are speculative reasons based on the provided code structure and functionalities, as specific optimization techniques like context compression or detailed insights into the underlying language model's performance characteristics are not explicitly mentioned in the code.
I hope this information helps! If you have any more questions or need further clarification, feel free to ask.
Looking at the code, the docs that get reranked seem to be exactly the top-k most similar documents returned by the retrieval model, and the reranker just reorders those. For example, if top_k=3 is passed to the LLM, reranking only changes the order of those 3 docs, which sounds like it would not improve quality much. Could it be implemented so that, say, the 10 most similar docs are retrieved first, and then the top 3 after reranking are taken from those 10 as the final top_k?
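The "retrieve wide, rerank narrow" pattern described above can be sketched as a two-stage pipeline. The word-overlap scoring below is a toy stand-in for both the embedding search and the reranker model, chosen only to make the snippet self-contained:

```python
# Sketch of two-stage retrieval: fetch up to 10 similar docs first,
# rerank them, and keep only the best few for the LLM. Word overlap is
# a toy proxy for both the vector-search and reranker scores.

def term_overlap(query: str, doc: str) -> int:
    """Toy relevance score: number of shared words."""
    return len(set(query.split()) & set(doc.split()))

def retrieve(query: str, corpus: list[str], k: int) -> list[str]:
    """First stage (plays the role of vector search): k best by overlap."""
    return sorted(corpus, key=lambda d: term_overlap(query, d),
                  reverse=True)[:k]

def rerank(query: str, candidates: list[str], top_n: int) -> list[str]:
    """Second stage (plays the role of a cross-encoder reranker):
    re-score and keep only top_n; shorter docs win ties, to show that
    reranking can reorder candidates rather than merely truncate."""
    scored = sorted(candidates,
                    key=lambda d: (term_overlap(query, d), -len(d)),
                    reverse=True)
    return scored[:top_n]

corpus = ["alpha beta gamma", "delta epsilon", "alpha beta", "alpha"]
candidates = retrieve("alpha beta", corpus, k=10)       # wide retrieval
final_docs = rerank("alpha beta", candidates, top_n=2)  # narrow rerank
```

Because the reranker sees 10 candidates instead of 3, a document the embedding search ranked, say, 5th can still make it into the final top set, which is where the quality gain over rerank-only-reordering comes from.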