iuiaoin / wechat-gptbot

A wechat robot based on ChatGPT with no risk, very stable! 🚀

[Bug]: The conversation-truncation logic seems flawed #118

Open QAbot-zh opened 6 months ago

QAbot-zh commented 6 months ago


Python version

python 3.8

Issue description

Truncating the conversation length via max_tokens is an effective way to limit token consumption, which is reasonable. In chatgpt.py, the save function is called like this:

    def reply(self, context: Context) -> Reply:
        ...
            if response["completion_tokens"] > 0:
                Session.save_session(
                    response["content"], session_id, response["total_tokens"]
                )
            return Reply(ReplyType.TEXT, response["content"])

And in session.py, the functions that save the conversation and discard old messages look like this:

    @staticmethod
    def save_session(answer, session_id, total_tokens):
        max_tokens = conf().get("max_tokens")
        session = Session.all_sessions.get(session_id)
        if session:
            # append conversation
            gpt_item = {"role": "assistant", "content": answer}
            session.append(gpt_item)

        # discard exceed limit conversation
        Session.discard_exceed_conversation(session, max_tokens, total_tokens)

    @staticmethod
    def discard_exceed_conversation(session, max_tokens, total_tokens):
        dec_tokens = int(total_tokens)
        while dec_tokens > max_tokens:
            # pop first conversation
            if len(session) > 3:
                session.pop(1)
                session.pop(1)
            else:
                break
            dec_tokens = dec_tokens - max_tokens

max_tokens is the truncation threshold for the conversation length, and total_tokens is the total length of the Q&A messages. The rigorous logic would be to keep popping the earliest Q&A pairs until the session's content length falls below max_tokens. In that case, what dec_tokens subtracts on each iteration should not be max_tokens, but the length of the Q&A pair that was just popped. Perhaps max_tokens was used directly because the length of the popped messages is hard to estimate.
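As a minimal sketch of the corrected logic described above: since the per-message token counts are not stored in the session, the size of each popped Q&A pair has to be approximated. The helper name `estimate_tokens` and the character-count heuristic below are assumptions for illustration, not the project's actual API.

```python
def estimate_tokens(message):
    # Crude proxy: OpenAI tokenizers average roughly 4 characters per
    # token for English text. This is only an approximation.
    return max(1, len(message["content"]) // 4)


def discard_exceed_conversation(session, max_tokens, total_tokens):
    dec_tokens = int(total_tokens)
    # Keep popping the oldest Q&A pair (index 0 holds the system prompt)
    # until the estimated remaining length fits under max_tokens.
    while dec_tokens > max_tokens and len(session) > 3:
        question = session.pop(1)
        answer = session.pop(1)
        # Subtract the estimated size of the discarded pair,
        # not max_tokens itself.
        dec_tokens -= estimate_tokens(question) + estimate_tokens(answer)
    return dec_tokens
```

With this version the loop terminates either when the estimated remaining length is under the threshold or when only the system prompt plus the latest Q&A pair remain, matching the intent of the original `len(session) > 3` guard.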

Repro steps

No response

Relevant log output

No response