c0sogi / LLMChat

A full-stack WebUI implementation of large language models, such as ChatGPT or LLaMA.
MIT License

Too many prompts #21

Closed · Torhamilton closed this issue 1 year ago

Torhamilton commented 1 year ago

Why so many prompts for a simple one-sentence chat? The embeddings are fine, but I think the app is sending the same chat message to OpenAI in a loop.

16 requests total between 6:00 AM and 6:15 AM (local time: May 21, 2023, 2:00–2:15 AM):

| Time | Local time | Model | Requests | Tokens |
|------|------------|-------|----------|--------|
| 6:05 AM | 2:05 AM | gpt-4-0314 | 1 | 271 prompt + 6 completion = 277 |
| 6:05 AM | 2:05 AM | text-embedding-ada-002-v2 | 4 | 600 prompt + 0 completion = 600 |
| 6:10 AM | 2:10 AM | gpt-4-0314 | 3 | 1,498 prompt + 155 completion = 1,653 |
| 6:10 AM | 2:10 AM | text-embedding-ada-002-v2 | 2 | 37 prompt + 0 completion = 37 |
| 6:15 AM | 2:15 AM | gpt-4-0314 | 4 | 5,737 prompt + 148 completion = 5,885 |
| 6:15 AM | 2:15 AM | text-embedding-ada-002-v2 | 2 | 14 prompt + 0 completion = 14 |

c0sogi commented 1 year ago

This happens because, when the token limit you set is exceeded while the response is being generated, a ChatLengthException is raised automatically; message history is trimmed to free up to ChatConfig.extra_token_margin tokens (approximately 500), and the request is automatically resent. This is inefficient because it increases the number of queries sent to OpenAI, but this approach was adopted so that the user never has to worry about the token limit.

Check the log to see how many times the token limit was exceeded:

            # Raised when the model hits the configured token limit mid-generation;
            # instead of failing, trim the history and resend the request.
            except ChatLengthException:
                api_logger.error("token limit exceeded")
                if len(user_chat_context.user_message_histories) == len(user_chat_context.ai_message_histories):
                    # An AI message for this turn already exists: overwrite it with
                    # the partial output (keeping exactly one trailing "continue"
                    # marker) and evict old messages to free extra_token_margin tokens.
                    deleted_while_overriding: int | None = await MessageManager.set_message_history_safely(
                        user_chat_context=user_chat_context,
                        role=ChatRoles.AI,
                        index=-1,
                        new_content=content_buffer.replace(ChatConfig.continue_message, "")
                        + ChatConfig.continue_message,
                        update_cache=False,
                        extra_token_margin=ChatConfig.extra_token_margin,
                    )
                    if deleted_while_overriding is not None:
                        deleted_histories += deleted_while_overriding
                else:
                    # No AI message for this turn yet: append the partial output as a
                    # new history entry, again reserving extra_token_margin tokens.
                    deleted_histories = await MessageManager.add_message_history_safely(
                        user_chat_context=user_chat_context,
                        role=ChatRoles.AI,
                        content=content_buffer,
                        update_cache=False,
                        extra_token_margin=ChatConfig.extra_token_margin,
                    )
                # Retry: the loop resends the request with the trimmed history.
                continue
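
For reference, the control flow above reduces to a trim-and-retry loop. The sketch below is a minimal, self-contained illustration of that pattern, not LLMChat's actual API: `TOKEN_LIMIT`, `EXTRA_TOKEN_MARGIN`, `LengthError`, `count_tokens`, and `fake_generate` are all hypothetical stand-ins (LLMChat's real margin is ChatConfig.extra_token_margin, about 500 tokens).

    # Minimal sketch of the "trim and retry on length" pattern described above.
    # All names here are illustrative; only the control flow mirrors LLMChat.

    TOKEN_LIMIT = 50          # hypothetical per-request token budget
    EXTRA_TOKEN_MARGIN = 10   # tokens to free before retrying (LLMChat uses ~500)


    class LengthError(Exception):
        """Stand-in for ChatLengthException: raised when the budget is exceeded."""


    def count_tokens(text: str) -> int:
        # Crude stand-in tokenizer: one token per whitespace-separated word.
        return len(text.split())


    def fake_generate(history: list[str]) -> str:
        # Stand-in for the OpenAI call: fail if the prompt is over budget.
        if sum(count_tokens(m) for m in history) > TOKEN_LIMIT:
            raise LengthError
        return f"ok: {len(history)} messages fit"


    def chat_with_retry(history: list[str]) -> str:
        requests_sent = 0
        while True:
            requests_sent += 1
            try:
                reply = fake_generate(history)
            except LengthError:
                # Evict oldest messages until EXTRA_TOKEN_MARGIN tokens are freed,
                # then loop and resend -- each retry is one more billed request.
                freed = 0
                while history and freed < EXTRA_TOKEN_MARGIN:
                    freed += count_tokens(history.pop(0))
                continue
            print(f"{requests_sent} request(s) sent")
            return reply


    if __name__ == "__main__":
        messages = [f"message {i} with some filler words here" for i in range(10)]
        print(chat_with_retry(messages))

Because each retry is billed as a separate request, a single long chat turn can show up as several gpt-4-0314 requests in the usage dashboard, which matches the counts in the log above.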