Closed Torhamilton closed 1 year ago
This is because if the token limit you set is exceeded during message output, a ChatLengthException is automatically raised, the history is trimmed by up to ChatConfig.extra_token_margin tokens (approximately 500), and the request is then automatically resent. This is inefficient because it increases the number of queries sent to openai, but this method was adopted so that the user does not have to worry about the token limit.
Check the log to see how many times the token limit was exceeded.
```python
except ChatLengthException:
    api_logger.error("token limit exceeded")
    if len(user_chat_context.user_message_histories) == len(user_chat_context.ai_message_histories):
        deleted_while_overriding: int | None = await MessageManager.set_message_history_safely(
            user_chat_context=user_chat_context,
            role=ChatRoles.AI,
            index=-1,
            new_content=content_buffer.replace(ChatConfig.continue_message, "")
            + ChatConfig.continue_message,
            update_cache=False,
            extra_token_margin=ChatConfig.extra_token_margin,
        )
        if deleted_while_overriding is not None:
            deleted_histories += deleted_while_overriding
    else:
        deleted_histories = await MessageManager.add_message_history_safely(
            user_chat_context=user_chat_context,
            role=ChatRoles.AI,
            content=content_buffer,
            update_cache=False,
            extra_token_margin=ChatConfig.extra_token_margin,
        )
    continue
```
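To check how often the retry path is hit, you can count the `token limit exceeded` error lines in the log. A minimal sketch (not part of the project), assuming errors are written to a plain-text log file; the path `api.log` is a hypothetical example:

```python
from pathlib import Path

def count_token_limit_errors(log_path: str) -> int:
    """Count log lines containing the 'token limit exceeded' error message."""
    text = Path(log_path).read_text(encoding="utf-8")
    return sum("token limit exceeded" in line for line in text.splitlines())

# Example: count_token_limit_errors("api.log")
```

Each hit corresponds to one extra request sent to openai by the trim-and-retry loop above.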
Why so many prompts for a simple one-sentence chat? The embeddings are fine, but I think the app is sending the same chat message to openai in a loop.
6:00 AM: 16 requests
6:05 AM (Local time: May 21, 2023, 2:05 AM): gpt-4-0314, 1 request, 271 prompt + 6 completion = 277 tokens
6:05 AM (Local time: May 21, 2023, 2:05 AM): text-embedding-ada-002-v2, 4 requests, 600 prompt + 0 completion = 600 tokens
6:10 AM (Local time: May 21, 2023, 2:10 AM): gpt-4-0314, 3 requests, 1,498 prompt + 155 completion = 1,653 tokens
6:10 AM (Local time: May 21, 2023, 2:10 AM): text-embedding-ada-002-v2, 2 requests, 37 prompt + 0 completion = 37 tokens
6:15 AM (Local time: May 21, 2023, 2:15 AM): gpt-4-0314, 4 requests, 5,737 prompt + 148 completion = 5,885 tokens
6:15 AM (Local time: May 21, 2023, 2:15 AM): text-embedding-ada-002-v2, 2 requests, 14 prompt + 0 completion = 14 tokens
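Totaling the gpt-4-0314 entries above makes the inflation concrete; a quick sketch with the numbers copied from the usage log (values taken verbatim, nothing inferred beyond the sums):

```python
# (requests, prompt tokens, completion tokens) per 5-minute bucket,
# copied from the gpt-4-0314 lines of the usage log above.
gpt4_usage = [(1, 271, 6), (3, 1498, 155), (4, 5737, 148)]

total_requests = sum(r for r, _, _ in gpt4_usage)
total_prompt = sum(p for _, p, _ in gpt4_usage)
total_completion = sum(c for _, _, c in gpt4_usage)

print(total_requests, total_prompt, total_completion)  # 8 7506 309
```

Eight gpt-4 requests and 7,506 prompt tokens against only 309 completion tokens is consistent with the trim-and-retry behavior resending the (growing) prompt several times.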