anantgupta129 opened 1 month ago
This is the code I ran separately; when the chat store is loaded only once, this error does not occur:
```python
chat_store = SimpleChatStore.from_persist_path(CHAT_PERSIST_PATH)
memory = ChatMemoryBuffer.from_defaults(
    chat_store=chat_store,
    chat_store_key="temp_chat",
)

llm = OpenAI(model="gpt-4o")
tool_spec = DataBaseToolSpec().to_tool_list()
agent = OpenAIAgent.from_tools(
    tool_spec,
    llm=llm,
    verbose=True,
    system_prompt=system_prompt,
    memory=memory,
)

# press q to quit the chat
print("[INFO] Press q to quit the chat.")
while True:
    ask = input("Ask: ")
    if ask.lower() == "q":
        print("[INFO] Quitting...")
        exit()
    response = agent.stream_chat(ask)
    print("\n[x Agent]: ")
    for text in response.response_gen:
        print(text, end="", flush=True)
    print("\n" + "=" * 25 + "\n")
```
Hey there, @anantgupta129! I'm here to help you out with any bugs, questions, or contributions you might have. Let's work together to solve this issue.
The `openai.BadRequestError` you're encountering is due to the OpenAI API's requirement that a message with the role `tool` must directly follow an assistant message carrying `tool_calls`. This error typically arises when the sequence of messages is not maintained correctly, especially when loading the chat store from a persistent path.
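For reference, here is a minimal sketch of the ordering the OpenAI chat completions API expects; the tool-call id and function name below are illustrative, not taken from your code:

```python
# A valid sequence: the assistant message that carries tool_calls comes immediately
# before the "tool" message that answers it.
messages = [
    {"role": "user", "content": "How many rows are in the users table?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # illustrative id
                "type": "function",
                "function": {
                    "name": "run_sql",  # illustrative tool name
                    "arguments": '{"query": "SELECT COUNT(*) FROM users"}',
                },
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_abc123", "content": "42"},
]
# If the assistant message above is dropped or reordered while the "tool" message
# survives, the API rejects the request with exactly this 400 error.
```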
Here are some steps to resolve this issue:
1. **Ensure the correct sequence of messages**: Verify that the sequence of messages in your chat store is correct. A message with the role `tool` should always follow an assistant message carrying `tool_calls`.
2. **Persist and load the chat store correctly**: When persisting and loading the chat store, ensure that the state is consistent and that no messages are lost or reordered.
3. **Use file locking and atomic writes**: Implement file locking and atomic writes to avoid race conditions and keep the data consistent when multiple requests access the chat store concurrently.
Here is an updated version of your API code with these considerations:
```python
import json
import os
import fsspec
import threading
from fastapi import FastAPI, HTTPException, Body
from fastapi.responses import StreamingResponse
from pathlib import Path
from llama_index.core.llms import ChatMessage
from llama_index.core.storage.chat_store.simple_chat_store import SimpleChatStore
from llama_index.core.memory.chat_memory_buffer import ChatMemoryBuffer
from llama_index.agent.openai_legacy.openai_agent import OpenAIAgent
from llama_index.agent.openai_legacy.tools.database_tool_spec import DataBaseToolSpec

# ChatRequest, llm, and system_prompt are assumed to be defined elsewhere in your app.

app = FastAPI()

CHAT_PERSIST_PATH = "chat_store.json"
_lock = threading.Lock()


def stream_generator(generator, chat_store: SimpleChatStore):
    yield from (json.dumps({"type": "content_block", "text": text}) for text in generator)
    # Persist only after the full response has been streamed.
    with _lock:
        chat_store.persist(persist_path=CHAT_PERSIST_PATH)


@app.post("/chat")
async def chat(body: ChatRequest = Body()):
    try:
        with _lock:
            if Path(CHAT_PERSIST_PATH).exists():
                chat_store = SimpleChatStore.from_persist_path(CHAT_PERSIST_PATH)
            else:
                chat_store = SimpleChatStore()

        memory = ChatMemoryBuffer.from_defaults(
            chat_store=chat_store,
            chat_store_key=body.chatId,
        )
        tool_spec = DataBaseToolSpec().to_tool_list()
        agent = OpenAIAgent.from_tools(
            tool_spec, llm=llm, verbose=True, system_prompt=system_prompt, memory=memory
        )
        response = agent.stream_chat(body.query)
        return StreamingResponse(
            stream_generator(response.response_gen, chat_store),
            media_type="application/x-ndjson",
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e)) from e
```
The handler wraps loading and persisting the chat store in a threading lock (`_lock`) so that only one thread can access the chat store at a time. By ensuring the correct sequence of messages and using file locking and atomic writes, you can avoid the `openai.BadRequestError` and maintain a consistent state across multiple API requests [1].
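The snippet above only serializes access with a lock; step 3 also mentions atomic writes, which a plain `persist()` call does not, to my knowledge, guarantee on its own. A minimal sketch of an atomic persist, assuming a local filesystem (the helper name `persist_atomically` is mine, not a library API):

```python
import os
import tempfile

from llama_index.core.storage.chat_store import SimpleChatStore


def persist_atomically(chat_store: SimpleChatStore, persist_path: str) -> None:
    """Write the serialized store to a temp file, then atomically swap it into place."""
    target = os.path.abspath(persist_path)
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(target), suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(chat_store.json())  # same JSON the store would normally persist
        os.replace(tmp_path, target)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In `stream_generator` you could then call `persist_atomically(chat_store, CHAT_PERSIST_PATH)` inside the lock instead of `chat_store.persist(...)`.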
To continue the conversation, mention @dosu.
@dosu still getting the same error
To resolve the `openai.BadRequestError` with error code 400, ensure that any message with the role `tool` is correctly preceded by an assistant message carrying `tool_calls`. This error typically occurs when the sequence of messages does not follow the expected order.

Here is a more detailed approach to ensure the correct sequence when loading the chat store: validate that every message with the role `tool` is preceded by a message carrying `tool_calls`. Here is an example of how you might implement this check and adjustment:
```python
from typing import List
from llama_index.legacy.llms import ChatMessage
from llama_index.legacy.storage.chat_store.simple_chat_store import SimpleChatStore


def validate_and_correct_messages(messages: List[ChatMessage]) -> List[ChatMessage]:
    corrected_messages = []
    for i, message in enumerate(messages):
        if message.role == 'tool':
            if i == 0 or messages[i - 1].role != 'tool_calls':
                raise ValueError(
                    "Message with role 'tool' must be preceded by a message with role 'tool_calls'"
                )
        corrected_messages.append(message)
    return corrected_messages


# Example usage with SimpleChatStore
chat_store = SimpleChatStore.from_persist_path("chat_store.json")
key = "your_specific_key"
messages = chat_store.get_messages(key)

try:
    validated_messages = validate_and_correct_messages(messages)
    chat_store.set_messages(key, validated_messages)
except ValueError as e:
    print(f"Error in message sequence: {e}")
```
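Note that llama_index's `ChatMessage` roles don't include a literal `tool_calls` role; in the OpenAI integration the tool calls typically live in the preceding assistant message's `additional_kwargs`. Under that assumption, a hedged variant of the check above that repairs the loaded history instead of rejecting it could look like this (my sketch, not the library's own behavior):

```python
from typing import List

from llama_index.core.llms import ChatMessage, MessageRole


def drop_orphan_tool_messages(messages: List[ChatMessage]) -> List[ChatMessage]:
    """Drop 'tool' messages that don't follow an assistant message carrying tool_calls."""
    cleaned: List[ChatMessage] = []
    for message in messages:
        if message.role == MessageRole.TOOL:
            prev = cleaned[-1] if cleaned else None
            preceded_ok = prev is not None and (
                # another kept tool result (one assistant turn can issue several tool_calls)
                prev.role == MessageRole.TOOL
                # or the assistant message that actually requested the call
                or (prev.role == MessageRole.ASSISTANT and prev.additional_kwargs.get("tool_calls"))
            )
            if not preceded_ok:
                continue  # orphaned tool result; skip it so the API accepts the history
        cleaned.append(message)
    return cleaned
```

You could run this on `chat_store.get_messages(key)` right after loading, before building the memory buffer.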
Additionally, ensure that when you are using OpenAI-based function calling, the `tool_calls` are correctly handled. The following snippet from the `llama_index` library shows how `tool_calls` are parsed and validated:
```python
def _parse_tool_calls(
    tool_calls: List[OpenAIToolCall],
    output_cls: Type[Model],
    allow_multiple: bool = False,
    verbose: bool = False,
) -> Union[Model, List[Model]]:
    outputs = []
    for tool_call in tool_calls:
        function_call = tool_call.function
        assert function_call is not None
        assert function_call.name is not None
        assert function_call.arguments is not None
        if verbose:
            name = function_call.name
            arguments_str = function_call.arguments
            print(f"Function call: {name} with args: {arguments_str}")

        if isinstance(function_call.arguments, dict):
            output = output_cls.parse_obj(function_call.arguments)
        else:
            output = output_cls.parse_raw(function_call.arguments)
        outputs.append(output)

    if allow_multiple:
        return outputs
    else:
        if len(outputs) > 1:
            _logger.warning(
                "Multiple outputs found, returning first one. "
                "If you want to return all outputs, set output_multiple=True."
            )
        return outputs[0]
```
By ensuring the correct sequence of messages and proper handling of `tool_calls`, you can resolve the `openai.BadRequestError` with error code 400 [1][2][3].
To continue the conversation, mention @dosu.
hmm, if you run `memory.get()` on the loaded memory, what do you see?
I could have sworn I made a fix recently so that the memory buffer would include tool/assistant message pairs
@logan-markewich if I do `chat_store.json()` or inspect `agent.chat_history`, the complete message history is there, including tool calls. I even checked the stored JSON manually, and it is correct. However, I am facing this error when the chat becomes longer. As I mentioned, this error only occurs when I am loading the chat history on each request, not when running it in a while loop.
I am receiving this error as well @logan-markewich. I am using the Redis chat store and the OpenAI agent. It looks like someone on Reddit is getting it too.
This happens when I have more than 5 messages and several tool calls.
This seems to relate to:
Any ideas or update here?
Edit: One idea I am having, actually: could this relate to the `token_limit` in the chat memory buffer?
Is there any way to check if this is the issue?
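One hedged way to check, assuming the standard `ChatMemoryBuffer` behavior where `get()` applies the token limit while `get_all()` returns everything stored under the key:

```python
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.chat_store import SimpleChatStore

chat_store = SimpleChatStore.from_persist_path("chat_store.json")
memory = ChatMemoryBuffer.from_defaults(chat_store=chat_store, chat_store_key="temp_chat")

full_history = memory.get_all()  # everything persisted for this key
windowed = memory.get()          # what would actually be sent after the token limit is applied

print(f"stored: {len(full_history)} messages, window: {len(windowed)} messages")
if windowed and windowed[0].role.value == "tool":
    print("The window starts on a 'tool' message, so its 'tool_calls' partner was cut off "
          "-- that would explain the 400 error.")
```

If the two counts differ and the window starts on a `tool` message, raising `token_limit` (or trimming the history) should change the behavior, which would confirm the hypothesis.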
@bllchmbrs in my view this is not related to token length or the limit; when I tested without reloading the chat store on each call, it worked fine, check this.
And it's me on Reddit also.
I see, so your experience is:
is that right?
@bllchmbrs no, in both versions the code uses the chat store; the only difference is whether it is reloaded on each request. You can also experiment with this.
I see, it's the reloading of the chat memory based on the key.
We have the same issue. I have to do a recovery now (clear the history, summarize it) to work around this.
@bllchmbrs @anantgupta129 do both of you have the latest version of the chat store? Like I mentioned above, I'm like 86% sure this is fixed
`pip show llama-index-storage-chat-store-redis` will show the current version.
You can update with `pip install -U llama-index-storage-chat-store-redis`.
Probably should also update core: `pip install -U llama-index-core`.
I am on the latest version of both core (`0.10.40`) and chat-store-redis (`0.1.3`).
I think it's something else.
Hi team, I am also getting this error while using multiple tools under the same agent. It appears when I share the same memory buffer with multiple tools
still no fix @logan-markewich
I tried to get a repro in a Jupyter notebook (even with multiple Jupyter notebooks pointing at the same store). I couldn't get it.
I am using FastAPI and `run_in_threadpool` for requests. Could this have something to do with multi-threading?
@bllchmbrs I doubt this is related to FastAPI because I tested without it as well. Could you share the code?
Bug Description
Getting an
`openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'`
error when using the JSON chat store with persistent paths, but when I checked the stored JSON, `tool_calls` is saved before the role `tool` message, and the chats are also saved by `chat_store`. I am receiving this error in long chats, but only when I load the chat store again for a specific key; when I tested it separately in a while loop, it works fine without the error.
Version
0.10.38
Steps to Reproduce
here is the API code
I need to load the chat store on every request, as I want to save separate chats for multiple users.
Relevant Logs/Tracebacks