langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com

openai.APIError: The model produced invalid content. #23407

Open chrisyang-menos opened 3 months ago

chrisyang-menos commented 3 months ago

Checked other resources

Example Code

import os

import requests
from langchain_core.tools import tool


@tool
def request_bing(query: str) -> str:
    """
    Searches the internet for additional information.
    Specifically useful when you need to answer questions about current events or the current state of the world.
    Prefer Content related to finance.
    """
    url = "https://api.bing.microsoft.com/v7.0/search"
    headers = {"Ocp-Apim-Subscription-Key": os.getenv("AZURE_KEY")}
    params = {"q": query}
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    data = response.json()
    snippets_list = [result['snippet'] for result in data['webPages']['value']]
    snippets = "\n".join(snippets_list)
    return snippets
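
For context, a hypothetical sketch of how a tool like this is typically wired into a ReAct agent (the agent-construction code is not included in the report, so the prompt handle and model below are assumptions):

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
prompt = hub.pull("hwchase17/react")  # standard ReAct prompt from the hub (assumed)
agent = create_react_agent(llm, [request_bing], prompt)
agent_executor = AgentExecutor(agent=agent, tools=[request_bing], verbose=True)
# agent_executor.invoke({"input": "What moved the S&P 500 today?"})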

Error Message and Stack Trace (if applicable)

openai.APIError: The model produced invalid content.

Description

I'm using a LangChain ReAct agent + tools, and starting from Jun 23rd I've been receiving lots of these exceptions:

openai.APIError: The model produced invalid content.

I suspect OpenAI changed something on their side regarding function calling. Could you please shed some light on this?

I'm using gpt-4o as the LLM, with the Bing search tool defined as above.

System Info

langchain==0.2.4
langchain-community==0.2.4
langchain-core==0.2.6
langchain-openai==0.1.8
langchain-text-splitters==0.2.1

ccurme commented 3 months ago

Thanks @chrisyang-menos. Are you able to provide additional snippets to help debug? Ideally there is code a developer can run that immediately reproduces the error and does not require Azure credentials.

Would suggest the following:

chrisyang-menos commented 3 months ago

Thanks @ccurme.

After digging into this a bit more and leveraging LangSmith, I found the following cause:

With the @tool decorator, when I checked the prompt used to call OpenAI, the tool was rendered as request_bing(query: str) -> str instead of just request_bing, which caused the OpenAI exception.

LangSmith log


You have access to the following tools:

    request_bing(query: str) -> str - Searches the internet for additional information.
Specifically useful when you need to answer questions about current events or the current state of the world.
Prefer Content related to finance.
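
One quick way to see exactly what LangChain will render for a decorated tool is to inspect the tool object itself; an explicit name can also be passed to @tool. A minimal sketch (not tied to the original Azure setup):

from langchain_core.tools import tool

@tool("request_bing")  # explicit name, so nothing derived from the signature is used
def request_bing(query: str) -> str:
    """Searches the internet for additional information."""
    ...

print(request_bing.name)         # expected: "request_bing"
print(request_bing.description)  # what gets rendered into the prompt / tool schema
print(request_bing.args)         # the inferred argument schema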

Similarly, I have another tool created with create_retriever_tool from langchain.tools.retriever that had the same problem: the tool name was garbled.

For example, I was using this to create the tool:


from langchain.tools.retriever import create_retriever_tool


def get_vectordb_retriever_tool(db_choice='pinecone-serverless'):
    """
    CREATE A TOOL FOR AGENT USE
    USAGE: Internal document search
    """
    ## Get Retriever
    if db_choice == 'local':
        retriever = get_chromadb_retriever(constants.CHROMADB_PERSISTENT_PATH)
    elif db_choice == 'pinecone-serverless':
        retriever = get_pinecone_retriever()
    elif db_choice == 'mongodb':
        retriever = get_mongodb_retriever()
    else:
        raise ValueError(f"Unsupported db_choice: {db_choice}")

    ## Create Retrieval Tool
    vectorstore_search = create_retriever_tool(
        retriever=retriever,
        name="vectorstore_search",
        description="Retrieve relevant info from a vectorstore that contains information from internal documents. Always return the title and page information about the document.",
    )
    return vectorstore_search

and from LangSmith

   You have access to the following tools:

    vectorstore_search(query: 'str', *, retriever: 'BaseRetriever' = VectorStoreRetriever(tags=['PineconeVectorStore', 'OpenAIEmbeddings'], vectorstore=<langchain_pinecone.vectorstores.PineconeVectorStore object at 0x7f263af17070>, search_kwargs={'k': 5}), document_prompt: 'BasePromptTemplate' = PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator: 'str' = '\n\n', callbacks: 'Callbacks' = None) -> 'str' - Retrieve relevant info from a vectorstore that contains information from internal documents. Always return the title and page information about the document.

Wondering if you have any suggestions on what might be causing this. For the request_bing tool, I was able to work around it by redefining it as a BaseTool.
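
For reference, one possible workaround sketch for the retriever case as well (assuming the get_pinecone_retriever helper from the snippet above): wrap the retriever call in a plain decorated function so only the simple name and docstring appear in the prompt:

from langchain_core.tools import tool

retriever = get_pinecone_retriever()  # helper from the snippet above (assumed)

@tool
def vectorstore_search(query: str) -> str:
    """Retrieve relevant info from a vectorstore that contains information from internal documents. Always return the title and page information about the document."""
    docs = retriever.invoke(query)
    return "\n\n".join(doc.page_content for doc in docs)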

CGamesPlay commented 3 months ago

I also noticed this today, when using gpt-4o. I was using a stop sequence, and I found that removing the stop sequence fixed the error for me. This error does not happen with gpt-3.5-turbo. I suspect the problem is on openai's end.
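
If the agent is built with LangChain's create_react_agent, the stop sequence it binds by default ("\nObservation") can be disabled via the stop_sequence argument; a sketch of that experiment (llm, tools, and prompt come from your own setup, and turning the stop off may let the model hallucinate observations):

from langchain.agents import create_react_agent

# Default is stop_sequence=True, which binds stop=["\nObservation"] to the model.
agent = create_react_agent(llm, tools, prompt, stop_sequence=False)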

krishnakumar18 commented 2 months ago

I'm facing this issue too, and the other threads where it is being discussed do not solve it for me. Any help is much appreciated. Thanks!

brainframe-me commented 2 months ago

Same here, randomly: openai.APIError: The model produced invalid content. Consider modifying your prompt if you are seeing this error persistently.

chrisyang-menos commented 2 months ago

Btw, in my case I was able to resolve the issue above by revising the tool definitions. In my experience, the following approach works fairly consistently:

from typing import Type

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool


class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")


class SomeSearchTool(BaseTool):
    name = "..."
    description = "xxx"
    args_schema: Type[BaseModel] = SearchInput

    def _run(self, query: str) -> str:
        ...
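
A short usage sketch for completeness (the name and description above are placeholders):

search_tool = SomeSearchTool()
print(search_tool.name)  # the plain string set on the class, with no signature attached
tools = [search_tool]    # pass to the agent as usual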