langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.04k stars 14.65k forks

Base64 images get truncated using AgentExecutor with create_openai_tools_agent #21967

Open Victorvexcel opened 3 months ago

Victorvexcel commented 3 months ago

Example Code

import base64
from io import BytesIO
import requests

from PIL import Image
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder

@tool
def download_image(url: str,
                   local_save_path: str = "/home/victor/Desktop/image_llm.jpeg") -> str:
    """Download an image from the given url and return it as a base64 data URI."""
    try:
        # Send an HTTP request to the URL
        response = requests.get(url, stream=True)
        # Check if the request was successful
        response.raise_for_status()

        img_content = response.content
        image_stream = BytesIO(img_content)
        pil_image = Image.open(image_stream)
        pil_image.save(local_save_path)
        buffered = BytesIO()
        pil_image.save(buffered, format='JPEG', quality=85)
        base64_image = base64.b64encode(buffered.getvalue()).decode()
        src = f"data:image/jpeg;base64,{base64_image}"
        print(len(src))
        return src

    except requests.HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    except Exception as err:
        print(f"An error occurred: {err}")

tools = [download_image]

llm = ChatOpenAI(temperature=0,
                 model='gpt-4-turbo',
                 api_key='YOUR_API_KEY')

template_messages = [SystemMessage(content="You are a helpful assistant"),
                     MessagesPlaceholder(variable_name='chat_history', optional=True),
                     HumanMessagePromptTemplate.from_template("{user_input}"),
                     MessagesPlaceholder(variable_name='agent_scratchpad')]
prompt = ChatPromptTemplate.from_messages(template_messages)

agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, 
                               tools=tools,
                               verbose=True,
                               max_iterations=3)
def ask_agent(user_input,
              chat_history,
              agent_executor):

    agent_response = agent_executor.invoke({"user_input": user_input, "chat_history": chat_history})
    print(len(agent_response["output"]))
    return agent_response["output"]

if __name__ == '__main__':

    user_input = "Please show the following image: https://upload.wikimedia.org/wikipedia/commons/1/1e/Demonstrations_in_Victoria6.jpg"
    chat_history = []
    agent_response = ask_agent(user_input,
                               chat_history,
                               agent_executor)

Error Message and Stack Trace (if applicable)

No response

Description

I am building a LangChain chat agent powered by OpenAI models. The agent runs in the backend of a web app whose frontend lets the user interact with it.

The goal of this agent is to call tools when the user's message requires it. Some of the tools download images and send them to the frontend so the user can view them. The images are base64-encoded so that they are displayed correctly to the user.
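The encoding step described above can be sketched with the standard library alone (the fake JPEG bytes below are just an illustration, not real image data):

```python
import base64

# Minimal illustration of the base64 step: raw image bytes -> base64 text
# -> a data URI that a frontend can place directly into an <img src=...>.
jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 16   # fake JPEG header + padding
b64 = base64.b64encode(jpeg_bytes).decode()
data_uri = f"data:image/jpeg;base64,{b64}"
print(data_uri[:30])
```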

The problem I am facing is that the base64 image gets truncated when the agent finishes the chain and returns the answer. For example, the base64 string produced by download_image has a length of 54443, while the answer returned by the agent has a length of 5762, so the image is being truncated by the agent. I am not completely sure why this happens, but it may be related to the maximum number of tokens the agent can handle.
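A back-of-the-envelope check supports the token hypothesis (this is my own rough estimate, not a measurement: I am assuming base64 text encodes at very roughly 2 to 4 characters per model token):

```python
# Rough estimate only: if base64 text costs ~2-4 characters per token,
# a 54443-character data URI is on the order of 13k-27k tokens -- far more
# than the default completion limit of most chat models, which would
# explain why the regenerated answer comes back much shorter.
data_uri_len = 54443          # length reported by download_image
low, high = 2, 4              # assumed chars-per-token bounds
print(f"~{data_uri_len // high} to {data_uri_len // low} tokens")
```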

Some alternatives that I have tried, but that failed to make this work:

I guess I could get into lower-level details and try to override some default configuration of the agent, but first I wanted to ask for help with this problem.

System Info

langchain==0.1.20
langchain-community==0.0.38
langchain-core==0.1.52
langchain-openai==0.1.6
langchain-text-splitters==0.0.1

platform: Ubuntu 20.04.6 LTS (focal)

Python 3.8.10

christian-bromann commented 3 months ago

I am having the same issue using the following code:

const tools = [new TavilySearchResults({ maxResults: 1 })]

const prompt = await pull<ChatPromptTemplate>(
  'hwchase17/openai-tools-agent',
)

const llm = new ChatOpenAI({
  model: 'gpt-4o',
  temperature: 0,
  apiKey: process.env.OPENAPI_KEY,
})

const agent = createToolCallingAgent({
  llm,
  tools,
  // streamRunnable: true,
  prompt,
})

const agentExecutor = new AgentExecutor({
  agent,
  tools,
  // returnIntermediateSteps: true,
  // verbose: true,
})

const input = `...`

const result = await agentExecutor.stream(
  { input },
  { maxConcurrency: 100 }
)

The input length in my example is 43046 characters, and the output gets truncated after around 20500 characters (sometimes more, sometimes less, e.g. 20389, 21373 or 20443). Any feedback would be appreciated.