langchain-ai / langchain


Support for adding images to the chat history (Claude 3 Sonnet, Bedrock) #20623

Open eyurtsev opened 2 months ago

eyurtsev commented 2 months ago

Discussed in https://github.com/langchain-ai/langchain/discussions/20599

Originally posted by **Martyniqo** April 18, 2024

### Checked

- [X] I searched existing ideas and did not find a similar one
- [X] I added a very descriptive title
- [X] I've clearly described the feature request and motivation for it

### Feature request

I'm using Claude 3 Sonnet on Amazon Bedrock and storing chat history in DynamoDB. However, LangChain does not support **storing images in the chat history**, and there is no way to add them as simply as text: https://python.langchain.com/docs/use_cases/question_answering/chat_history/

The following code completely ignores the uploaded image in the chat history and saves only the text of the user's question and the model's answer:

```python
from pathlib import Path

from langchain_community.chat_message_histories import DynamoDBChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

# This snippet runs inside a request handler method. `self.request`, `logger`,
# `bedrock_chat`, `DYNAMODB_TABLE_NAME`, `load_image_from_s3_and_encode` and
# `get_mime_type` are defined elsewhere in the application.

# Build the human message: one image block per attachment, then the question.
human_message = []
for attachment_uri in self.request.attachments:
    s3_bucket_name, s3_key = attachment_uri.replace("s3://", "").split("/", 1)
    encoded_image = load_image_from_s3_and_encode(s3_bucket_name, s3_key)
    file_extension = Path(s3_key).suffix
    mime_type = get_mime_type(file_extension)
    if encoded_image:
        logger.debug("Image detected")
        image_message = {
            "type": "image_url",
            "image_url": {
                "url": f"data:{mime_type};base64,{encoded_image}",
            },
        }
        logger.debug(image_message)
        human_message.append(image_message)

system_message = """You are a chat assistant, friendly and polite to the user. You use history to get additional context. History might be empty in case of a new conversation.
"""
human_message.append({"type": "text", "text": "The user question is {question}."})

template = [
    ("system", system_message),
    MessagesPlaceholder(variable_name="history"),
    ("human", human_message),
]
prompt = ChatPromptTemplate.from_messages(template)
chain = prompt | bedrock_chat | StrOutputParser()

# Only the value under `input_messages_key` (the question text) is written to
# the history, so the image blocks above never reach DynamoDB.
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: DynamoDBChatMessageHistory(
        table_name=DYNAMODB_TABLE_NAME, session_id=session_id
    ),
    input_messages_key="question",
    history_messages_key="history",
)

config = {"configurable": {"session_id": self.request.session_id}}
response = chain_with_history.invoke({"question": "What's on the previous image?"}, config=config)
```

It will probably be necessary to store the images somewhere else and keep only references to them in DynamoDB. Has anyone had a similar problem before and found an "easy" solution for it?

### Motivation

The image is not saved in the chat history, so the model does not know which image I am asking about.

### Proposal (If applicable)

_No response_

tomaszdudek7 commented 2 months ago

Yeah, would be great to have it.

rmitula commented 2 months ago

Rolling out the red carpet for this pull request to get merged :D

wuodar commented 2 months ago

It would be great to have it supported in langchain!