kyle-redyeti opened this issue 4 months ago
Here is a test script I tried:
```python
import base64

import httpx
import rclpy
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from llama_ros.langchain import LlamaROS

rclpy.init()

# create the llama_ros llm for langchain
llm = LlamaROS()

# create a prompt template
image_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Describe the image?"),
        (
            "user",
            [
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/jpeg;base64,{image_data}"},
                }
            ],
        ),
    ]
)

# create a chain with the llm and the prompt template
image_question_chain = (
    {"image_data": RunnablePassthrough()}
    | image_prompt
    | llm
    | StrOutputParser()
)

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")

for c in image_question_chain.stream(image_data):
    print(c, flush=True, end="")

# First I tried this and it seemed to never respond, so I tried stream (above) too...
# answer = image_question_chain.invoke(image_data)
# print(f"ANSWER: {answer}")

rclpy.shutdown()
```
Hey @kyle-redyeti, first of all, how are you running llava-mistral? You may need to set the namespace to llava: `llm = LlamaROS(namespace="llava")`.
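For example (a minimal sketch; this assumes the llava launch file brings the llama_ros node up under the `llava` namespace):

```python
# Sketch: point the LangChain wrapper at the llava node's namespace.
# Assumption: the llava launch file runs llama_ros under the "llava" namespace.
import rclpy
from llama_ros.langchain import LlamaROS

rclpy.init()
llm = LlamaROS(namespace="llava")  # actions resolve under /llava instead of the default
```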
I've been reading the LangChain code, and I think the VLM support there goes through ChatPromptTemplate, so a new chat model should be implemented for llama_ros, similar to ChatOllama, that can handle images; alternatively, LlamaROS could be modified to add the image to the action goal.
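Roughly along these lines (just a sketch, not real llama_ros code: `send_llama_goal` is a hypothetical stand-in for the GenerateResponse action client, and the real goal fields may differ):

```python
from typing import Any, List, Optional

from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


def send_llama_goal(prompt: str, image_b64: Optional[str]) -> str:
    """Hypothetical helper wrapping the llama_ros GenerateResponse action client."""
    raise NotImplementedError


class ChatLlamaROSSketch(BaseChatModel):
    """Sketch of a chat model that forwards base64 images to the action goal."""

    @property
    def _llm_type(self) -> str:
        return "chat-llama-ros-sketch"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Any = None,
        **kwargs: Any,
    ) -> ChatResult:
        prompt_parts: List[str] = []
        image_b64: Optional[str] = None
        for msg in messages:
            if isinstance(msg.content, str):
                prompt_parts.append(msg.content)
            else:  # multimodal content: a list of {"type": ...} parts
                for part in msg.content:
                    if part.get("type") == "text":
                        prompt_parts.append(part["text"])
                    elif part.get("type") == "image_url":
                        url = part["image_url"]["url"]
                        # strip a data-URL prefix if present, keep the raw base64
                        image_b64 = url.split("base64,", 1)[-1]
        text = send_llama_goal("\n".join(prompt_parts), image_b64)
        return ChatResult(generations=[ChatGeneration(message=AIMessage(content=text))])
```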
I wanted to check first if I was missing something that already existed. Next, I think I will try running just the llava llama_ros node and not using LangChain for image queries, while still using it for RAG/text. I think that will get me by for a while until I can dig in deeper on a bigger change.
If I am using LlamaROS(namespace="llava"), am I right in thinking I could still use a GenerateResponse with an image like the llava_demo_node does?
Thanks for your help!
Kyle
On Sun, Jul 21, 2024, 12:26 PM Miguel Ángel González Santamarta wrote:

> Here https://github.com/mgonzs13/llama_ros?tab=readme-ov-file#llava_ros you have an example with the LangChain wrapper modified.
> If I am using LlamaROS(namespace="llava"), am I right in thinking I could still use a GenerateResponse with an image like the llava_demo_node does?
Yes, the wrapper will add the image to the GenerateResponse goal, as in the demo.
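If you want to call the action yourself, it would look roughly like this (a sketch modeled on llava_demo_node; the action name and the goal fields `prompt` and `image` are assumptions, so check llama_msgs/action/GenerateResponse.action for the actual definition):

```python
import cv2
import rclpy
from cv_bridge import CvBridge
from rclpy.action import ActionClient
from rclpy.node import Node

from llama_msgs.action import GenerateResponse


class LlavaClient(Node):
    def __init__(self) -> None:
        super().__init__("llava_client")
        # assumption: the action is exposed under the llava namespace
        self._client = ActionClient(self, GenerateResponse, "/llava/generate_response")
        self._bridge = CvBridge()

    def describe(self, image_path: str, prompt: str) -> None:
        goal = GenerateResponse.Goal()
        goal.prompt = prompt  # assumed goal field
        goal.image = self._bridge.cv2_to_imgmsg(cv2.imread(image_path), encoding="bgr8")  # assumed goal field
        self._client.wait_for_server()
        self._client.send_goal_async(goal)  # result handling omitted in this sketch


def main() -> None:
    rclpy.init()
    node = LlavaClient()
    node.describe("boardwalk.jpg", "Describe the image.")
    rclpy.spin(node)


if __name__ == "__main__":
    main()
```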
@mgonzs13 Thank you again for the help! I was able to get the example script to run and it does produce a response as I expected. I will have to do some work to get it into my pipeline with your Chatbot_ROS and Explainable_ROS.
Hello, I have been trying to use the chatbot_ros project with the explicability_ros project and decided I wanted to get it to describe images. I chased my tail trying to get a proper prompt generated for this, but I think the issue comes down to the following: when the call goes back to llama_ros through LangChain, even though I am using the llava-mistral-7b model, something in the LangChain wrapper is not providing the correct input to the model. Maybe it is as simple as making sure that, when I pass an image, it is properly added to GenerateResponse.action?
Any help you can provide would be greatly appreciated!
Kyle