Open Laktus opened 7 months ago
@Laktus by coincidence, I just added a vercel/ai compatible StreamingResponse in the last release, see https://github.com/run-llama/create-llama/blob/main/templates/types/streaming/fastapi/app/api/routers/vercel_response.py
@marcusschiesser That looks awesome, thanks! I think someone still needs to modify the template to work with it though. Or is it already integrated in the last release? The template is missing the ability to handle the incoming data from useChat's handleSubmit, as well as passing it back from the server to the frontend using StreamingTextResponse (or your vercel_response in FastAPI).
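For context, here is a minimal sketch of the server-side half of that: pulling the extra `data` out of the request body that useChat's handleSubmit can send. The body shape is an assumption based on the Vercel AI SDK docs, and `imageUrl` is a hypothetical field name, not something from the template:

```python
# Sketch only: extract the optional `data` object that useChat's handleSubmit
# can send alongside the chat messages. Assumed JSON body shape:
#   { "messages": [{ "role": "user", "content": "..." }],
#     "data": { "imageUrl": "..." } }
# `imageUrl` is a made-up field name for illustration.

def parse_chat_request(body: dict):
    """Return (messages, image_url) from the parsed JSON request body."""
    messages = body.get("messages", [])
    image_url = (body.get("data") or {}).get("imageUrl")
    return messages, image_url

# In a FastAPI route you would call this on `await request.json()`.
```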
Who would be responsible for integrating the latest FastAPI changes into the starter template?
@Laktus The template has been part of create-llama since npx create-llama@0.1.0, and it was just updated in npx create-llama@0.1.1. What are you missing?
@marcusschiesser I will try integrating my changes to the backend for handling the data parameter and will post a message if it works. Thanks for the update!
@marcusschiesser Hi Marcus, I don't see any inherent integration with images in the FastAPI backend. Do you know how I can add this? How do I pass image information into the ChatMessage object when I'm using an image-capable model like GPT-4 or GPT-4-Vision? (I already managed to pass the image from frontend to backend and back to the frontend for display.)
Thanks for any help.
@Laktus, the problem is that in Python, you have to use the MultiModalVectorStoreIndex to work with images. So I would start by replacing VectorStoreIndex with this class. Details about using it are here: https://docs.llamaindex.ai/en/stable/examples/evaluation/multi_modal/multi_modal_rag_evaluation/?h=multimodalvectorstoreindex
If you like, you're welcome to post a diff of your code here.
@marcusschiesser But doesn't this save the images into the vector DB? I don't want to populate the vector DB with the image information; I only want to attach the image to a single message.
In the Vision docs of OpenAI (https://platform.openai.com/docs/guides/vision) you can see the following usage of the chat completions API:
```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])
```
The astream_chat method only accepts the message as a raw str. If we could directly pass a ChatMessage object, it should be possible to add this additional information to the API call above, or not? Why is this not supported out of the box? I think the TS version also solves it this way.
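To make the shape concrete, here is a sketch of the kind of message object that would need to reach the OpenAI call. `make_image_message` is a hypothetical helper for illustration, not a LlamaIndex API:

```python
# Hypothetical helper (not part of LlamaIndex): build the OpenAI-style
# multi-part "content" list for a user message with one attached image.
def make_image_message(text: str, image_url: str) -> dict:
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = make_image_message("What is in this image?", "https://example.com/cat.png")
```

This dict could be passed directly as an element of `messages` in the chat completions call above.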
@Laktus yes, this is a current limitation of the Python version. We're working on aligning the multi-modal capabilities of the Python and TypeScript versions. Once that's done, we will add image upload support to the FastAPI backend.
@marcusschiesser Is there any update on when this will be implemented?
Not yet, we'll need multi-modal support in the Python framework first
Hi,
I wanted to implement custom evaluation logic. Realizing that only the Python implementation of LlamaIndex supports QuestionGenerator, I thought it would be more reasonable to use the FastAPI backend + Next.js frontend setup. I managed to pass the image data to the backend by extending the handleSubmit of useChat, as in https://github.com/vercel/ai/pull/725. However, I don't know how to duplicate the functionality of StreamData in the FastAPI backend.
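Here is roughly what I imagine the StreamData counterpart would look like on the FastAPI side. This is a sketch based on my reading of the AI SDK's stream-data wire format, where `0:` prefixes JSON-encoded text deltas and `2:` prefixes JSON data arrays; I have not verified it against the template:

```python
import json

def text_part(text: str) -> str:
    # Text delta that the useChat hook appends to the assistant message.
    return f"0:{json.dumps(text)}\n"

def data_part(payload: list) -> str:
    # Side-channel payload, surfaced on the client as StreamData.
    return f"2:{json.dumps(payload)}\n"

async def stream_with_data(token_gen, extra: list):
    """Yield text chunks, then trailing data. In FastAPI this would be
    wrapped as StreamingResponse(stream_with_data(...), media_type="text/plain")."""
    async for token in token_gen:
        yield text_part(token)
    yield data_part(extra)
```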
Can you make this example work out of the box, or provide some further documentation on how to implement this? Currently, multi-modality does not work without multiple changes.
Thanks for taking the time to read my request.