Open HarryZhou-618 opened 2 months ago
I had a similar problem. I wrote code like this:
import asyncio
import fastapi_poe as fp
from fastapi_poe.types import Attachment, ProtocolMessage

api_key = 'KEY'
prompt = """ Describe the attachment image in detail."""
attachment = Attachment(
    url="https://pfst.cf2.poecdn.net/base/image/xxxxxxxxxxxxxxxxxxxxxxx?w=1024&h=1024",
    content_type="image/png",
    name="image.png",
)
message = ProtocolMessage(role="user", content=prompt, attachments=[attachment])

async def get_bot_response(messages: list[ProtocolMessage], bot_name: str, api_key: str) -> None:
    # Collect the streamed partial responses, then print the full reply.
    chunks = []
    async for partial in fp.get_bot_response(messages=messages, bot_name=bot_name, api_key=api_key):
        chunks.append(partial.text)
    print(''.join(chunks))

asyncio.run(get_bot_response([message], 'Claude-3-Sonnet', api_key))
I expected the Claude model to read the attached image, but it obviously did not, and returned the following information: "Unfortunately, you have not actually attached or uploaded any images to our conversation yet. If you do upload an image, I will be happy to describe it in detail for you. Please let me know once you have attached an image."
Is it possible to invoke a multimodal model via the API? Thanks.
Yes, I got the same response when using the Claude model. While checking the latest documentation and API code, I found that Poe has added a new parsed_content field for attachments. I wonder if this is the way to do it: maybe the image can be passed through parsed_content. I'm trying it out, and you can try it too!
Did you solve this problem? I tried adding the parsed_content field, but it made no difference.
+1
Hi, I'm using the Poe API to call a multimodal model, like gpt-4v or claude3-opus. I am following the example in the diagram below, but I can't find any code showing how to load a local image into the request. How can I implement this? I noticed that the new documentation mentions "attachment.parsed_content". Should I use this? What is the format of parsed_content? Should I encode the image as base64 or read it as binary? Looking forward to your reply!
![Snipaste_2024-04-12_18-18-12](https://github.com/poe-platform/fastapi_poe/assets/59814103/46f75cfa-6b4e-4711-95d3-e0098539614a)
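For reference, the two options raised above (a plain binary read versus base64) can be sketched with the Python standard library alone. This only shows how to load a local image from disk. Nothing in this thread confirms that either form is what parsed_content expects, so treat that part as an open assumption. The helper name `load_image` is hypothetical and not part of fastapi_poe.

```python
import base64
import mimetypes
from pathlib import Path


def load_image(path: str) -> tuple[bytes, str, str]:
    """Hypothetical helper: return (raw bytes, base64 text, guessed MIME type)."""
    raw = Path(path).read_bytes()                    # plain binary read
    encoded = base64.b64encode(raw).decode("ascii")  # same bytes as base64 text
    mime, _ = mimetypes.guess_type(path)             # guessed from the file extension
    # Fall back to a generic type when the extension is unknown.
    return raw, encoded, mime or "application/octet-stream"
```

Calling `load_image("image.png")` gives both representations at once, so whichever one the API turns out to need is already available. Whether the result should go into the attachment at all is still the unanswered question here.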