Closed: bottlehounds-dev closed 8 months ago
🤖
It's great to see that you've identified the issues and even proposed some workarounds for the `gemini` LLM problem. Your insights and proposed solutions are valuable. If you're able to implement these workarounds and test them, it would be fantastic if you could create a pull request with your changes. Your contributions would be greatly appreciated!
> probably a decent solution is …
@logan-markewich what you just said is essentially what I ended up doing in my subclass of `ReActChatFormatter`. I just feel like we can do better.
So I'm going to try a few things out and see if I can't come up with something that I can whip up a PR for! The goal would be to have more stable `gemini` support in `llama-index`'s agentic components.
Hi @bottlehounds-dev. I've been using query engines, chat engines, and ReAct agents with Gemini models, both Pro and Ultra, and they have been working fine. There was a time they weren't working because of problems with the system message to user message mapping, but I raised a similar issue and it was fixed.
What modules are not working on your end?
@gich2009 thanks for chiming in! This is my work GitHub account, apologies for the confusion.
I am using the `Vertex` LLM abstraction to access `gemini-pro`. This is because I am using the Vertex API to authenticate. Are you using the `Gemini` abstraction? If so, then that is something we will try!
Hi @bottlehounds-dev, no worries.
That explains it. I usually use the `Gemini` abstraction directly; I've never tried running it behind the `Vertex` abstraction. Seems to be a Vertex-Gemini issue. I'll experiment later. It was probably an oversight when the `Gemini` abstraction was fixed.
The `Gemini` one works pretty well. Try it out and tell me if you encounter challenges, because I'd be curious to see what they are.
Bug Description
Hey all!
I'm sure that you're aware of this, but using a `gemini` LLM in a query engine, chat engine, or agent isn't possible out of the box.

Issues
There are two main issues (that I am aware of):
1. `gemini` does not support messages with the `SYSTEM` role, but these are always used by default by `llama-index` in:
   - The `TEXT_QA_SYSTEM_PROMPT` for response synthesizers
   - When using a `ReActAgent`, the default chat formatter prepends the history with the agent instructions as a `SYSTEM` message in `ReActChatFormatter.format`
2. An error is thrown in `vertex_utils._parse_chat_history` when the chat history is not an even number of messages; this is guaranteed to happen when:
   - Using the default QA prompts for response synthesizers, as they are composed of two messages
   - The aforementioned `ReActChatFormatter.format` method will prepend the history with another message containing agent instructions
Workarounds
- For the response synthesizer issues, we can simply override `text_qa_template` to only contain a single `USER` message, taking care of both the `SYSTEM` message and the odd number of messages in the chat history.
- For the agent, we can only solve the `SYSTEM` message issue (I did this by subclassing `ReActChatFormatter` and updating the message role from `SYSTEM` to `USER`); both workarounds are sketched below.
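Roughly, the two workarounds look like this (a sketch only: import paths assume the `llama_index.core` layout and vary by version, and `index` stands in for an existing index):

```python
from llama_index.core import PromptTemplate
from llama_index.core.agent.react.formatter import ReActChatFormatter
from llama_index.core.llms import ChatMessage, MessageRole

# Workaround 1: a single-message QA prompt. A string template is sent as one
# USER message, avoiding both the SYSTEM role and the odd-length history.
single_message_qa = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\nQuery: {query_str}\nAnswer: "
)
# query_engine = index.as_query_engine(text_qa_template=single_message_qa)

# Workaround 2: remap the leading SYSTEM message to USER before it reaches gemini.
class GeminiReActChatFormatter(ReActChatFormatter):
    def format(self, tools, chat_history, current_reasoning=None):
        messages = super().format(tools, chat_history, current_reasoning)
        return [
            ChatMessage(role=MessageRole.USER, content=m.content)
            if m.role == MessageRole.SYSTEM
            else m
            for m in messages
        ]
```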
Blocker
Finally, we have come to the issue that I do not have a workaround for, outside of reimplementing the `Vertex` LLM and/or how it is used with agents. I do not know how to avoid the uneven number of chat history messages when using `gemini` with a `ReAct` agent. The agent will always need to give its instructions/context. Should I simply attempt to combine this with the user message?
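If combining is the way to go, I imagine it would look something like this hypothetical helper that folds the instructions into the first user message (`merge_leading_system` is not part of llama-index, just an illustration):

```python
from llama_index.core.llms import ChatMessage, MessageRole

def merge_leading_system(messages):
    """Fold a leading SYSTEM message into the first USER message so the
    chat history length stays even. Hypothetical helper, for illustration."""
    if (
        len(messages) >= 2
        and messages[0].role == MessageRole.SYSTEM
        and messages[1].role == MessageRole.USER
    ):
        merged = ChatMessage(
            role=MessageRole.USER,
            content=f"{messages[0].content}\n\n{messages[1].content}",
        )
        return [merged, *messages[2:]]
    return messages
```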
Version
latest
Steps to Reproduce
Response Synthesizer
- Create a query engine with a response synthesizer that uses the `gemini` LLM and run a query on it

Agent
- Create a ReAct agent with the `gemini` LLM and run a chat on it
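A minimal reproduction might look like this (sketch only: assumes Vertex credentials and a default embedding model are already configured, and the post-0.10 package layout):

```python
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.llms.vertex import Vertex

llm = Vertex(model="gemini-pro")

# Response synthesizer path: the default two-message QA prompt trips
# vertex_utils._parse_chat_history.
index = VectorStoreIndex.from_documents([Document(text="hello world")])
index.as_query_engine(llm=llm).query("What does the document say?")

# Agent path: the SYSTEM instructions prepended by ReActChatFormatter.format
# hit the same parsing problem.
ReActAgent.from_tools([], llm=llm).chat("hi")
```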
Relevant Logs/Tracebacks
No response
Hello, I seem to have the same problem as you, but I don't have a suitable solution yet. Could I take a look at your code for this section? Much appreciated!
@shuozhu1 I think this is fixed tbh
`pip install -U llama-index-llms-vertex` should have it working
> @shuozhu1 I think this is fixed tbh
> `pip install -U llama-index-llms-vertex` should have it working
I will try, thanks for your help! (But my LLM is llama-13b-chinese actually.)
@shuozhu1 @logan-markewich sorry friends, I meant to close this one out! <3