RakeshReddyKondeti opened this issue 2 months ago
To determine the value of the is_chat_model parameter in LLMMetadata for different models and frameworks, you need to check whether the model exposes a chat interface. Specifically, is_chat_model should be set to True if the model can be passed a sequence of messages rather than just text.
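For reference, here is a minimal sketch of how that flag is carried on the metadata object (field names follow llama_index.core.llms.LLMMetadata; the concrete values are only illustrative):

from llama_index.core.llms import LLMMetadata

# is_chat_model is just a boolean field on the LLM's metadata.
# The values below are illustrative, not library defaults.
metadata = LLMMetadata(
    context_window=8192,
    num_output=256,
    is_chat_model=True,  # True => the LLM can be passed a sequence of chat messages
    model_name="llama-3.1-8b-instruct",
)
print(metadata.is_chat_model)  # True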
For the Llama 3.1 gguf model:

Using llama-cpp-python: This framework supports a chat interface for the Llama 3.1 gguf model. The documentation and examples explicitly show how to set up and use the chat interface, confirming that is_chat_model should be set to True [1][2].
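As an illustration, the chat-style entry point in llama-cpp-python looks roughly like this (a sketch only; the model path is a placeholder, and create_chat_completion is that library's chat API):

from llama_cpp import Llama

# Load a gguf model and call the chat-style API with a list of role/content dicts.
llm = Llama(model_path="./models/llama-3.1-8b-instruct.Q4_K_M.gguf", n_ctx=8192)
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response["choices"][0]["message"]["content"])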
Using the Ollama framework: The Ollama framework also supports a chat interface. The metadata property in the OllamaMultiModal class includes the attribute is_chat_model=True, and there are methods like chat, stream_chat, and astream_chat that handle chat interactions. Therefore, is_chat_model should be set to True [3][4][5][6].
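For example, a minimal sketch of chatting through the llama-index Ollama integration (using the text Ollama LLM class rather than OllamaMultiModal; the model name is a placeholder):

from llama_index.core.llms import ChatMessage
from llama_index.llms.ollama import Ollama

# The Ollama integration sends ChatMessage objects straight to Ollama's chat endpoint.
llm = Ollama(model="llama3.1", request_timeout=120.0)
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="Hello!"),
]
response = llm.chat(messages)
print(response.message.content)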
In summary, for both llama-cpp-python and the Ollama framework, the is_chat_model parameter should be set to True for the Llama 3.1 gguf model.
To continue talking to Dosu, mention @dosu.
is_chat_model won't do anything for llama.cpp.

LlamaCPP (at least the way it's implemented in llama-index) will apply the provided templating function to translate chat messages into a model-specific format.

Ollama provides a chat endpoint that works directly with chat messages, so there is no need to provide templating functions, and hence is_chat_model=True.
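For context, such a templating function for a Llama-3.1-style chat format might look roughly like the sketch below; the special tokens are my assumption and should be checked against the model card before relying on them:

from typing import Sequence
from llama_index.core.llms import ChatMessage

def llama3_messages_to_prompt(messages: Sequence[ChatMessage]) -> str:
    # Sketch of a Llama-3.1-style template: one header block per message,
    # terminated with <|eot_id|>, then an open assistant turn for generation.
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += f"<|start_header_id|>{message.role.value}<|end_header_id|>\n\n"
        prompt += f"{message.content}<|eot_id|>"
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt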
Thanks @logan-markewich,

I am working with LlamaCPP. When I set is_chat_model = True, your recent merge #16017 does not work. However, when I set is_chat_model = False, it works as expected. I am unsure what exactly changes in this context. Could you provide more details on why this behavior occurs and how is_chat_model influences the functionality in LlamaCPP?

Thank you!
If is_chat_model is set for llama-cpp, it applies the provided messages_to_prompt function.

Otherwise, it assumes the inputs are already formatted, and does not touch them.

If you set is_chat_model=True, provide a messages_to_prompt function to transform message dicts to the model format: https://github.com/run-llama/llama_index/blob/0491ab7cd376aaeb45d725c412ff45d7f74c50b9/llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py#L256
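Concretely, wiring such a function in might look like this sketch (the model path is a placeholder, and llama3_messages_to_prompt is the hypothetical templating function sketched above, not part of llama-index):

from llama_index.llms.llama_cpp import LlamaCPP

# Pass the templating function at construction time; when the LLM is used in
# chat mode it is applied to translate chat messages into the model's format.
llm = LlamaCPP(
    model_path="/path/to/llama-3.1-8b-instruct.Q4_K_M.gguf",
    context_window=8192,
    max_new_tokens=512,
    model_kwargs={"n_gpu_layers": -1},
    messages_to_prompt=llama3_messages_to_prompt,
)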
Thanks for the explanation @logan-markewich,

I am using ContextPlusCondenseChatEngine from #16017 and LlamaCPP for initializing the LLM.

I noticed that in the current implementation of the _extend_messages method, the system prompt is appended to the chat message list even if one already exists. I am not sure whether this behavior is intended or a glitch, but it can lead to redundant system prompts in the message list, causing issues in subsequent processing, especially in the messages_to_prompt function.

Suggested improvement: the method should check whether a system prompt already exists in the message list. If it does, it should update the existing system prompt instead of appending a new one, something like the following:
def _extend_messages(self, messages: List[ChatMessage]) -> List[ChatMessage]:
    """Add system prompt to chat message list."""
    system_message_found = False
    for message in messages:
        if isinstance(message, ChatMessage) and message.role == MessageRole.SYSTEM:
            message.content += self.system_prompt
            system_message_found = True
            break
    if not system_message_found and self.system_prompt:
        messages = [
            ChatMessage(role=MessageRole.SYSTEM, content=self.system_prompt),
            *messages,
        ]
    return messages
This modified method ensures that the system prompt is only added once, preventing redundancy and potential issues in the subsequent message processing.
Question
Hi,

I need clarification on the is_chat_model parameter in LLMMetadata. Specifically, I would like to understand whether this parameter depends on the LLM model itself or on the framework used to load the model.

For instance, I am using the Llama 3.1 gguf model. How can I determine whether the is_chat_model parameter should be set to True or False? I have observed that when using the llama-cpp-python framework, is_chat_model is False (the default value), but when using the Ollama framework, it is True.

Could you please provide guidance on how to accurately determine the value of is_chat_model for different models and frameworks?

Thank you!
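For anyone who wants to reproduce the observation above, a minimal sketch that inspects what each integration reports (paths and model names are placeholders):

from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.ollama import Ollama

# Compare the metadata each integration exposes for the same Llama 3.1 model.
llamacpp_llm = LlamaCPP(model_path="/path/to/llama-3.1-8b-instruct.Q4_K_M.gguf")
ollama_llm = Ollama(model="llama3.1")

print(llamacpp_llm.metadata.is_chat_model)  # observed to be False (the default)
print(ollama_llm.metadata.is_chat_model)    # observed to be True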