Closed lfoppiano closed 10 months ago
🤖
Hello @lfoppiano,
Great to see you again! I hope you've been doing well.
Based on its current implementation, the LangChain framework does indeed handle the formatting of instructions using the parameters and template provided by Ollama. This is done through the PromptTemplate.from_template and ChatPromptTemplate.from_messages methods, which create prompt templates from string templates. These templates are then used to format the instructions.
For instance, the DEFAULT_REFINE_PROMPT is created using the DEFAULT_REFINE_PROMPT_TMPL string template and the PromptTemplate.from_template method. The placeholders in the string template (e.g., {question}, {existing_answer}, {context_str}) are replaced with actual values when the prompt is used. You can find this in the refine_prompts.py file.
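To make the placeholder substitution concrete, here is a minimal sketch that uses plain Python string formatting as a stand-in for a prompt template. The template text and the format_prompt helper are hypothetical and are not copied from refine_prompts.py; the sketch only shows the idea of filling {question}, {existing_answer}, and {context_str}.

```python
# Illustrative only: str.format standing in for a prompt template.
# REFINE_TMPL is a hypothetical template, not LangChain's DEFAULT_REFINE_PROMPT_TMPL.
REFINE_TMPL = (
    "The original question is: {question}\n"
    "The existing answer is: {existing_answer}\n"
    "New context:\n{context_str}\n"
    "Refine the existing answer if the new context helps."
)


def format_prompt(template: str, **values: str) -> str:
    """Replace each {placeholder} in the template with the matching value."""
    return template.format(**values)


filled = format_prompt(
    REFINE_TMPL,
    question="Where did Lincoln die?",
    existing_answer="In Washington, D.C.",
    context_str="Lincoln died at the Petersen House.",
)
print(filled)
```

Note that this step only fills in the application's own prompt text. It is separate from the model-specific chat template that Ollama stores alongside each model.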
As for the interaction with the Ollama API, it is handled by the ChatOllama class, which formats the list of base messages as text and calls the appropriate methods from the _OllamaCommon superclass with the formatted messages and other parameters. The responses from these methods are then converted into appropriate objects and returned or yielded. This is implemented in the ollama.py file.
Therefore, you do not need to manually format your instructions using the parameters and template provided by Ollama. The LangChain framework takes care of this for you.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosubot, what is the difference between ChatOllama and Ollama? In the documentation I've only seen the Ollama class mentioned.
🤖
The Ollama and ChatOllama classes in the LangChain framework are both designed to run large language models locally, but they are used in slightly different contexts.
The Ollama class is used for generating text based on provided prompts. It inherits from the BaseLLM and _OllamaCommon classes and provides a number of configurable parameters to control the behavior of the language model. The main functionality is provided by the _generate method, which calls out to Ollama's generate endpoint to generate text based on the provided prompts. The generated text is returned as an LLMResult object. Here is an example of how to use the Ollama class:
from langchain.llms import Ollama

ollama = Ollama(model="llama2")
# Use the public generate() method, which calls _generate internally and returns an LLMResult
response = ollama.generate(["Tell me a joke."])
On the other hand, the ChatOllama class is designed to generate responses from a given list of messages. It is a subclass of BaseChatModel and _OllamaCommon. Its main functionality is provided by the _generate method, which takes a list of BaseMessage objects, formats them into a prompt, and then sends this prompt to Ollama's generate endpoint. The endpoint returns a response, which is then converted into a ChatResult object containing the generated message and additional information about the generation. Here is an example of how to use the ChatOllama class:
from langchain.chat_models import ChatOllama
from langchain_core.messages import HumanMessage  # needed for the message below

ollama = ChatOllama(model="llama2")
response = ollama.invoke([
    HumanMessage(content="Tell me about the history of AI")
])
The BaseLLM class that Ollama inherits from is an abstract base class that provides a common interface for all language models. It has several attributes, such as cache, verbose, callbacks, callback_manager, tags, metadata, and others. It also provides methods and properties such as raise_deprecation, set_verbose, OutputType, __call__, predict, predict_messages, _identifying_params, __str__, _llm_type, dict, and save.
Unfortunately, the _OllamaCommon class is not mentioned in the provided context, so I cannot provide information about its functionalities. You can find more details about these classes in the LangChain repository.
To add to the discussion, if you are using ChatPromptTemplate to supply a history of messages, you should use ChatOllama and not Ollama.
from langchain.chat_models import ChatOllama
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOllama(model="mistral:instruct")
prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "When was Abraham Lincoln born?"),
        ("ai", "Abraham Lincoln was born on February 12, 1809."),
        ("human", "How old was he when he died?"),
        ("ai", "Abraham Lincoln died on April 15, 1865, at the age of 56."),
        ("human", "{question}"),
    ]
)
chain = prompt | llm
ans = chain.invoke({"question": "Where did he die?"})
print(ans)
This will send the correct API request. I verified this with tcpdump:
{
  "messages": [
    {
      "role": "user",
      "content": "When was Abraham Lincoln born?",
      "images": []
    },
    {
      "role": "assistant",
      "content": "Abraham Lincoln was born on February 12, 1809.",
      "images": []
    },
    {
      "role": "user",
      "content": "How old was he when he died?",
      "images": []
    },
    {
      "role": "assistant",
      "content": "Abraham Lincoln died on April 15, 1865, at the age of 56.",
      "images": []
    },
    {
      "role": "user",
      "content": "Where did he die?",
      "images": []
    }
  ],
  "model": "mistral:instruct",
  "format": null,
  "options": {
    "mirostat": null,
    "mirostat_eta": null,
    "mirostat_tau": null,
    "num_ctx": null,
    "num_gpu": null,
    "num_thread": null,
    "num_predict": null,
    "repeat_last_n": null,
    "repeat_penalty": null,
    "temperature": null,
    "stop": [],
    "tfs_z": null,
    "top_k": null,
    "top_p": null
  },
  "system": null,
  "template": null
}
The Ollama server will then apply the model's template before passing the input to the model.
In the same situation, if you use Ollama:
llm = Ollama(model="mistral:instruct")
The request will look like this:
{
  "prompt": "Human: When was Abraham Lincoln born?\nAI: Abraham Lincoln was born on February 12, 1809.\nHuman: How old was he when he died?\nAI: Abraham Lincoln died on April 15, 1865, at the age of 56.\nHuman: Where did he die?",
  "images": null,
  "model": "mistral:instruct",
  "format": null,
  "options": {
    "mirostat": null,
    "mirostat_eta": null,
    "mirostat_tau": null,
    "num_ctx": null,
    "num_gpu": null,
    "num_thread": null,
    "num_predict": null,
    "repeat_last_n": null,
    "repeat_penalty": null,
    "temperature": null,
    "stop": [],
    "tfs_z": null,
    "top_k": null,
    "top_p": null
  },
  "system": null,
  "template": null
}
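This may or may not work well depending on the target model, but it is not ideal: the whole history is flattened into a single prompt string with "Human:" and "AI:" labels instead of being sent as structured messages. ChatOllama will work better.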
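To see the difference between the two request bodies side by side, here is a small standalone sketch that builds both shapes from the same history without calling LangChain or a running Ollama server. The helper names (chat_payload, generate_payload) and the role mappings are my own illustration of what the captured requests look like, not LangChain's actual code, and the option fields are left out for brevity.

```python
import json

# The same conversation used above, as (role, content) pairs.
history = [
    ("human", "When was Abraham Lincoln born?"),
    ("ai", "Abraham Lincoln was born on February 12, 1809."),
    ("human", "How old was he when he died?"),
    ("ai", "Abraham Lincoln died on April 15, 1865, at the age of 56."),
    ("human", "Where did he die?"),
]

# Hypothetical mappings that mirror the captured requests.
ROLE_TO_API = {"human": "user", "ai": "assistant"}
ROLE_TO_LABEL = {"human": "Human", "ai": "AI"}


def chat_payload(model, turns):
    """Structured messages, as in the ChatOllama request."""
    return {
        "model": model,
        "messages": [
            {"role": ROLE_TO_API[role], "content": content}
            for role, content in turns
        ],
    }


def generate_payload(model, turns):
    """A single flattened prompt string, as in the Ollama request."""
    prompt = "\n".join(
        f"{ROLE_TO_LABEL[role]}: {content}" for role, content in turns
    )
    return {"model": model, "prompt": prompt}


print(json.dumps(chat_payload("mistral:instruct", history), indent=2))
print(json.dumps(generate_payload("mistral:instruct", history), indent=2))
```

With the structured form, the server knows which turn belongs to which role and can apply the model's template per message. With the flattened form, the role labels are just text inside one prompt.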
Issue with current documentation:
It's not clear from the documentation whether, when calling Ollama, LangChain takes care of formatting the template correctly or whether I have to supply the template myself.
For example, in https://ollama.ai/library/mistral:instruct we have:
Do I have to take care of formatting my instructions using these parameters and template, or will LangChain take care of it?
Idea or request for content:
If this is not implemented, it would definitely be very useful to have.