[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.
[X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
The Hugging Face pipeline now supports chat templates: it calls the tokenizer's apply_chat_template(), which formats the input the way the model expects. This is a very useful feature. To apply the template, one needs to pass a messages list to the pipeline as input (not a prompt text).
LangChain's HuggingFacePipeline class is written so that only prompt text is passed to the pipeline, as can be seen in the HuggingFacePipeline._generate method. As a result, the prompt is constructed using LangChain's default template, which is not the format the model works best with.
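To make the difference in input shape concrete, here is a minimal sketch of the messages list that the chat-template path expects, built from the same (role, text) turns used in the example in this issue. The helper name to_hf_messages and the ROLE_MAP table are hypothetical and not part of LangChain or transformers; the "user"/"assistant" role names follow the usual chat-template convention.

```python
# Hypothetical helper, for illustration only: maps LangChain-style role labels
# to the role names commonly used in chat-template message lists.
ROLE_MAP = {"human": "user", "ai": "assistant", "system": "system"}


def to_hf_messages(turns):
    """Convert (role, text) tuples into a list of {"role", "content"} dicts."""
    messages = []
    for role, text in turns:
        if role not in ROLE_MAP:
            raise ValueError(f"unsupported role: {role!r}")
        messages.append({"role": ROLE_MAP[role], "content": text})
    return messages


turns = [
    ("human", "When was Abraham Lincoln born?"),
    ("ai", "Abraham Lincoln was born on February 12, 1809."),
    ("human", "Where did he die?"),
]
for message in to_hf_messages(turns):
    print(message)
```

A list like this, rather than a single flattened string, is what lets the tokenizer's chat template decide how each turn is delimited.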
Let's build an example.
import torch
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline


def test_agnostic_prompt(llm):
    prompt = ChatPromptTemplate.from_messages(
        [
            ("human", "When was Abraham Lincoln born?"),
            ("ai", "Abraham Lincoln was born on February 12, 1809."),
            ("human", "How old was he when he died?"),
            ("ai", "Abraham Lincoln died on April 15, 1865, at the age of 56."),
            ("human", "{question}"),
        ]
    )
    output_parser = StrOutputParser()
    chain = prompt | llm | output_parser
    reply = chain.invoke({"question": "Where did he die?"})
    print(reply)


hf_llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)
test_agnostic_prompt(hf_llm)
This sends the following prompt.
Human: When was Abraham Lincoln born?
AI: Abraham Lincoln was born on February 12, 1809.
Human: How old was he when he died?
AI: Abraham Lincoln died on April 15, 1865, at the age of 56.
Human: Where did he die?
If the chat template were applied, the correct prompt would be:
<|user|>
When was Abraham Lincoln born?</s>
<|assistant|>
Abraham Lincoln was born on February 12, 1809.</s>
<|user|>
How old was he when he died?</s>
<|assistant|>
Abraham Lincoln died on April 15, 1865, at the age of 56.</s>
<|user|>
Where did he die?</s>
<|assistant|>
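For reference, the layout above can be reproduced by hand. The sketch below is a plain-Python approximation of the format shown in this issue, not the tokenizer's actual template (which is a Jinja template shipped with the model); the function name render_zephyr_style and its parameters are hypothetical.

```python
def render_zephyr_style(messages, eos_token="</s>", add_generation_prompt=True):
    """Render role/content dicts in the <|role|> ... </s> layout shown above.

    Hand-written approximation for illustration; a real integration would
    call the tokenizer's apply_chat_template() instead.
    """
    lines = []
    for message in messages:
        lines.append(f"<|{message['role']}|>")
        lines.append(f"{message['content']}{eos_token}")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        lines.append("<|assistant|>")
    return "\n".join(lines)


print(
    render_zephyr_style(
        [
            {"role": "user", "content": "How old was he when he died?"},
            {
                "role": "assistant",
                "content": "Abraham Lincoln died on April 15, 1865, at the age of 56.",
            },
            {"role": "user", "content": "Where did he die?"},
        ]
    )
)
```

Unlike the generic "Human:" / "AI:" prefixes, every turn here is closed with the end-of-sequence token and the prompt ends with an open assistant turn.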
Error Message and Stack Trace (if applicable)
No response
Description
The HuggingFacePipeline class should do what is necessary to convert the ChatPromptTemplate to a messages list and then pass it to the pipeline. This will cause the pipeline to use the tokenizer's apply_chat_template() to format the prompt correctly.
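A rough sketch of the requested behaviour, written as a standalone helper rather than as a change to HuggingFacePipeline itself. It assumes message objects that expose .type ("human", "ai", "system") and .content, as LangChain chat messages do, and a tokenizer that provides apply_chat_template(). The names ROLE_BY_TYPE, to_role_dicts and render_with_chat_template are hypothetical.

```python
# Hypothetical mapping from LangChain message types to chat-template roles.
ROLE_BY_TYPE = {"human": "user", "ai": "assistant", "system": "system"}


def to_role_dicts(lc_messages):
    """Convert objects with .type and .content into role/content dicts."""
    converted = []
    for message in lc_messages:
        if message.type not in ROLE_BY_TYPE:
            raise ValueError(f"unsupported message type: {message.type!r}")
        converted.append(
            {"role": ROLE_BY_TYPE[message.type], "content": message.content}
        )
    return converted


def render_with_chat_template(tokenizer, lc_messages):
    """Let the tokenizer's own chat template build the prompt string."""
    return tokenizer.apply_chat_template(
        to_role_dicts(lc_messages),
        tokenize=False,              # return text, not token ids
        add_generation_prompt=True,  # end with an open assistant turn
    )
```

With the example from this issue, one might call this as render_with_chat_template(hf_llm.pipeline.tokenizer, prompt.format_messages(question="Where did he die?")), assuming prompt is the ChatPromptTemplate from the example, and pass the resulting string to the LLM. This is only a workaround sketch; the request here is for the class to do this conversion itself.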
System Info
System Information
OS: Linux
OS Version: #1 SMP Sat Feb 24 09:50:35 UTC 2024
Python Version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: