HuggingFacePipeline does not use chat template

bibhas2 commented 7 months ago

Checked other resources

[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.
[X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Hugging Face pipeline now has support for chat templates. This calls the apply_chat_template() of the tokenizer. This is a super useful feature which formats the input correctly according to the model. To apply the template one needs to pass a messages list to the pipeline as input (and not a prompt text).

Langchain's HuggingFacePipeline class is written in a way that only prompt text is passed to the pipeline. We can see this in the ‎HuggingFacePipeline._generate method. As a result the prompt is constructed using Langchain's default template which is not the same as what the model works best with.

Let's build an example.

import torch
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

def test_agnostic_prompt(llm):
    prompt = ChatPromptTemplate.from_messages(
        [
            ("human", "When was Abraham Lincoln born?"),
            ("ai", "Abraham Lincoln was born on February 12, 1809."),
            ("human", "How old was he when he died?"),
            ("ai", "Abraham Lincoln died on April 15, 1865, at the age of 56."),
            ("human", "{question}"),
        ]
    )

    output_parser = StrOutputParser()

    chain = prompt | llm | output_parser

    reply = chain.invoke({"question": "Where did he die?"})

    print(reply)

hf_llm = HuggingFacePipeline.from_model_id(
        model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        task="text-generation",
        pipeline_kwargs={"max_new_tokens": 128})

test_agnostic_prompt(hf_llm)

This sends the following prompt.

Human: When was Abraham Lincoln born?
AI: Abraham Lincoln was born on February 12, 1809.
Human: How old was he when he died?
AI: Abraham Lincoln died on April 15, 1865, at the age of 56.
Human: Where did he die?

The correct prompt, if chat template was applied, would be:

<|user|>
When was Abraham Lincoln born?</s> 
<|assistant|>
Abraham Lincoln was born on February 12, 1809.</s> 
<|user|>
How old was he when he died?</s> 
<|assistant|>
Abraham Lincoln died on April 15, 1865, at the age of 56.</s> 
<|user|>
Where did he die?</s> 
<|assistant|>

Error Message and Stack Trace (if applicable)

No response

Description

The HuggingFacePipeline class should what is necessary to convert the ChatPromptTemplate to a messages list and then pass it to the pipeline. This will cause the pipeline to use apply_chat_template() of the tokenizer to correctly format the prompt.

System Info

System Information

OS: Linux OS Version: #1 SMP Sat Feb 24 09:50:35 UTC 2024 Python Version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]

Package Information

langchain_core: 0.1.38 langchain: 0.1.14 langchain_community: 0.0.31 langsmith: 0.1.24 langchain_openai: 0.1.1 langchain_text_splitters: 0.0.1

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph langserve

liugddx commented 7 months ago

Let me see

olegarch commented 6 months ago

any workaround available?

mihaic commented 3 months ago

A workaround is using HuggingFacePipeline via ChatHuggingFace. This is actually somewhat documented:

Instantiate the ChatHuggingFace to apply chat templates https://python.langchain.com/v0.2/docs/integrations/chat/huggingface/#2-instantiate-the-chathuggingface-to-apply-chat-templates

Note that until a release of langchain-huggingface beyond 0.0.3 is made, you need to apply the following commit manually because of #22804: https://github.com/langchain-ai/langchain/commit/4796b7eb15b4c3a352c950f56a30659f0379f6e2

ItzBrein commented 3 months ago

A workaround is using HuggingFacePipeline via ChatHuggingFace. This is actually somewhat documented:

Instantiate the ChatHuggingFace to apply chat templates https://python.langchain.com/v0.2/docs/integrations/chat/huggingface/#2-instantiate-the-chathuggingface-to-apply-chat-templates

Note that until a release of langchain-huggingface beyond 0.0.3 is made, you need to apply the following commit manually because of #22804: 4796b7e

Hello, do you have any examples of code that uses both HuggingFacePipeline and ChatHuggingFace? My attempts to use both keeps resulting in errors.

Soumil32 commented 3 months ago

I am facing the same problem with quantized variants of Meta Llama 3 8B-Instruct

Soumil32 commented 3 months ago

A workaround is using HuggingFacePipeline via ChatHuggingFace. This is actually somewhat documented:

Instantiate the ChatHuggingFace to apply chat templates https://python.langchain.com/v0.2/docs/integrations/chat/huggingface/#2-instantiate-the-chathuggingface-to-apply-chat-templates

Note that until a release of langchain-huggingface beyond 0.0.3 is made, you need to apply the following commit manually because of #22804: 4796b7e

Can you please specify? I am learning Langchain and I have had to just use pipeline directory instead of the HuggingFacePipeLine or ChatHuggingFace abstractions. Still trying to figure out to intregrate this with the rest on Langchain.

Soumil32 commented 3 months ago

I believe I fixed the issue! I detailed pull request will be submitted tomorrow! Everything should work through ChatHuggingFace in my fork

dosubot[bot] commented 1 week ago

Hi, @bibhas2. I'm helping the LangChain team manage their backlog and am marking this issue as stale.

Your issue regarding the HuggingFacePipeline class not utilizing the chat template feature has been noted, and users have suggested using ChatHuggingFace as a workaround. Additionally, there seems to be progress on a pull request to resolve this issue, as indicated by Soumil32.

Could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, feel free to comment here to keep it open. Otherwise, you can close it yourself, or it will be automatically closed in 7 days. Thank you!

breadbread1984 commented 17 hours ago

my solution

#!/usr/bin/python3

from typing import Any, Union
from collections.abc import Sequence
from transformers import AutoTokenizer
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.prompt_values import ChatPromptValue, PromptValue

# NOTE: langchain-core >= 0.3.15

class HFChatPromptValue(ChatPromptValue):
  tokenizer: Any = None
  def to_string(self) -> str:
    hf_messages = []
    for m in self.messages:
      if isinstance(m, HumanMessage):
        role = 'user'
      elif isinstance(m, AIMessage):
        role = 'assistant'
      elif isinstance(m, SystemMessage):
        role = 'system'
      else:
        raise Exception(f'Got unsupported message type: {m}')
      hf_messages.append({'role': role, 'content': m.content})
    return self.tokenizer.apply_chat_template(hf_messages, tokenize = False, add_generation_prompt = True)

class HFChatPromptTemplate(ChatPromptTemplate):
  tokenizer: Any = None
  def format_prompt(self, **kwargs: Any) -> PromptValue:
    messages = self.format_messages(**kwargs)
    return HFChatPromptValue(messages = messages, tokenizer = self.tokenizer)
  async def aformat_prompt(self, **kwargs: Any) -> PromptValue:
    messages = await self.format_messages(**kwargs)
    return HFChatPromptValue(messages = messages, tokenizer = self.tokenizer)

if __name__ == "__main__":
  messages = [
    ('system', "you are a helpful AI bot. Your name is {name}"),
    MessagesPlaceholder('chat_history'),
    ('human', '{user_input}')
  ]
  tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen2.5-7B-Instruct')
  prompt = HFChatPromptTemplate(messages, tokenizer = tokenizer)
  txt = prompt.format(**{'user_input': 'what is your name', 'name': 'robot test', 'chat_history': [HumanMessage(content = 'your name is awesome!'), AIMessage(content = 'Thanks!')]})
  print(txt)

dosubot[bot] commented 17 hours ago

Thank you for your detailed response and for sharing your solution! We appreciate your contribution to the discussion. We'll go ahead and close this issue now.

langchain-ai / langchain