benman1 / generative_ai_with_langchain

Build large language model (LLM) apps with Python, ChatGPT and other models. This is the companion repository for the book on generative AI with LangChain.
https://amzn.to/43PuIkQ
MIT License

Chapter 3, Hugging Face Transformers #21

Closed. NPPprojects closed this issue 8 months ago.

NPPprojects commented 8 months ago

Starting on page 86, the following example is shown:

from transformers import pipeline
import torch

generate_text = pipeline(
    model="aisquared/dlite-v1-355m",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    framework="pt",
)
generate_text(
    "In this chapter, we'll discuss first steps with generative AI in Python."
)

It is followed by this code block using PromptTemplate and LLMChain:

from langchain import PromptTemplate, LLMChain
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=generate_text)
question = "What is electroencephalography?"
print(llm_chain.run(question))

Running this code produces the following error:

1 validation error for LLMChain
llm
value is not a valid dict (type=type_error.dict)

I'm guessing LLMChain isn't compatible with the transformers pipeline, since this code works fine if generate_text is switched out for an OpenAI model. I installed the packages into a venv via requirements.txt plus pip install transformers accelerate torch. The original requirements.txt doesn't include transformers as far as I can tell. However, this is written in the book on page 86:

I haven’t included accelerate in the main requirements, but I’ve included the
transformers library. If you don’t have all libraries installed, make sure you execute
this command:
pip install transformers accelerate torch
benman1 commented 8 months ago

Hi @NPPprojects. Thanks for reporting this. I remember there were quite a few changes to LLMChain and local pipelines, and I might have missed testing this code for the pinned version.

I haven't run this yet, but have you tried wrapping the transformers pipeline with a HuggingFacePipeline?

from langchain.llms import HuggingFacePipeline
hfp = HuggingFacePipeline(pipeline=generate_text)

Later, instead of using generate_text you'd use hfp, like this:

llm_chain = LLMChain(prompt=prompt, llm=hfp)

Please let me know if this works - I'll try in the evening.

As for the transformers library, it's not directly included in the requirements, but it's required by sentence-transformers, which is included in the requirements.
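
If you want to double-check that in your environment, a quick optional sanity check (not from the book) along these lines should work:

# Optional sanity check: transformers should be installed and listed as a
# dependency of sentence-transformers.
from importlib.metadata import requires, version

print(version("transformers"))
print([req for req in requires("sentence-transformers") if req.startswith("transformers")])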

benman1 commented 8 months ago

@NPPprojects! Hi again! I struggled to make this work - the pipeline was very slow for me until I switched to GPU on Google Colab. I've tried with different versions of LangChain and transformers, but the only thing that I've found to work was adapting the HuggingFacePipeline class.

This here works for me with the current LangChain version:

import logging
from typing import Any, List, Optional

from langchain.llms import HuggingFacePipeline
from langchain_core.outputs import Generation, LLMResult
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_community.llms.utils import enforce_stop_tokens

logger = logging.getLogger(__name__)

VALID_TASKS = ("text2text-generation", "text-generation", "summarization")

class HFP(HuggingFacePipeline):
    def _generate(
        self,
        prompts: List[str],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> LLMResult:
        # List to hold all results
        text_generations: List[str] = []

        for i in range(0, len(prompts), self.batch_size):
            batch_prompts = prompts[i : i + self.batch_size]

            # Process batch of prompts
            responses = self.pipeline(batch_prompts)

            # Process each response in the batch
            for j, response in enumerate(responses):
                if isinstance(response, list):
                    # if model returns multiple generations, pick the top one
                    response = response[0]

                if self.pipeline.task == "text-generation":
                    try:
                        from transformers.pipelines.text_generation import ReturnType

                        # Only strip the prompt if the pipeline echoes it back
                        # (i.e. its return_type is not NEW_TEXT).
                        remove_prompt = (
                            self.pipeline._postprocess_params.get("return_type")
                            != ReturnType.NEW_TEXT
                        )
                    except Exception as e:
                        logger.warning(
                            f"Unable to extract pipeline return_type. "
                            f"Received error:\n\n{e}"
                        )
                        remove_prompt = True
                    # Unlike the stock HuggingFacePipeline, this remote-code
                    # pipeline appears to return plain strings rather than
                    # [{"generated_text": ...}] dicts, so no ["generated_text"]
                    # lookup is done here.
                    if remove_prompt:
                        text = response[len(batch_prompts[j]) :]
                    else:
                        text = response
                elif self.pipeline.task == "text2text-generation":
                    text = response["generated_text"]
                elif self.pipeline.task == "summarization":
                    text = response["summary_text"]
                else:
                    raise ValueError(
                        f"Got invalid task {self.pipeline.task}, "
                        f"currently only {VALID_TASKS} are supported"
                    )
                if stop:
                    # Enforce stop tokens
                    text = enforce_stop_tokens(text, stop)

                # Append the processed text to results
                text_generations.append(text)

        return LLMResult(
            generations=[[Generation(text=text)] for text in text_generations]
        )

hfp = HFP(pipeline=generate_text)
llm_chain = LLMChain(prompt=prompt, llm=hfp)
question = "What is electroencephalography?"
print(llm_chain.run(question))
NPPprojects commented 8 months ago


Modifying the code produced the following error:

text = response[0]["generated_text"][len(prompt) :]
TypeError: string indices must be integers

Thrown by huggingface_pipeline.py

I'm currently just going through the textbook and testing the samples before I update to the latest LangChain version. This is a pretty rudimentary function, and I'm not even certain LangChain is all that useful here, since something like this works fine without introducing any abstraction:

generate_text = pipeline(
    model="aisquared/dlite-v1-355m",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    framework="pt",
    token=1500
)

# No LangChain abstraction
question = "What is electroencephalography?"
template = f"""Question: {question}
Answer: Let's think step by step."""
print(generate_text(template))
benman1 commented 8 months ago

Hi @NPPprojects,

Modifying the code produced the following error:

text = response[0]["generated_text"][len(prompt) :]
TypeError: string indices must be integers

That's why I wrote the HFP class, see above ;)
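
The stock wrapper indexes each response as a dict, while this remote-code pipeline appears to hand back plain strings, which is exactly what produces that TypeError. A minimal sketch of the assumed shapes (illustrative, not taken from either library):

# A standard text-generation pipeline returns a list of dicts per prompt:
standard_output = [{"generated_text": "Question: ... Answer: ..."}]
standard_output[0]["generated_text"]  # works

# The dlite remote-code pipeline appears to return a plain string instead:
dlite_output = "Question: ... Answer: ..."
dlite_output[0]["generated_text"]  # TypeError: string indices must be integers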

It should be easy enough to adapt it to any previous LC version. You are right though - depending on the task, you might not need any LC abstraction at all.
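
If it helps, for models that do return the standard [{"generated_text": ...}] output, something along these lines may avoid the custom subclass entirely. This is only a sketch: the model id and generation parameters are placeholders, and the exact from_model_id keyword arguments vary between LangChain versions.

from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline

# Placeholder model: any model whose pipeline returns the usual
# [{"generated_text": ...}] output should behave the same way.
llm = HuggingFacePipeline.from_model_id(
    model_id="gpt2",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 100},
)

template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
print(llm_chain.run("What is electroencephalography?"))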