Custom LLM Implementation

VpkPrasanna commented 10 months ago

Feature request

Integrate other models from hugging face models models (likely decoder based models) as a LLM , which we can use for RAG applications

Motivation

Since the model weights of LLM are huge we can have our own model trained on small dataset they can be pluged into RAG applications

Your contribution

need to think of it actually how can i able to contribute to this

dosubot[bot] commented 10 months ago

🤖

Hello @VpkPrasanna,

It's good to see you again and thank you for your feature request.

Currently, the LangChain framework does not directly support the integration of BERT-based models from Hugging Face. The existing HuggingFaceHub class only supports text-generation, text2text-generation, and summarization tasks as indicated by the VALID_TASKS variable and the error messages in the code.

However, LangChain's flexible architecture allows for the integration of various language models. To integrate a BERT-based model, you would need to create a new class similar to the HuggingFacePipeline class. This new class would handle the specifics of loading and running the BERT-based model.

Here is a simplified example of what this class might look like:

from langchain.llms.base import BaseLLM
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

class BERTLLM(BaseLLM):
    def __init__(self, model_name):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForQuestionAnswering.from_pretrained(model_name)

    def generate(self, prompt):
        inputs = self.tokenizer(prompt, return_tensors='pt')
        outputs = self.model(**inputs)
        return outputs

Then, you would need to add a new import function for this class in __init__.py and add it to the get_type_to_cls_dict function. This would allow LangChain to recognize and use your new BERT-based LLM.

Please note that this is a simplified example and the actual implementation may require additional code to handle the specifics of your application and the BERT-based model.

I hope this helps! If you have any further questions or need more clarification, feel free to ask.

Best, Dosu

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

VpkPrasanna commented 10 months ago

@dosu-bot it cannot be so specifc to BERT based models it can be any decoder models basically basically some text generation models or summarization models

dosubot[bot] commented 10 months ago

🤖

Thank you for your detailed feature request. I understand that you want to extend the LangChain framework to support other models from Hugging Face, specifically decoder-based models for RAG applications.

Based on the current structure of the LangChain codebase, you can achieve this by modifying the from_model_id method in the HuggingFacePipeline class. This method currently supports AutoModelForCausalLM and AutoModelForSeq2SeqLM models. To support any decoder models, you would need to import the appropriate model class from the transformers library and add a condition to instantiate the model based on the model_id or some other parameter.

Here is an example of how you might modify the from_model_id method to support BartForConditionalGeneration models:

from transformers import BartForConditionalGeneration

@classmethod
def from_model_id(
    cls,
    model_id: str,
    task: str,
    device: Optional[int] = -1,
    model_kwargs: Optional[dict] = None,
    pipeline_kwargs: Optional[dict] = None,
    batch_size: int = DEFAULT_BATCH_SIZE,
    **kwargs: Any,
) -> HuggingFacePipeline:
    # ... existing code ...

    try:
        if task == "text-generation":
            model = AutoModelForCausalLM.from_pretrained(model_id, **_model_kwargs)
        elif task in ("text2text-generation", "summarization"):
            model = AutoModelForSeq2SeqLM.from_pretrained(model_id, **_model_kwargs)
        elif task == "conditional-generation":
            model = BartForConditionalGeneration.from_pretrained(model_id, **_model_kwargs)
        else:
            raise ValueError(
                f"Got invalid task {task}, "
                f"currently only {VALID_TASKS} are supported"
            )
    except ImportError as e:
        raise ValueError(
            f"Could not load the {task} model due to missing dependencies."
        ) from e

    # ... existing code ...

In this example, I've added a condition to instantiate a BartForConditionalGeneration model if the task is "conditional-generation". You would need to add similar conditions for other decoder models you want to support.

Please note that you would also need to handle the output of these models in the _generate method, as it currently only handles the output of AutoModelForCausalLM and AutoModelForSeq2SeqLM models.

I hope this helps! If you have any further questions or need more clarification, feel free to ask.

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

VpkPrasanna commented 10 months ago

@dosu-bot is it possible to apply the same logic via customLLM class from langchain

dosubot[bot] commented 10 months ago

🤖

To extend the LangChain Python framework to support other models from Hugging Face, specifically decoder-based models for RAG applications, you can indeed modify the from_model_id method in the HuggingFacePipeline class. Here's a sample code snippet on how you can achieve this:

from transformers import RagTokenizer, RagTokenForGeneration

@classmethod
def from_model_id(
    cls,
    model_id: str,
    task: str,
    device: Optional[int] = -1,
    model_kwargs: Optional[dict] = None,
    pipeline_kwargs: Optional[dict] = None,
    batch_size: int = DEFAULT_BATCH_SIZE,
    **kwargs: Any,
) -> HuggingFacePipeline:
    """Construct the pipeline object from model_id and task."""
    try:
        from transformers import (
            AutoModelForCausalLM,
            AutoModelForSeq2SeqLM,
            AutoTokenizer,
            RagTokenizer,
            RagTokenForGeneration,
        )
        from transformers import pipeline as hf_pipeline

    except ImportError:
        raise ValueError(
            "Could not import transformers python package. "
            "Please install it with `pip install transformers`."
        )

    _model_kwargs = model_kwargs or {}
    if task == "rag":
        tokenizer = RagTokenizer.from_pretrained(model_id, **_model_kwargs)
        model = RagTokenForGeneration.from_pretrained(model_id, **_model_kwargs)
    else:
        tokenizer = AutoTokenizer.from_pretrained(model_id, **_model_kwargs)

    try:
        if task == "text-generation":
            model = AutoModelForCausalLM.from_pretrained(model_id, **_model_kwargs)
        elif task in ("text2text-generation", "summarization"):
            model = AutoModelForSeq2SeqLM.from_pretrained(model_id, **_model_kwargs)
        elif task == "rag":
            pass  # Model already defined above
        else:
            raise ValueError(
                f"Got invalid task {task}, "
                f"currently only {VALID_TASKS} are supported"
            )
    except ImportError as e:
        raise ValueError(
            f"Could not load the {task} model due to missing dependencies."
        ) from e
    # Rest of the code remains the same

In this code, we added a new condition to handle the "rag" task. If the task is "rag", we use the RagTokenizer and RagTokenForGeneration classes from the transformers library to create the tokenizer and model.

Please note that you also need to handle the "rag" task in the _generate method and add "rag" to the VALID_TASKS list.

As for the customLLM class, it is an abstract base class that provides a simpler interface for working with Language Learning Models (LLMs). It is designed to be subclassed by other classes that implement the specific functionality of a particular LLM. However, it does not directly interact with the Hugging Face models or the from_model_id method in the HuggingFacePipeline class. Therefore, it might not be the best place to add support for the RAG models. Instead, you should focus on modifying the HuggingFacePipeline class as described above.

I hope this helps! If you have any further questions, feel free to ask.

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

VpkPrasanna commented 10 months ago

@dosu-bot can you re-write the function based on LLM as base class ?

dosubot[bot] commented 10 months ago

🤖

To extend the LangChain Python framework to support other models from Hugging Face, specifically decoder-based models for RAG applications, you can modify the from_model_id method in the HuggingFacePipeline class. Here's a sample code snippet on how to rewrite the function:

@classmethod
def from_model_id(
    cls,
    model_id: str,
    task: str,
    device: Optional[int] = -1,
    device_map: Optional[str] = None,
    model_kwargs: Optional[dict] = None,
    pipeline_kwargs: Optional[dict] = None,
    batch_size: int = DEFAULT_BATCH_SIZE,
    **kwargs: Any,
) -> HuggingFacePipeline:
    """Construct the pipeline object from model_id and task."""
    try:
        from transformers import (
            AutoModelForCausalLM,
            AutoModelForSeq2SeqLM,
            AutoModelForRAG,  # Import the RAG model
            AutoTokenizer,
        )
        from transformers import pipeline as hf_pipeline

    except ImportError:
        raise ValueError(
            "Could not import transformers python package. "
            "Please install it with `pip install transformers`."
        )

    _model_kwargs = model_kwargs or {}
    tokenizer = AutoTokenizer.from_pretrained(model_id, **_model_kwargs)

    try:
        if task == "text-generation":
            model = AutoModelForCausalLM.from_pretrained(model_id, **_model_kwargs)
        elif task in ("text2text-generation", "summarization"):
            model = AutoModelForSeq2SeqLM.from_pretrained(model_id, **_model_kwargs)
        elif task == "rag":  # Add a condition for the RAG task
            model = AutoModelForRAG.from_pretrained(model_id, **_model_kwargs)
        else:
            raise ValueError(
                f"Got invalid task {task}, "
                f"currently only {VALID_TASKS} are supported"
            )
    except ImportError as e:
        raise ValueError(
            f"Could not load the {task} model due to missing dependencies."
        ) from e

    # Rest of the method remains the same...

In this modification, I've added an import for AutoModelForRAG and a condition to instantiate it when the task is "rag". You'll also need to add "rag" to VALID_TASKS at the top of your file:

VALID_TASKS = ("text2text-generation", "text-generation", "summarization", "rag")

This is a basic implementation and might need further adjustments based on the specific requirements of your application. You can refer to the HuggingFacePipeline class for more details.

As for the LLM base class, it is an abstract base class that provides a common interface for large language models (LLMs). It has several attributes and methods that you can use to extend its functionality. You can refer to the BaseLLM class for more details.

I hope this helps! If you have any other questions, feel free to ask.

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] commented 7 months ago

Hi, @VpkPrasanna,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, the issue you raised requests the integration of other decoder-based models from Hugging Face into a custom LLM for RAG applications. Dosubot provided detailed guidance on modifying the HuggingFacePipeline class to support these models, addressed your inquiries, and provided code snippets for reference. The issue has been resolved with Dosubot's assistance.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you!

javaidiqbal11 commented 1 month ago

Hugging face have multiple models that can be integrate in the custom LLM, here is the supporting material:

Building Custom Model

langchain-ai / langchain