explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0
6.96k stars 697 forks

Making custom LLMs compatible with TestsetGenerator? #230

Closed Data-drone closed 4 months ago

Data-drone commented 11 months ago

Hi Team,

I am trying to get the TestsetGenerator to work properly with a Langchain custom LLM.

Its _call method currently expects a string prompt, but it receives a ChatPromptTemplate from ragas. What is the best way to handle this?

Data-drone commented 11 months ago

I got things working by running format() on the prompt my LLM received whenever it arrived as a ChatPromptTemplate, but I'm not sure that is the best approach.

I have set chat_qa=0.0, so shouldn't all the prompts be coming in as strings?
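For reference, the workaround described above can be sketched as a small normalizing helper (the name to_prompt_string is hypothetical, and the ChatPromptTemplate used in the check below is any LangChain-style prompt object exposing a .format() method that renders it as a string):

```python
def to_prompt_string(prompt):
    """Normalize whatever ragas passes in to a plain string.

    Plain strings pass through unchanged; ChatPromptTemplate-like
    objects are flattened with .format(), which renders the messages
    as a single string a string-based _call can consume.
    """
    if isinstance(prompt, str):
        return prompt
    return prompt.format()
```

A custom LLM's _call could run its incoming prompt through this helper before doing anything else, which matches the format()-based fix described above.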

shahules786 commented 11 months ago

Hey @Data-drone , can you share the ragas & python version you're using?

Data-drone commented 11 months ago

ragas==0.0.18 python 3.10.12

shahules786 commented 11 months ago

Hi @Data-drone , we will look into the issue with LangChain custom LLMs. Regarding setting chat_qa=0: it controls whether or not the test set should contain conversational questions.

shahules786 commented 11 months ago

Hi @Data-drone , which model are you using? Are you using an instruction-tuned model (that is not chat-based)?

Data-drone commented 11 months ago

Vicuna 1.5 13b

jjmachan commented 11 months ago

hey @Data-drone , so we do have this snippet inside RagasLLMs that should handle this logic for you:

        if isinstance(self.llm, BaseLLM):
            ps = [p.format() for p in prompts]
            result = self.llm.generate(ps, callbacks=callbacks)
        else:  # if BaseChatModel
            ps = [p.format_messages() for p in prompts]
            result = self.llm.generate(ps, callbacks=callbacks)

if your custom model subclasses either of those, it should do the conversion for you, but I'm not sure why it's not working here. Could you show us how you're calling the TestsetGenerator and share your custom model code?
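The dispatch in that snippet keys on the class hierarchy, so a custom model only takes the plain-string branch if it subclasses BaseLLM. A self-contained sketch of that routing (the BaseLLM / BaseChatModel classes below are minimal stand-ins for LangChain's, and generate_with is a hypothetical name, just to illustrate the logic):

```python
# Stand-ins for LangChain's BaseLLM and BaseChatModel base classes,
# used here only to demonstrate the isinstance-based routing.
class BaseLLM: ...
class BaseChatModel: ...

class FakePrompt:
    """Mimics a prompt object with both rendering methods."""
    def format(self):
        return "formatted string prompt"
    def format_messages(self):
        return [("human", "formatted message list")]

def generate_with(llm, prompts):
    # Mirrors the ragas snippet: BaseLLM subclasses get plain strings,
    # anything else is assumed chat-based and gets message lists.
    if isinstance(llm, BaseLLM):
        return [p.format() for p in prompts]
    return [p.format_messages() for p in prompts]
```

A custom LLM that subclasses neither base class (or wraps its model in an unrelated class) would fall through to the chat branch and receive message lists, which could explain the ChatPromptTemplate-vs-string mismatch reported above.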

msunkarahend commented 11 months ago

@shahules786 @jjmachan In the same context, TestsetGenerator uses cross-encoder/stsb-TinyBERT-L-4 from Hugging Face. Is there a way I can avoid that model and use only GPT models from Azure OpenAI to generate the synthetic dataset?

jjmachan commented 10 months ago

hey @msunkarahend thanks for bringing that up. What you can do is change it via the class constructor:

class TestsetGenerator:

    """
    Ragas Test Set Generator

    Attributes
    ----------
    generator_llm: LangchainLLM
        LLM used for all the generator operations in the TestGeneration paradigm.
    critique_llm: LangchainLLM
        LLM used for all the filtering and scoring operations in TestGeneration
        paradigm.
    embeddings_model: Embeddings
        Embeddings used for vectorizing nodes when required.
    chat_qa: float
        Determines the fraction of conversational questions in the resulting test set.
    chunk_size: int
        The chunk size of nodes created from data.
    test_distribution : dict
        Distribution of different types of questions to be generated from given
        set of documents. Defaults to {"easy":0.1, "reasoning":0.4, "conversation":0.5}
    """

    def __init__(
        self,
        generator_llm: RagasLLM,
        critic_llm: RagasLLM,
        embeddings_model: Embeddings,
        testset_distribution: t.Optional[t.Dict[str, float]] = None,
        chat_qa: float = 0.0,
        chunk_size: int = 1024,
        seed: int = 42,
    ) -> None:

so if you pass in a different Embeddings instance, you will be able to change it.

let me know if you want the finished snippet and I'll share that too 😄
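Based on the constructor signature above, wiring everything through Azure OpenAI might look roughly like the following. This is an untested configuration sketch, not the finished snippet offered above: the import paths, the LangchainLLM wrapper, and the deployment-name parameters are assumptions about the ragas 0.0.x / LangChain APIs of that era, and the deployment names are placeholders.

```python
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from ragas.llms import LangchainLLM
from ragas.testset import TestsetGenerator

# Placeholder deployment names -- substitute your own Azure resources.
azure_llm = AzureChatOpenAI(deployment_name="my-gpt-deployment")
azure_embeddings = OpenAIEmbeddings(deployment="my-embedding-deployment")

generator = TestsetGenerator(
    generator_llm=LangchainLLM(azure_llm),
    critic_llm=LangchainLLM(azure_llm),
    # Supplying your own Embeddings instance here is what avoids the
    # default cross-encoder/stsb-TinyBERT-L-4 model.
    embeddings_model=azure_embeddings,
)
```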

jjmachan commented 10 months ago

but @shahules786 probably we should change from_defaults class method to with_openai or something like that to make it easier?

msunkarahend commented 10 months ago

@jjmachan If you can share the finished snippet for the TestsetGenerator, that would be great. Thanks in advance.

hbj52 commented 9 months ago

I noticed that, if installed from source, the version of ragas on GitHub is 0.0.23.dev44+g506ad60, while the latest version I can download from pip is 0.0.22, the same as the "latest release" shown on GitHub. It seems 0.0.22 hits a similar issue with list(str) versus list(ChatPromptTemplate).

mspronesti commented 7 months ago

Could you guys try #670?