explodinggradients / ragas

Supercharge Your LLM Application Evaluations šŸš€
https://docs.ragas.io
Apache License 2.0

Empty generation does not raise a `ragas.exceptions.ExceptionInRunner` #1137

Open Gwenn-LR opened 2 months ago

Gwenn-LR commented 2 months ago

[x] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug Hi! I'm currently working with ragas to test different RAG architectures, so I'm using the Ollama, HuggingFace, and LangChain frameworks on top of ragas, and I'm facing an issue while trying to implement a unit test around synthetic generation: the testset generated by ragas.testset.generator.TestsetGenerator has empty rows.

I think it comes from a specific parametrization of each framework, but after looking over the repository, I can see you've tried to guard against such a situation; the check, however, does not raise any error.

Ragas version: 0.1.11
Python version: 3.10.12

Code to Reproduce The example.pdf file used here can be found at: https://css4.pub/2015/usenix/example.pdf

from langchain_community.chat_models import ChatOllama
from ragas.testset import TestsetGenerator
from rag_sandbox.embedding.huggingface import HuggingFaceEmbeddings
from langchain_community.document_loaders.pdf import UnstructuredPDFLoader

if __name__ == "__main__":
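    # Local Ollama model, reused below as both generator and critic LLM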
    llm = ChatOllama(base_url="http://localhost:11434", model="qwen2:7b")

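    # Build the testset generator from LangChain components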
    generator = TestsetGenerator.from_langchain(
        generator_llm=llm,
        critic_llm=llm,
        embeddings=HuggingFaceEmbeddings()
    )

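    # Parse the PDF with unstructured's hi_res strategy (tables and images included)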
    documents = UnstructuredPDFLoader(
        "./tests/data/pdf/example.pdf",
        mode="elements",
        strategy="hi_res",
        infer_table_structure=True,
        hi_res_model_name="yolox",
        extract_images_in_pdf=True,
        extract_image_block_output_dir="./results/data_extraction/images"
        ).load()

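    # Ask for a single synthetic test sample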
    dataset = generator.generate_with_langchain_docs(
        documents=documents,
        test_size=1
    )

Error trace No error, but that is the problem.
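For reference, the symptom can be confirmed by inspecting the generated testset (a minimal check, using the to_pandas() helper on the returned TestDataset in ragas 0.1.x):

# The generation above completes without raising anything...
df = dataset.to_pandas()
print(len(df))  # 0 -- the generation silently produced an empty testset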

Expected behavior According to your code, one would expect a ragas.exceptions.ExceptionInRunner to be raised in such a situation.

Additional context I'll offer a PR to fix this issue, but I don't know whether it will conflict with another part of the code.

dosubot[bot] commented 2 months ago

To resolve the issue of the ragas.testset.generator.TestsetGenerator object generating empty rows, you should ensure that the generate method is correctly initializing and executing the evolutions. Here are a few steps to check and potentially resolve the issue:

  1. Check Document Addition: Ensure that documents are being correctly added to the docstore. This is done in the generate_with_llamaindex_docs and generate_with_langchain_docs methods.

  2. Validate Distributions: Ensure that the distributions passed to the generate method sum to 1.0; this is validated in the generate method (see the example after this list).

  3. Initialize Evolutions: Ensure that evolutions are correctly initialized in the init_evolution method.

  4. Execution of Evolutions: Ensure that the evolutions are being correctly executed and that the Executor is handling the tasks properly.
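For point 2, here is an example of passing explicit distributions, reusing the generator and documents from your snippet (imports as documented for ragas 0.1.x):

from ragas.testset.evolutions import simple, reasoning, multi_context

# The probabilities must sum to 1.0, otherwise generate() raises a ValueError.
distributions = {simple: 0.5, reasoning: 0.25, multi_context: 0.25}

dataset = generator.generate_with_langchain_docs(
    documents=documents,
    test_size=1,
    distributions=distributions,
)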

Here is a snippet of the generate method that you should review and ensure is functioning correctly:

def generate(
    self,
    test_size: int,
    distributions: t.Optional[Distributions] = None,
    with_debugging_logs=False,
    is_async: bool = True,
    raise_exceptions: bool = True,
    run_config: t.Optional[RunConfig] = None,
):
    distributions = distributions or DEFAULT_DISTRIBUTION
    # validate distributions
    if not check_if_sum_is_close(list(distributions.values()), 1.0, 3):
        raise ValueError(
            f"distributions passed do not sum to 1.0 [got {sum(list(distributions.values()))}]. Please check the "
            f"distributions."
        )

    # configure run_config for docstore
    if run_config is None:
        run_config = RunConfig(max_retries=15, max_wait=90)
    self.docstore.set_run_config(run_config)

    # init filters and evolutions
    for evolution in distributions:
        self.init_evolution(evolution)
        evolution.init(is_async=is_async, run_config=run_config)

    if with_debugging_logs:
        from ragas.utils import patch_logger

        patch_logger("ragas.testset.evolutions", logging.DEBUG)
        patch_logger("ragas.testset.extractor", logging.DEBUG)
        patch_logger("ragas.testset.filters", logging.DEBUG)
        patch_logger("ragas.testset.docstore", logging.DEBUG)
        patch_logger("ragas.llms.prompt", logging.DEBUG)

    exec = Executor(
        desc="Generating",
        keep_progress_bar=True,
        raise_exceptions=raise_exceptions,
        run_config=run_config,
    )
    current_nodes = [
        CurrentNodes(root_node=n, nodes=[n])
        for n in self.docstore.get_random_nodes(k=test_size)
    ]
    total_evolutions = 0
    for evolution, probability in distributions.items():
        for i in sample(range(test_size), round(probability * test_size)):
            exec.submit(
                evolution.evolve,
                current_nodes[i],
                name=f"{evolution.__class__.__name__}-{i}",
            )
            total_evolutions += 1
    if total_evolutions <= test_size:
        filler_evolutions = choices(
            list(distributions), k=test_size - total_evolutions
        )
        for evolution in filler_evolutions:
            exec.submit(
                evolution.evolve,
                current_nodes[total_evolutions],
                name=f"{evolution.__class__.__name__}-{total_evolutions}",
            )
            total_evolutions += 1

    try:
        test_data_rows = exec.results()
        if not test_data_rows:
            raise ExceptionInRunner()

    except ValueError as e:
        raise e
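    # NOTE: the emptiness check above runs *before* the NaN filter below,
    # so a result list containing only NaNs slips through and later
    # becomes an empty TestDataset without raising ExceptionInRunner.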
    # make sure to ignore any NaNs that might have been returned
    # due to failed evolutions. MaxRetriesExceeded is a common reason
    test_data_rows = [r for r in test_data_rows if not is_nan(r)]
    test_dataset = TestDataset(test_data=test_data_rows)
    evol_lang = [get_feature_language(e) for e in distributions]
    evol_lang = [e for e in evol_lang if e is not None]
    track(
        TestsetGenerationEvent(
            event_type="testset_generation",
            evolution_names=[e.__class__.__name__.lower() for e in distributions],
            evolution_percentages=[distributions[e] for e in distributions],
            num_rows=len(test_dataset.test_data),
            language=evol_lang[0] if len(evol_lang) > 0 else "",
        )
    )

    return test_dataset
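To gather more signal while reproducing, you can also enable the debugging logs exposed by this signature (assuming generate_with_langchain_docs forwards with_debugging_logs and raise_exceptions to generate, as the parameters above suggest):

dataset = generator.generate_with_langchain_docs(
    documents=documents,
    test_size=1,
    with_debugging_logs=True,  # patches the ragas loggers listed above
    raise_exceptions=True,
)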

Ensure that the Executor is correctly handling the evolution tasks and that test_data_rows is being populated correctly. Note that in the snippet above the emptiness check runs before the NaN filter: if every evolution fails, exec.results() returns a list of NaN placeholders, the if not test_data_rows check passes, and the subsequent filter leaves an empty TestDataset without ever raising ExceptionInRunner, which matches the behavior you report [1].
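Here is a minimal, self-contained sketch (plain Python, not library code) of how an all-failed run can slip past the check:

import math

# Stand-in for exec.results() when every evolution fails:
# the runner returns NaN placeholders instead of an empty list.
test_data_rows = [float("nan"), float("nan")]

# The emptiness check sees a non-empty list, so nothing is raised here.
if not test_data_rows:
    raise RuntimeError("ExceptionInRunner would be raised here")

# The NaN filter then empties the list *after* the check.
test_data_rows = [r for r in test_data_rows if not math.isnan(r)]
print(test_data_rows)  # [] -- an empty testset, and no error was raised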

To continue talking to Dosu, mention @dosu.