explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Local LLM with Ragas evaluation issue #1100

Open SalwaMostafa opened 1 month ago

SalwaMostafa commented 1 month ago

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug: I am trying to use a local LLM in the `evaluate` function, where the LLM is imported from Langchain, but it gives the error below and I do not understand what I should do. Should I use these wrappers?

    langchain_llm = LangchainLLMWrapper(langchain_llm)
    langchain_embeddings = LangchainEmbeddingsWrapper(langchain_embeddings)

    Evaluating:   0%|          | 0/2 [00:00<?, ?it/s]
    GGML_ASSERT: /run/nvme/job_3577603/tmp/pip-install-aob_qv9b/llama-cpp-python_08d8f9c1210e4408948636399a5c41c8/vendor/llama.cpp/ggml/src/ggml.c:6142: mask->ne[0] == a->ne[0]
    [New LWP 3089486] [New LWP 3089488] ... [New LWP 3089799]   (several hundred "[New LWP ...]" lines omitted)
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    0x00007fffbf2b8da6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
    #0  0x00007fffbf2b8da6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
    #1  0x00007fffbf2b8e98 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
    #2  0x00007fffbf89c678 in PyThread_acquire_lock_timed () from /lib64/libpython3.9.so.1.0
    #3  0x00007fffbf89cdd9 in lock_PyThread_acquire_lock () from /lib64/libpython3.9.so.1.0
    #4  0x00007fffbf8aef4f in method_vectorcall_VARARGS_KEYWORDS () from /lib64/libpython3.9.so.1.0
    #5  0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
    #6  0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
    #7  0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
    #8  0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
    #9  0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
    #10 0x00007fffbf87c693 in function_code_fastcall () from /lib64/libpython3.9.so.1.0
    #11 0x00007fffbf8bc0ea in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
    #12 0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
    #13 0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
    #14 0x00007fffbf8f8dfd in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
    #15 0x00007fffbf8b8713 in _PyEval_EvalCode () from /lib64/libpython3.9.so.1.0
    #16 0x00007fffbf8b970f in _PyEval_EvalCodeWithName () from /lib64/libpython3.9.so.1.0
    #17 0x00007fffbf8b9743 in PyEval_EvalCode () from /lib64/libpython3.9.so.1.0
    #18 0x00007fffbf96adad in run_eval_code_obj () from /lib64/libpython3.9.so.1.0
    #19 0x00007fffbf97eb0a in run_mod () from /lib64/libpython3.9.so.1.0
    #20 0x00007fffbf80e2f6 in pyrun_file.cold () from /lib64/libpython3.9.so.1.0
    #21 0x00007fffbf97f325 in PyRun_SimpleFileExFlags () from /lib64/libpython3.9.so.1.0
    #22 0x00007fffbf97f7d2 in Py_RunMain () from /lib64/libpython3.9.so.1.0
    #23 0x00007fffbf97f919 in Py_BytesMain () from /lib64/libpython3.9.so.1.0
    #24 0x00007fffbe793d85 in __libc_start_main () from /lib64/libc.so.6
    #25 0x000055555555475e in _start ()
    [Inferior 1 (process 3089452) detached]
    /appl/soft/ai/bin/apptainer_wrapper: line 38: 3089434 Aborted    apptainer --silent exec $SING_FLAGS $SING_IMAGE "${@:2}"

Ragas version:
Python version:

Code to Reproduce

    from datasets import Dataset  # required for Dataset.from_dict below
    from ragas.metrics import (answer_relevancy, faithfulness, context_recall, context_precision)
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from huggingface_hub import hf_hub_download, snapshot_download
    from langchain.callbacks.manager import CallbackManager
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    from langchain_community.llms import LlamaCpp

    data_samples = {
        'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
        'answer': ['The first superbowl was held on January 15, 1967', 'The most super bowls have been won by The New England Patriots'],
        'contexts': [['The Super Bowl....season since 1966,', 'replacing the NFL...in February.'],
                     ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference']],
        'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
    }
    dataset = Dataset.from_dict(data_samples)

    embedding_model_name = "sentence-transformers/msmarco-bert-base-dot-v5"
    embed_model = HuggingFaceEmbedding(model_name=embedding_model_name)

    from langchain_core.language_models import BaseLanguageModel
    from langchain_core.embeddings import Embeddings

    critic_llm = LlamaCpp(
        model_path="./Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
        verbose=True,
    )

    from ragas import evaluate

    result_context_precision = evaluate(dataset, metrics=[context_precision], llm=critic_llm)
    result_context_recall = evaluate(dataset, metrics=[context_recall], llm=critic_llm)

    results = result_context_precision | result_context_recall

dosubot[bot] commented 1 month ago

Hello, @SalwaMostafa! I'm a bot here to help you with bug fixes, questions, and contributions. Let's work together to resolve the issue you're facing.

To resolve the issue you're encountering when using a local LLM in the evaluate function with Langchain, you should indeed use the LangchainLLMWrapper and LangchainEmbeddingsWrapper. Here is how you can do it:

  1. Initialize your Langchain LLM and Embeddings:

    from langchain_community.llms import LlamaCpp
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain.callbacks.manager import CallbackManager
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

    critic_llm = LlamaCpp(
        model_path="./Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
        verbose=True,
    )

    embedding_model_name = "sentence-transformers/msmarco-bert-base-dot-v5"
    embed_model = HuggingFaceEmbeddings(model_name=embedding_model_name)
  2. Wrap the Langchain LLM and Embeddings with the appropriate wrappers:

    from ragas.llms import LangchainLLMWrapper
    from ragas.embeddings import LangchainEmbeddingsWrapper
    
    wrapped_llm = LangchainLLMWrapper(critic_llm)
    wrapped_embeddings = LangchainEmbeddingsWrapper(embed_model)
  3. Pass the wrapped LLM and Embeddings to the evaluate function:

    from ragas.metrics import context_precision, context_recall
    from ragas import evaluate
    from datasets import Dataset
    
    data_samples = {
        'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
        'answer': ['The first superbowl was held on January 15, 1967', 'The most super bowls have been won by The New England Patriots'],
        'contexts' : [['The Super Bowl....season since 1966,','replacing the NFL...in February.'], 
        ['The Green Bay Packers...Green Bay, Wisconsin.','The Packers compete...Football Conference']],
        'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
    }
    dataset = Dataset.from_dict(data_samples)
    
    result_context_precision = evaluate(dataset, metrics=[context_precision], llm=wrapped_llm, embeddings=wrapped_embeddings)
    result_context_recall = evaluate(dataset, metrics=[context_recall], llm=wrapped_llm, embeddings=wrapped_embeddings)
    
    results = result_context_precision | result_context_recall

By following these steps, you ensure that your custom LLM and Embeddings are correctly wrapped and used within the evaluate function. The LangchainLLMWrapper class in the ragas library is designed to adapt Langchain's BaseLanguageModel interface, implementing methods for both synchronous and asynchronous text generation, which should resolve the errors you are encountering [1][2][3].
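
As a quick way to isolate the failure, you can also exercise the unwrapped Langchain LLM directly before running the full evaluation. A minimal sketch (the prompt is an arbitrary example): if this single call already aborts with the same GGML_ASSERT, the problem lies in the llama-cpp-python setup rather than in the ragas wrappers.

    # Sanity check: call the raw LlamaCpp model outside of ragas.
    # If this one call crashes with the same GGML_ASSERT, the bug is in the
    # llama-cpp-python / model configuration, not in the ragas wrappers.
    print(critic_llm.invoke("Answer in one word: what color is the sky?"))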

To continue talking to Dosu, mention @dosu.

SalwaMostafa commented 1 month ago

@dosu I tried the above steps, but I still get the same error.

dosubot[bot] commented 1 month ago

To further troubleshoot the GGML_ASSERT error, you can try the following steps:

  1. Ensure Correct Wrapping of LLM and Embeddings: Make sure that the LLM and embeddings are correctly wrapped using LangchainLLMWrapper and LangchainEmbeddingsWrapper.

  2. Check for Compatibility Issues: Verify that there are no compatibility issues between the LlamaCpp model and the LangchainLLMWrapper. If there are known issues, they should be addressed first.

  3. Debugging the GGML_ASSERT Error: The GGML_ASSERT error typically indicates an issue with the internal state or configuration of the LlamaCpp model. Here are some common troubleshooting steps:

    • Check Model Path and Configuration: Ensure that the model path and configuration parameters (e.g., n_gpu_layers, n_batch, n_ctx, f16_kv) are correctly set.
    • Reduce Batch Size: Try reducing the batch size (n_batch) to see if it resolves the issue.
    • Update Dependencies: Ensure that all dependencies, including llama-cpp-python, are up to date.
  4. Example Code with Wrappers: Here is the complete example code with the necessary wrappers:

    from ragas.metrics import context_precision, context_recall
    from ragas import evaluate
    from datasets import Dataset
    from langchain_community.llms import LlamaCpp
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain.callbacks.manager import CallbackManager
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    from ragas.llms import LangchainLLMWrapper
    from ragas.embeddings import LangchainEmbeddingsWrapper

    # Initialize the LLM and Embeddings
    critic_llm = LlamaCpp(
        model_path="./Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
        verbose=True,
    )

    embedding_model_name = "sentence-transformers/msmarco-bert-base-dot-v5"
    embed_model = HuggingFaceEmbeddings(model_name=embedding_model_name)

    # Wrap the LLM and Embeddings
    wrapped_llm = LangchainLLMWrapper(critic_llm)
    wrapped_embeddings = LangchainEmbeddingsWrapper(embed_model)

    # Create the dataset
    data_samples = {
        'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
        'answer': ['The first superbowl was held on January 15, 1967', 'The most super bowls have been won by The New England Patriots'],
        'contexts': [['The Super Bowl....season since 1966,', 'replacing the NFL...in February.'],
                     ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference']],
        'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
    }
    dataset = Dataset.from_dict(data_samples)

    # Evaluate with wrapped LLM and Embeddings
    result_context_precision = evaluate(dataset, metrics=[context_precision], llm=wrapped_llm, embeddings=wrapped_embeddings)
    result_context_recall = evaluate(dataset, metrics=[context_recall], llm=wrapped_llm, embeddings=wrapped_embeddings)

    results = result_context_precision | result_context_recall

If the issue persists, it might be helpful to check for any specific configurations or parameters required for the LlamaCpp model to work seamlessly with the ragas evaluation framework [1].
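
One further check worth trying: this GGML_ASSERT fires inside llama.cpp's attention masking, and one possible trigger is a prompt that exceeds the context window configured via n_ctx. A minimal sketch of checking for that, assuming the Langchain LlamaCpp object's get_num_tokens method (the sample prompt is illustrative only):

    # Hypothetical guard: confirm an evaluation prompt fits in the context
    # window before running evaluate. The prompt below is illustrative.
    sample_prompt = "Given question, answer and context, verify if the context was useful ..."
    n_tokens = critic_llm.get_num_tokens(sample_prompt)
    if n_tokens > 2048:  # must stay within the n_ctx passed to LlamaCpp
        print(f"Prompt is {n_tokens} tokens; raise n_ctx or shorten the contexts")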

To continue talking to Dosu, mention @dosu.

jjmachan commented 1 month ago

@SalwaMostafa were you able to fix this?

venkat-kittu commented 1 month ago

(Quoted Dosu's suggestion above in full.)

I tried following this suggestion in my code, with Ollama as the LLM via Langchain, but I am getting a timeout error. Below is my code:

from datasets import Dataset 
from ragas.metrics import faithfulness, answer_correctness
from ragas.run_config import RunConfig
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

from langchain_community.llms import Ollama
from langchain_community.embeddings import HuggingFaceEmbeddings

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
llm = Ollama(model="llama3")
dataset = Dataset.from_dict(data_samples)

llm_wrapper = LangchainLLMWrapper(llm)
embeddings_wrapper = LangchainEmbeddingsWrapper(embeddings)
score = evaluate(dataset,llm=llm_wrapper,embeddings=embeddings_wrapper,metrics=[answer_correctness])

With the above code, I get the error below:

    TimeoutError                              Traceback (most recent call last)
    ----> 1 score = evaluate(dataset, llm=llm_wrapper, embeddings=embeddings_wrapper, metrics=[answer_correctness])

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\evaluation.py:247, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, run_config, raise_exceptions, column_map)
        244 if not evaluation_group_cm.ended:
        245     evaluation_rm.on_chain_error(e)
    --> 247 raise e

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\evaluation.py:227, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, run_config, raise_exceptions, column_map)
        226 # get the results
    --> 227 results = executor.results()
        228 if results == []:
        229     raise ExceptionInRunner()

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\executor.py:107, in Executor.results(self)
    --> 107 results = asyncio.run(_aresults())

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\nest_asyncio.py:31, in _patch_asyncio.<locals>.run(main, debug)
    ---> 31 return loop.run_until_complete(task)

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\nest_asyncio.py:99, in _patch_loop.<locals>.run_until_complete(self, future)
    ---> 99 return f.result()

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\futures.py:178, in Future.result(self)
    --> 178 raise self._exception

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\tasks.py:280, in Task.__step(failed resolving arguments)
    --> 280 result = coro.send(None)

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\executor.py:102, in Executor.results.<locals>._aresults()
    --> 102 r = await future

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\tasks.py:619, in as_completed.<locals>._wait_for_one()
    --> 619 return f.result()

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\futures.py:178, in Future.result(self)
    --> 178 raise self._exception

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\tasks.py:280, in Task.__step(failed resolving arguments)
    --> 280 result = coro.send(None)

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\executor.py:34, in as_completed.<locals>.sema_coro(coro)
    ---> 34 return await coro

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\executor.py:59, in Executor.wrap_callable_with_index.<locals>.wrapped_callable_async(*args, **kwargs)
    ---> 59 raise e

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\executor.py:53, in Executor.wrap_callable_with_index.<locals>.wrapped_callable_async(*args, **kwargs)
    ---> 53 result = await callable(*args, **kwargs)

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\metrics\base.py:127, in Metric.ascore(self, row, callbacks, thread_timeout)
    --> 127 raise e

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\site-packages\ragas\metrics\base.py:120, in Metric.ascore(self, row, callbacks, thread_timeout)
    --> 120 score = await asyncio.wait_for(
        121     self._ascore(row=row, callbacks=group_cm),
        122     timeout=thread_timeout,
        123 )

    File ~\AppData\Local\anaconda3\envs\env_llm\lib\asyncio\tasks.py:501, in wait_for(fut, timeout, loop)
        500 await _cancel_and_wait(fut, loop=loop)
    --> 501 raise exceptions.TimeoutError()

    TimeoutError:

jjmachan commented 1 month ago

@SalwaMostafa the timeout error should have been fixed in the latest version. This doc could also help: Max Workers, Timeouts, Retries and more with RunConfig | Ragas.

But this could be an Ollama <> Ragas integration issue too; I will check that.
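
For reference, a minimal sketch of raising the per-metric timeout that venkat-kittu's traceback runs into, using the RunConfig already imported in that snippet (the timeout and max_workers values are arbitrary examples):

    from ragas import evaluate
    from ragas.run_config import RunConfig

    # Give each metric call more time before asyncio.wait_for raises
    # TimeoutError; slow local models often need far more than the default.
    run_config = RunConfig(timeout=600, max_workers=4)
    score = evaluate(
        dataset,
        llm=llm_wrapper,
        embeddings=embeddings_wrapper,
        metrics=[answer_correctness],
        run_config=run_config,
    )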