KruxAI / ragbuilder

A toolkit to create optimal Production-ready RAG setup for your data
https://docs.ragbuilder.io
Apache License 2.0
1.15k stars 101 forks

RAGAS error #70

Open ibagur opened 6 days ago

ibagur commented 6 days ago

I am running ragbuilder on macOS 14.6.1, built from the Docker image and run as a container. I followed the 'Getting started' example using the provided blog post, with a small set of selected parameters for a quicker run. I have tried both OpenAI models (gpt-3.5-turbo and gpt-4o-mini) and local Ollama models. At some point, after the embeddings are created and synthetic data generation starts, the script always raises an error before completing the process, apparently related to a RAGAS issue.

I am not very familiar with the RAGAS framework, so I have not tried to debug it myself yet. I am pasting the error log here in case anyone can shed some light:

Generating:  10%|#         | 2/20 [00:43<06:05, 20.28s/it]
2024-10-16 18:39:32 [INFO] 2024-10-16 16:39:32 - common.py - INFO:     192.168.65.1:43819 - "GET / HTTP/1.1" 200 OK
2024-10-16 18:39:32 [INFO] 2024-10-16 16:39:32 - common.py - INFO:     192.168.65.1:43819 - "GET /static/ragbuilder_light.png HTTP/1.1" 200 OK
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - 
Generating:  10%|#         | 2/20 [00:48<07:14, 24.14s/it]
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - Exception in thread Thread-4:
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - Traceback (most recent call last):
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     self.run()
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/executor.py", line 96, in run
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     results = self.loop.run_until_complete(self._aresults())
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/executor.py", line 84, in _aresults
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raise e
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/executor.py", line 79, in _aresults
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     r = await future
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -         ^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/asyncio/tasks.py", line 631, in _wait_for_one
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return f.result()  # May raise f.exception().
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/executor.py", line 38, in sema_coro
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await coro
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/executor.py", line 112, in wrapped_callable_async
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return counter, await callable(*args, **kwargs)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 144, in evolve
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await self.generate_datarow(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 213, in generate_datarow
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     if i - 1 < len(current_nodes.nodes)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -        ~~^~~
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - TypeError: unsupported operand type(s) for -: 'str' and 'int'
2024-10-16 18:39:36 [ERROR] 2024-10-16 16:39:36 - ragbuilder.py - Synthetic test data generation failed: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - INFO:     192.168.65.1:48009 - "POST /rbuilder HTTP/1.1" 500 Internal Server Error
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - ERROR:    Exception in ASGI application
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - Traceback (most recent call last):
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     result = await app(  # type: ignore[func-returns-value]
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await self.app(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await super().__call__(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await self.middleware_stack(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raise exc
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await self.app(scope, receive, _send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raise exc
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await app(scope, receive, sender)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await self.middleware_stack(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await route.handle(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await self.app(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raise exc
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     await app(scope, receive, sender)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     response = await f(request)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -                ^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raw_response = await run_endpoint_function(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await run_in_threadpool(dependant.call, **values)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await anyio.to_thread.run_sync(func, *args)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await get_async_backend().run_sync_in_worker_thread(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return await future
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 914, in run
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     result = context.run(func, *args)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/ragbuilder.py", line 475, in rbuilder_route
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     result = parse_config(project_data.model_dump(), db)
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/ragbuilder.py", line 599, in parse_config
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     f_name=generate_data.generate_data(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/generate_data.py", line 94, in generate_data
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     testset = generator.generate_with_langchain_docs(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/generator.py", line 179, in generate_with_langchain_docs
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     return self.generate(
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -            ^^^^^^^^^^^^^^
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/generator.py", line 274, in generate
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py -     raise ExceptionInRunner()
2024-10-16 18:39:36 [INFO] 2024-10-16 16:39:36 - common.py - ragas.exceptions.ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
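
For readers following the traceback: the failing expression is `i - 1` in `generate_datarow`, and the `TypeError` says that `i` was a `str`. One plausible reading, which is an assumption and not confirmed by the log, is that the indices being iterated arrived as strings such as `"1"` instead of integers. The sketch below reproduces that failure mode on plain lists and shows a defensive coercion. `select_nodes` and the sample data are hypothetical and are not part of RAGAS or ragbuilder.

```python
def select_nodes(nodes, indices):
    """Pick 1-based indices from nodes, tolerating indices that arrive as strings."""
    selected = []
    for i in indices:
        try:
            idx = int(i)  # coerce "1" -> 1
        except (TypeError, ValueError):
            continue  # skip values that cannot be read as an integer
        if 1 <= idx <= len(nodes):
            selected.append(nodes[idx - 1])
    return selected


nodes = ["chunk-a", "chunk-b", "chunk-c"]

# Same shape as the expression in the traceback: a str index breaks `i - 1`.
try:
    [nodes[i - 1] for i in ["1", "2"] if i - 1 < len(nodes)]
except TypeError as exc:
    error_message = str(exc)

# With coercion, string indices work and unusable ones are dropped.
picked = select_nodes(nodes, ["1", 3, "x", 7])
```

If this reading is right, the error would depend on what the generator LLM returns for a given row, which might explain why a run can fail partway through generation.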

aravind10x commented 5 days ago

Hi @ibagur, can you please share the snippet from the log where it prints the config? I'd like to try to reproduce this error by selecting the same options you did.

ibagur commented 18 hours ago

Hi @aravind10x, I rebuilt the Docker image and ran another test. The synthetic data was generated, but I got several errors when running the whole process. This time I kept the whole log, which I paste here. I did not use a very complex set of settings, and ran only 10 runs of Bayesian optimization.


[INFO] 2024-10-22 15:13:22 - common.py - USER_AGENT environment variable not set, consider setting it to identify your requests.
[INFO] 2024-10-22 15:13:24 - common.py - USER_AGENT environment variable not set, consider setting it to identify your requests.
[INFO] 2024-10-22 15:13:26 - common.py - WARNING: faiss must be imported for indexing
[INFO] 2024-10-22 15:13:29 - ragbuilder.py - LOG_FILENAME = /ragbuilder/logs/2024-10-22_15-13-22.log
[INFO] 2024-10-22 15:13:29 - common.py - INFO:     Started server process [1]
[INFO] 2024-10-22 15:13:29 - common.py - INFO:     Waiting for application startup.
[INFO] 2024-10-22 15:13:29 - common.py - INFO:     Application startup complete.
[INFO] 2024-10-22 15:13:29 - common.py - INFO:     Uvicorn running on http://0.0.0.0:8005 (Press CTRL+C to quit)
[INFO] 2024-10-22 15:13:40 - common.py - INFO:     192.168.65.1:35238 - "GET / HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:13:40 - common.py - INFO:     192.168.65.1:35238 - "GET /static/main.js?v=0.0.18 HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:13:41 - common.py - INFO:     192.168.65.1:35238 - "GET /static/ragbuilder_light.png HTTP/1.1" 304 Not Modified
[INFO] 2024-10-22 15:13:44 - common.py - INFO:     192.168.65.1:35238 - "GET / HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:13:59 - ragbuilder.py - Source data  validity: False
[INFO] 2024-10-22 15:13:59 - common.py - INFO:     192.168.65.1:43206 - "POST /check_source_data HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:20:59 - ragbuilder.py - Source data  validity: False
[INFO] 2024-10-22 15:20:59 - common.py - INFO:     192.168.65.1:53411 - "POST /check_source_data HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:30:19 - ragbuilder.py - Source data  validity: False
[INFO] 2024-10-22 15:30:19 - common.py - INFO:     192.168.65.1:65531 - "POST /check_source_data HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:30:41 - ragbuilder.py - Source data https://lilianweng.github.io/posts/2023-06-23-agent/ validity: True
[INFO] 2024-10-22 15:30:41 - ragbuilder.py - Estimating source data size...
[INFO] 2024-10-22 15:30:41 - ragbuilder.py - Source data size: 40809, exceeds_threshold: False
[INFO] 2024-10-22 15:30:41 - common.py - INFO:     192.168.65.1:19142 - "POST /check_source_data HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:30:56 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:30:56 - common.py - INFO:     192.168.65.1:59707 - "POST /check_test_data HTTP/1.1" 200 OK
[INFO] 2024-10-22 15:33:21 - ragbuilder.py - enable_analytics = True
[INFO] 2024-10-22 15:33:22 - ragbuilder.py - Initiating parsing config: {'description': 'QA Chatbot', 'sourceData': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'useSampling': False, 'compareTemplates': False, 'includeNonTemplated': True, 'selectedTemplates': [], 'chunkingStrategy': {'MarkdownHeaderTextSplitter': False, 'HTMLHeaderTextSplitter': True, 'SemanticChunker': True, 'RecursiveCharacterTextSplitter': False, 'CharacterTextSplitter': False}, 'chunkSize': {'min': 500, 'max': 2000}, 'embeddingModel': {'OpenAI:text-embedding-3-small': False, 'OpenAI:text-embedding-3-large': True, 'OpenAI:text-embedding-ada-002': False, 'HuggingFace': False, 'AzureOAI': False, 'GoogleVertexAI': False, 'Ollama': False}, 'huggingfaceEmbeddingModel': '', 'azureOAIEmbeddingModel': '', 'googleVertexAIEmbeddingModel': '', 'ollamaEmbeddingModel': '', 'vectorDB': 'chromaDB', 'retriever': {'vectorSimilarity': True, 'vectorMMR': True, 'bm25Retriever': False, 'colbertRetriever': False, 'multiQuery': False, 'parentDocFullDoc': False, 'parentDocLargeChunk': False}, 'topK': {'search_k_5': True, 'search_k_10': False, 'search_k_20': False}, 'llm': {'OpenAI:gpt-4o-mini': True, 'OpenAI:gpt-4o': False, 'OpenAI:gpt-3.5-turbo': False, 'OpenAI:gpt-4-turbo': False, 'HuggingFace': False, 'Groq': False, 'AzureOAI': False, 'GoogleVertexAI': False, 'Ollama': False}, 'huggingfaceLLMModel': '', 'groqLLMModel': '', 'azureOAILLMModel': '', 'googleVertexAILLMModel': '', 'ollamaLLMModel': '', 'generateSyntheticData': True, 'optimization': 'bayesianOptimization', 'evalFramework': 'RAGAS', 'evalEmbedding': 'OpenAI:text-embedding-3-large', 'evalLLM': 'OpenAI:gpt-4o-mini', 'sotaEmbeddingModel': None, 'sotaLLMModel': None, 'compressors': {'mixedbread-ai/mxbai-rerank-base-v1': False, 'mixedbread-ai/mxbai-rerank-large-v1': False, 'BAAI/bge-reranker-base': False, 'flashrank': False, 'cohere': False, 'jina': False, 'colbert': False, 'rankllm': True, 'LongContextReorder': False, 'EmbeddingsRedundantFilter': 
False, 'EmbeddingsClusteringFilter': False, 'LLMChainFilter': False}, 'syntheticDataGeneration': {'testSize': '20', 'criticLLM': 'OpenAI:gpt-4o-mini', 'generatorLLM': 'OpenAI:gpt-4o-mini', 'generatorEmbedding': 'OpenAI:text-embedding-3-large'}, 'testDataPath': None, 'existingSynthDataPath': None, 'testSize': None, 'criticLLM': None, 'generatorLLM': None, 'generatorEmbedding': None, 'numRuns': '10', 'nJobs': 1, 'dataProcessors': []}
[INFO] 2024-10-22 15:33:22 - ragbuilder.py - Disabled options: ['useSampling', 'compareTemplates', 'mixedbread-ai/mxbai-rerank-base-v1', 'mixedbread-ai/mxbai-rerank-large-v1', 'BAAI/bge-reranker-base', 'flashrank', 'cohere', 'jina', 'colbert', 'LongContextReorder', 'EmbeddingsRedundantFilter', 'EmbeddingsClusteringFilter', 'LLMChainFilter', 'OpenAI:gpt-4o', 'OpenAI:gpt-3.5-turbo', 'OpenAI:gpt-4-turbo', 'HuggingFace', 'Groq', 'AzureOAI', 'GoogleVertexAI', 'Ollama', 'search_k_10', 'search_k_20', 'bm25Retriever', 'colbertRetriever', 'multiQuery', 'parentDocFullDoc', 'parentDocLargeChunk', 'OpenAI:text-embedding-3-small', 'OpenAI:text-embedding-ada-002', 'HuggingFace', 'AzureOAI', 'GoogleVertexAI', 'Ollama', 'MarkdownHeaderTextSplitter', 'RecursiveCharacterTextSplitter', 'CharacterTextSplitter']
[INFO] 2024-10-22 15:33:22 - sampler.py - Sampling not required for https://lilianweng.github.io/posts/2023-06-23-agent/
[INFO] 2024-10-22 15:33:22 - ragbuilder.py - No data processors selected. Using original data.
[INFO] 2024-10-22 15:33:22 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:33:22 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:33:22 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:33:22 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:33:22 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:33:22 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:33:22 - generate_data.py - Loading docs...
[INFO] 2024-10-22 15:33:22 - loader.py - ragbuilder_loader Invoked:{'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'return_code': False}
[INFO] 2024-10-22 15:33:22 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:33:22 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:33:22 - loader.py - ragbuilder_url_loader_exec Invoked
[INFO] 2024-10-22 15:33:22 - generate_data.py - Completed loading docs
[INFO] 2024-10-22 15:33:26 - generate_data.py - Initiating synthetic data generation...
[INFO] 2024-10-22 15:33:26 - common.py - 
embedding nodes:   0%|          | 0/22 [00:00<?, ?it/s]
[INFO] 2024-10-22 15:33:26 - common.py - 
embedding nodes:   5%|4         | 1/22 [00:00<00:17,  1.22it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  14%|#3        | 3/22 [00:01<00:05,  3.22it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  18%|#8        | 4/22 [00:01<00:04,  4.05it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  23%|##2       | 5/22 [00:01<00:03,  5.00it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  27%|##7       | 6/22 [00:01<00:02,  5.87it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  41%|####      | 9/22 [00:01<00:01, 10.09it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  68%|######8   | 15/22 [00:01<00:00, 20.40it/s]
[INFO] 2024-10-22 15:33:27 - common.py - 
embedding nodes:  82%|########1 | 18/22 [00:01<00:00, 21.32it/s]
[INFO] 2024-10-22 15:33:28 - common.py - 
embedding nodes:  95%|#########5| 21/22 [00:02<00:00, 15.02it/s]
[INFO] 2024-10-22 15:33:28 - common.py - 
Generating:   0%|          | 0/20 [00:00<?, ?it/s]
[INFO] 2024-10-22 15:33:45 - common.py - 
Generating:   5%|5         | 1/20 [00:17<05:31, 17.45s/it]
[INFO] 2024-10-22 15:34:04 - common.py - 
Generating:  10%|#         | 2/20 [00:35<05:23, 17.97s/it]
[INFO] 2024-10-22 15:34:26 - common.py - 
Generating:  15%|#5        | 3/20 [00:58<05:38, 19.91s/it]
[INFO] 2024-10-22 15:34:33 - common.py - 
Generating:  20%|##        | 4/20 [01:05<04:00, 15.00s/it]
[INFO] 2024-10-22 15:34:44 - common.py - 
Generating:  25%|##5       | 5/20 [01:16<03:21, 13.41s/it]
[INFO] 2024-10-22 15:35:26 - common.py - 
Generating:  30%|###       | 6/20 [01:57<05:23, 23.08s/it]
[INFO] 2024-10-22 15:35:39 - common.py - 
Generating:  35%|###5      | 7/20 [02:10<04:17, 19.80s/it]
[INFO] 2024-10-22 15:38:16 - common.py - 
Generating:  40%|####      | 8/20 [04:48<12:43, 63.59s/it]
[INFO] 2024-10-22 15:38:25 - common.py - 
Generating:  45%|####5     | 9/20 [04:57<08:31, 46.51s/it]
[INFO] 2024-10-22 15:38:34 - common.py - 
Generating:  50%|#####     | 10/20 [05:06<05:50, 35.01s/it]
[INFO] 2024-10-22 15:38:43 - common.py - 
Generating:  55%|#####5    | 11/20 [05:14<04:01, 26.85s/it]
[INFO] 2024-10-22 15:38:52 - common.py - 
Generating:  60%|######    | 12/20 [05:24<02:52, 21.58s/it]
[INFO] 2024-10-22 15:39:11 - common.py - 
Generating:  65%|######5   | 13/20 [05:43<02:25, 20.85s/it]
[INFO] 2024-10-22 15:39:20 - common.py - 
Generating:  70%|#######   | 14/20 [05:51<01:42, 17.07s/it]
[INFO] 2024-10-22 15:39:26 - common.py - 
Generating:  75%|#######5  | 15/20 [05:57<01:08, 13.74s/it]
[INFO] 2024-10-22 15:39:33 - common.py - 
Generating:  80%|########  | 16/20 [06:05<00:47, 11.89s/it]
[INFO] 2024-10-22 15:39:41 - common.py - 
Generating:  85%|########5 | 17/20 [06:12<00:31, 10.47s/it]
[INFO] 2024-10-22 15:39:47 - common.py - 
Generating:  90%|######### | 18/20 [06:19<00:18,  9.26s/it]
[INFO] 2024-10-22 15:40:31 - common.py - 
Generating:  95%|#########5| 19/20 [07:03<00:19, 19.70s/it]
[INFO] 2024-10-22 15:40:43 - common.py - 
Generating: 100%|##########| 20/20 [07:14<00:00, 17.32s/it]
[INFO] 2024-10-22 15:40:43 - common.py - 
Generating: 100%|##########| 20/20 [07:14<00:00, 21.75s/it]
[INFO] 2024-10-22 15:40:43 - generate_data.py - Completed synthetic data generation
[INFO] 2024-10-22 15:40:43 - generate_data.py - Writing to csv file: rag_test_data_1729611643.948248.csv
[INFO] 2024-10-22 15:40:43 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Saving hashmap for synthetic data: rag_test_data_1729611643.948248.csv ...
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Saved hashmap for synthetic data: rag_test_data_1729611643.948248.csv
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Synthetic test data generation completed: rag_test_data_1729611643.948248.csv
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Saving run config details in db...
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Saved run config details in db
[INFO] 2024-10-22 15:40:44 - ragbuilder.py - Using Bayesian optimization to find optimal RAG configs...
[INFO] 2024-10-22 15:40:44 - executor.py - other_embedding=[]
[INFO] 2024-10-22 15:40:44 - executor.py - other_llm=[]
[INFO] 2024-10-22 15:40:44 - executor.py - Initializing RAG parameter set...
[INFO] 2024-10-22 15:40:44 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:40:44 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:40:44 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:40:44 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:40:44 - executor.py - Number of RAG combinations : 260
[INFO] 2024-10-22 15:40:44 - executor.py - Running Bayesian optimization...
[INFO] 2024-10-22 15:40:44 - common.py - /usr/local/lib/python3.12/site-packages/ragbuilder/executor.py:316: ExperimentalWarning: RetryFailedTrialCallback is experimental (supported from v2.8.0). The interface can change in the future.
[INFO] 2024-10-22 15:40:44 - common.py -   failed_trial_callback=optuna.storages.RetryFailedTrialCallback(max_retry=3),
[INFO] 2024-10-22 15:40:44 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:40:44 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:40:44 - langchain_templates.py - n_retrievers: 2
[INFO] 2024-10-22 15:40:44 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:40:44 - executor.py - Running: 1/10
[INFO] 2024-10-22 15:40:44 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:40:44 - getCode.py - Generating code...
[INFO] 2024-10-22 15:40:44 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:40:44 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:40:44 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:40:44 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:40:44 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:40:44 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:40:44 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:40:44 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:40:44 - langchain_chunking.py - RecursiveCharacterTextSplitter Invoked
[INFO] 2024-10-22 15:40:44 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:40:44 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611645206
[INFO] 2024-10-22 15:40:44 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611645206', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:40:44 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:40:44 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:40:44 - retriever.py - Parent Document (Full) Retriever Invoked
[INFO] 2024-10-22 15:40:44 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:40:44 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:40:48 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:40:48 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:40:48 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:40:48 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:40:48 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:40:48 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:40:48 - before_sleep.py - Retrying ragbuilder.executor._exec in 2.531438118438982 seconds as it returned None.
[INFO] 2024-10-22 15:40:54 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:40:54 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:40:54 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:40:54 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:40:54 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:40:54 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:40:54 - before_sleep.py - Retrying ragbuilder.executor._exec in 0.307984489730841 seconds as it returned None.
[INFO] 2024-10-22 15:40:58 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:40:58 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:40:58 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:40:58 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:40:58 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:40:58 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[ERROR] 2024-10-22 15:40:58 - executor.py - Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedd789160 state=finished returned NoneType>]
[ERROR] 2024-10-22 15:40:58 - executor.py - Error while evaluating config: {'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}
[ERROR] 2024-10-22 15:40:58 - executor.py - Error: Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedd789160 state=finished returned NoneType>]
[INFO] 2024-10-22 15:40:58 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:40:58 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:40:58 - langchain_templates.py - n_retrievers: 2
[INFO] 2024-10-22 15:40:58 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:40:58 - executor.py - Running: 2/10
[INFO] 2024-10-22 15:40:58 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:40:58 - getCode.py - Generating code...
[INFO] 2024-10-22 15:40:58 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:40:58 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:40:58 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:40:58 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:40:58 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:40:58 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:40:58 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:40:58 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:40:58 - langchain_chunking.py - RecursiveCharacterTextSplitter Invoked
[INFO] 2024-10-22 15:40:58 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:40:58 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611659343
[INFO] 2024-10-22 15:40:58 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611659343', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:40:58 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:40:58 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:40:58 - retriever.py - Multi Query Retriever Invoked
[INFO] 2024-10-22 15:40:58 - retriever.py - Parent Document (Full) Retriever Invoked
[INFO] 2024-10-22 15:40:58 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:40:58 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:41:02 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:02 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:02 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:02 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:02 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:02 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:02 - before_sleep.py - Retrying ragbuilder.executor._exec in 0.5744388272846028 seconds as it returned None.
[INFO] 2024-10-22 15:41:08 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:08 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:08 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:08 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:08 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:08 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:08 - before_sleep.py - Retrying ragbuilder.executor._exec in 1.8482921471814078 seconds as it returned None.
[INFO] 2024-10-22 15:41:18 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:18 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:18 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:18 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:18 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:18 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[ERROR] 2024-10-22 15:41:18 - executor.py - Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffee484c200 state=finished returned NoneType>]
[ERROR] 2024-10-22 15:41:18 - executor.py - Error while evaluating config: {'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}
[ERROR] 2024-10-22 15:41:18 - executor.py - Error: Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffee484c200 state=finished returned NoneType>]
[INFO] 2024-10-22 15:41:18 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:41:18 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:41:18 - langchain_templates.py - n_retrievers: 1
[INFO] 2024-10-22 15:41:18 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:41:18 - executor.py - Running: 3/10
[INFO] 2024-10-22 15:41:18 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:41:18 - getCode.py - Generating code...
[INFO] 2024-10-22 15:41:18 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:41:18 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:41:18 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:41:18 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:41:18 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:41:18 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:41:18 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:41:18 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:41:18 - langchain_chunking.py - RecursiveCharacterTextSplitter Invoked
[INFO] 2024-10-22 15:41:18 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:41:18 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611678953
[INFO] 2024-10-22 15:41:18 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611678953', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:41:18 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:41:18 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:41:18 - retriever.py - Parent Document (Full) Retriever Invoked
[INFO] 2024-10-22 15:41:18 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:41:18 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:41:23 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:23 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:23 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:23 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:23 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:23 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:23 - before_sleep.py - Retrying ragbuilder.executor._exec in 1.9810705552750885 seconds as it returned None.
[INFO] 2024-10-22 15:41:28 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:28 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:28 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:28 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:28 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:28 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:28 - before_sleep.py - Retrying ragbuilder.executor._exec in 1.7067707367500415 seconds as it returned None.
[INFO] 2024-10-22 15:41:33 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:33 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:33 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:33 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:33 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:33 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[ERROR] 2024-10-22 15:41:33 - executor.py - Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffee4823260 state=finished returned NoneType>]
[ERROR] 2024-10-22 15:41:33 - executor.py - Error while evaluating config: {'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocFullDoc', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}
[ERROR] 2024-10-22 15:41:33 - executor.py - Error: Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffee4823260 state=finished returned NoneType>]
[INFO] 2024-10-22 15:41:33 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:41:33 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:41:33 - langchain_templates.py - n_retrievers: 0
[INFO] 2024-10-22 15:41:33 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'HTMLHeaderTextSplitter'}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:41:33 - executor.py - Running: 4/10
[INFO] 2024-10-22 15:41:33 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:41:33 - getCode.py - Generating code...
[INFO] 2024-10-22 15:41:33 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:41:33 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:41:33 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'HTMLHeaderTextSplitter'}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:41:33 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:41:33 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:41:33 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:41:33 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:41:33 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:41:33 - langchain_chunking.py - HTMLHeaderTextSplitter Invoked
[INFO] 2024-10-22 15:41:33 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:41:33 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611694204
[INFO] 2024-10-22 15:41:33 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611694204', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:41:33 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:41:33 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:41:33 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:41:33 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:41:35 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:35 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:35 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:35 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:35 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:35 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:35 - before_sleep.py - Retrying ragbuilder.executor._exec in 0.34100230601077897 seconds as it returned None.
[INFO] 2024-10-22 15:41:37 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:37 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:37 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:37 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:37 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:37 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:37 - before_sleep.py - Retrying ragbuilder.executor._exec in 2.1210324890735675 seconds as it returned None.
[INFO] 2024-10-22 15:41:40 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:40 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:40 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:40 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:40 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:40 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[ERROR] 2024-10-22 15:41:40 - executor.py - Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedcafaf90 state=finished returned NoneType>]
[ERROR] 2024-10-22 15:41:40 - executor.py - Error while evaluating config: {'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'HTMLHeaderTextSplitter'}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}
[ERROR] 2024-10-22 15:41:40 - executor.py - Error: Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedcafaf90 state=finished returned NoneType>]
[INFO] 2024-10-22 15:41:40 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:41:40 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:41:40 - langchain_templates.py - n_retrievers: 2
[INFO] 2024-10-22 15:41:40 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocLargeChunk', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:41:40 - executor.py - Running: 5/10
[INFO] 2024-10-22 15:41:40 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:41:40 - getCode.py - Generating code...
[INFO] 2024-10-22 15:41:40 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:41:40 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:41:40 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 500, 'chunk_overlap': 200}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocLargeChunk', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:41:40 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:41:40 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:41:40 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:41:40 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:41:40 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:41:40 - langchain_chunking.py - RecursiveCharacterTextSplitter Invoked
[INFO] 2024-10-22 15:41:40 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:41:40 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611701087
[INFO] 2024-10-22 15:41:40 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611701087', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:41:40 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:41:40 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:41:40 - retriever.py - Multi Query Retriever Invoked
[INFO] 2024-10-22 15:41:40 - retriever.py - Parent Document (Large Chunk) Retriever Invoked
[INFO] 2024-10-22 15:41:40 - langchain_chunking.py - RecursiveCharacterTextSplitter Invoked
[INFO] 2024-10-22 15:41:40 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:41:40 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:41:45 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:45 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:45 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:45 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:45 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:45 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:45 - before_sleep.py - Retrying ragbuilder.executor._exec in 1.8707802068605432 seconds as it returned None.
[INFO] 2024-10-22 15:41:53 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:53 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:53 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:53 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:53 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:53 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[INFO] 2024-10-22 15:41:53 - before_sleep.py - Retrying ragbuilder.executor._exec in 0.36517258772630923 seconds as it returned None.
[INFO] 2024-10-22 15:41:56 - common.py - Loading default rankllm model for language en
[INFO] 2024-10-22 15:41:56 - common.py - Default Model: gpt-4o
[INFO] 2024-10-22 15:41:56 - common.py - Loading RankLLMRanker model gpt-4o
[INFO] 2024-10-22 15:41:56 - common.py - You don't have the necessary dependencies installed to use RankLLMRanker.
[INFO] 2024-10-22 15:41:56 - common.py - Please install the necessary dependencies for RankLLMRanker by running `pip install "rerankers[rankllm]"` or `pip install "rerankers[all]" to install the dependencies for all reranker types.
[INFO] 2024-10-22 15:41:56 - common.py - An error occurred: 'NoneType' object has no attribute 'as_langchain_compressor'
[ERROR] 2024-10-22 15:41:56 - executor.py - Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedca75eb0 state=finished returned NoneType>]
[ERROR] 2024-10-22 15:41:56 - executor.py - Error while evaluating config: {'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'RecursiveCharacterTextSplitter', 'chunk_size': 1500, 'chunk_overlap': 600}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'multiQuery', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'parentDocLargeChunk', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}
[ERROR] 2024-10-22 15:41:56 - executor.py - Error: Error creating RAG object from generated code. ERROR: RetryError[<Future at 0xfffedca75eb0 state=finished returned NoneType>]
[INFO] 2024-10-22 15:41:56 - common.py - /usr/local/lib/python3.12/site-packages/optuna/study/_optimize.py:186: ExperimentalWarning: fail_stale_trials is experimental (supported from v2.9.0). The interface can change in the future.
[INFO] 2024-10-22 15:41:56 - common.py -   optuna.storages.fail_stale_trials(study)
[INFO] 2024-10-22 15:41:56 - langchain_templates.py - n_retrievers: 1
[INFO] 2024-10-22 15:41:56 - executor.py - Config raw={'framework': 'langchain', 'chunking_kwargs': {'chunk_strategy': 'SemanticChunker'}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'colbertRetriever', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'retrieval_model': 'OpenAI:gpt-4o-mini', 'compressors': ['rankllm'], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'run_id': 1729611202}

[INFO] 2024-10-22 15:41:56 - executor.py - Running: 6/10
[INFO] 2024-10-22 15:41:56 - executor.py - Initializing RAG object...
[INFO] 2024-10-22 15:41:56 - getCode.py - Generating code...
[INFO] 2024-10-22 15:41:56 - llmConfig.py - LLM Invoked
[INFO] 2024-10-22 15:41:56 - llmConfig.py - LLM Code Gen Invoked: OpenAI:gpt-4o-mini
[INFO] 2024-10-22 15:41:56 - loader.py - ragbuilder_loader Invoked:{'framework': 'langchain', 'retrieval_model': 'OpenAI:gpt-4o-mini', 'source_ids': [1], 'loader_kwargs': {'source': 'url', 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, 'chunking_kwargs': {'chunk_strategy': 'SemanticChunker'}, 'vectorDB_kwargs': {'vectorDB': 'chromaDB'}, 'embedding_kwargs': {'embedding_model': 'OpenAI:text-embedding-3-large'}, 'retriever_kwargs': {'retrievers': [{'retriever_type': 'vectorSimilarity', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'bm25Retriever', 'search_type': 'similarity', 'search_kwargs': '5'}, {'retriever_type': 'colbertRetriever', 'search_type': 'similarity', 'search_kwargs': '5'}], 'contextual_compression_retriever': True, 'document_compressor_pipeline': ['rankllm']}, 'input_path': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
[INFO] 2024-10-22 15:41:56 - loader.py - classify_path Invoked
[INFO] 2024-10-22 15:41:56 - loader.py - Source type identified: url
[INFO] 2024-10-22 15:41:56 - loader.py - ragbuilder_url_loader Invoked
[INFO] 2024-10-22 15:41:56 - embedding.py - getEmbedding Invoked
[INFO] 2024-10-22 15:41:56 - embedding.py - OpenAIEmbeddings Invoked: OpenAI:text-embedding-3-large
[INFO] 2024-10-22 15:41:56 - langchain_chunking.py - SemanticChunker Invoked
[INFO] 2024-10-22 15:41:56 - vectordb.py - Chroma DB Loaded
[INFO] 2024-10-22 15:41:56 - vectordb.py - Chroma DB Index Created testindex-ragbuilder-1729611717124
[INFO] 2024-10-22 15:41:56 - common.py - {'code_string': "c=Chroma.from_documents(documents=splits, embedding=embedding, collection_name='testindex-ragbuilder-1729611717124', client_settings=chromadb.config.Settings(allow_reset=True))", 'import_string': 'from langchain_chroma import Chroma\nimport chromadb'}
[INFO] 2024-10-22 15:41:56 - retriever.py - Vector Retriever Invoked
[INFO] 2024-10-22 15:41:56 - retriever.py - BM25Retriever Retriever Invoked
[INFO] 2024-10-22 15:41:56 - retriever.py - Colbert Retriever Invoked
[INFO] 2024-10-22 15:41:56 - getCode.py - Codegen completed
[INFO] 2024-10-22 15:41:56 - executor.py - Creating RAG object from generated code...(this may take a while in some cases)
[INFO] 2024-10-22 15:44:54 - common.py - An error occurred: Batch size 43902 exceeds maximum batch size 41666
[INFO] 2024-10-22 15:44:55 - before_sleep.py - Retrying ragbuilder.executor._exec in 3.145816125260074 seconds as it returned None.
[INFO] 2024-10-22 15:48:22 - common.py - An error occurred: Batch size 43902 exceeds maximum batch size 41666
[INFO] 2024-10-22 15:48:23 - before_sleep.py - Retrying ragbuilder.executor._exec in 3.9425038479239305 seconds as it returned None.
[INFO] 2024-10-22 15:49:45 - common.py - INFO:     Shutting down
[INFO] 2024-10-22 15:49:46 - common.py - INFO:     Waiting for connections to close. (CTRL+C to force quit)
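
A note on the `Batch size 43902 exceeds maximum batch size 41666` lines in the log above: this looks like the vector store rejecting a single insert that contains more chunks than it accepts at once (the SemanticChunker config produced 43902 of them here). A minimal sketch of one possible workaround, assuming the documents can be added in several smaller calls, is to split the list before inserting. `iter_batches` and `MAX_BATCH` are hypothetical names, not part of ragbuilder or Chroma, and the limit is simply copied from the log; it may differ on other installs.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Assumption: limit copied from the log line above; the real value
# depends on the local Chroma/SQLite build and may be different.
MAX_BATCH = 41666


def iter_batches(items: Sequence[T], max_batch: int = MAX_BATCH) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `max_batch` long."""
    if max_batch <= 0:
        raise ValueError("max_batch must be a positive integer")
    for start in range(0, len(items), max_batch):
        yield list(items[start:start + max_batch])


if __name__ == "__main__":
    # Stand-in for the 43902 chunks reported in the log.
    splits = list(range(43902))
    sizes = [len(batch) for batch in iter_batches(splits)]
    print(sizes)
```

Each batch could then be passed to the vector store's add call (for example LangChain's `add_documents`) instead of building the whole collection in one `from_documents` call. Whether ragbuilder's generated code can be changed this way is not confirmed here.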
sanidhyabitcot commented 3 hours ago

Hi @aravind10x, I ran ragbuilder on a Linux system, built from the Docker image and run as a container. I tested with a small PDF, keeping the default configuration and performing only 10 runs of Bayesian optimization, and I ran into the same RAGAS issue.

[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/load/dump.py", line 69, in dumpd
[INFO] 2024-10-22 07:22:22 - common.py -     return json.loads(dumps(obj))
[INFO] 2024-10-22 07:22:22 - common.py -                       ^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/load/dump.py", line 46, in dumps
[INFO] 2024-10-22 07:22:22 - common.py -     return json.dumps(obj, default=default, **kwargs)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/json/__init__.py", line 238, in dumps
[INFO] 2024-10-22 07:22:22 - common.py -     **kw).encode(obj)
[INFO] 2024-10-22 07:22:22 - common.py -           ^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/json/encoder.py", line 200, in encode
[INFO] 2024-10-22 07:22:22 - common.py -     chunks = self.iterencode(o, _one_shot=True)
[INFO] 2024-10-22 07:22:22 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/json/encoder.py", line 258, in iterencode
[INFO] 2024-10-22 07:22:22 - common.py -     return _iterencode(o, 0)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/load/dump.py", line 18, in default
[INFO] 2024-10-22 07:22:22 - common.py -     return obj.to_json()
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2273, in to_json
[INFO] 2024-10-22 07:22:22 - common.py -     dumped = super().to_json()
[INFO] 2024-10-22 07:22:22 - common.py -              ^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/load/serializable.py", line 186, in to_json
[INFO] 2024-10-22 07:22:22 - common.py -     and _is_field_useful(self, k, v)
[INFO] 2024-10-22 07:22:22 - common.py -         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/load/serializable.py", line 260, in _is_field_useful
[INFO] 2024-10-22 07:22:22 - common.py -     return field.required is True or value or field.get_default() != value
[INFO] 2024-10-22 07:22:22 - common.py -                                               ^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/pydantic/v1/fields.py", line 437, in get_default
[INFO] 2024-10-22 07:22:22 - common.py -     return smart_deepcopy(self.default) if self.default_factory is None else self.default_factory()
[INFO] 2024-10-22 07:22:22 - common.py -                                                                              ^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/base.py", line 77, in _get_verbosity
[INFO] 2024-10-22 07:22:22 - common.py -     return get_verbose()
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/langchain_core/globals.py", line 64, in get_verbose
[INFO] 2024-10-22 07:22:22 - common.py -     warnings.filterwarnings(
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/warnings.py", line 156, in filterwarnings
[INFO] 2024-10-22 07:22:22 - common.py -     message = re.compile(message, re.I)
[INFO] 2024-10-22 07:22:22 - common.py -               ^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/re/__init__.py", line 228, in compile
[INFO] 2024-10-22 07:22:22 - common.py -     return _compile(pattern, flags)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/re/__init__.py", line 283, in _compile
[INFO] 2024-10-22 07:22:22 - common.py -     flags = flags.value
[INFO] 2024-10-22 07:22:22 - common.py -             ^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/enum.py", line 212, in __get__
[INFO] 2024-10-22 07:22:22 - common.py -     return self.fget(instance)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py - RecursionError: maximum recursion depth exceeded
[ERROR] 2024-10-22 07:22:22 - ragbuilder.py - Synthetic test data generation failed: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
[INFO] 2024-10-22 07:22:22 - common.py - INFO:     172.17.0.1:34634 - "POST /rbuilder HTTP/1.1" 500 Internal Server Error
[INFO] 2024-10-22 07:22:22 - common.py - ERROR:    Exception in ASGI application
[INFO] 2024-10-22 07:22:22 - common.py - Traceback (most recent call last):
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
[INFO] 2024-10-22 07:22:22 - common.py -     result = await app(  # type: ignore[func-returns-value]
[INFO] 2024-10-22 07:22:22 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     return await self.app(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     await super().__call__(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     await self.middleware_stack(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     raise exc
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     await self.app(scope, receive, _send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
[INFO] 2024-10-22 07:22:22 - common.py -     raise exc
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
[INFO] 2024-10-22 07:22:22 - common.py -     await app(scope, receive, sender)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
[INFO] 2024-10-22 07:22:22 - common.py -     await self.middleware_stack(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
[INFO] 2024-10-22 07:22:22 - common.py -     await route.handle(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
[INFO] 2024-10-22 07:22:22 - common.py -     await self.app(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
[INFO] 2024-10-22 07:22:22 - common.py -     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
[INFO] 2024-10-22 07:22:22 - common.py -     raise exc
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
[INFO] 2024-10-22 07:22:22 - common.py -     await app(scope, receive, sender)
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
[INFO] 2024-10-22 07:22:22 - common.py -     response = await f(request)
[INFO] 2024-10-22 07:22:22 - common.py -                ^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
[INFO] 2024-10-22 07:22:22 - common.py -     raw_response = await run_endpoint_function(
[INFO] 2024-10-22 07:22:22 - common.py -                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
[INFO] 2024-10-22 07:22:22 - common.py -     return await run_in_threadpool(dependant.call, **values)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
[INFO] 2024-10-22 07:22:22 - common.py -     return await anyio.to_thread.run_sync(func, *args)
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
[INFO] 2024-10-22 07:22:22 - common.py -     return await get_async_backend().run_sync_in_worker_thread(
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
[INFO] 2024-10-22 07:22:22 - common.py -     return await future
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
[INFO] 2024-10-22 07:22:22 - common.py -     result = context.run(func, *args)
[INFO] 2024-10-22 07:22:22 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/ragbuilder.py", line 475, in rbuilder_route
[INFO] 2024-10-22 07:22:22 - common.py -     result = parse_config(project_data.model_dump(), db)
[INFO] 2024-10-22 07:22:22 - common.py -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/ragbuilder.py", line 599, in parse_config
[INFO] 2024-10-22 07:22:22 - common.py -     f_name=generate_data.generate_data(
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragbuilder/generate_data.py", line 94, in generate_data
[INFO] 2024-10-22 07:22:22 - common.py -     testset = generator.generate_with_langchain_docs(
[INFO] 2024-10-22 07:22:22 - common.py -               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/generator.py", line 179, in generate_with_langchain_docs
[INFO] 2024-10-22 07:22:22 - common.py -     return self.generate(
[INFO] 2024-10-22 07:22:22 - common.py -            ^^^^^^^^^^^^^^
[INFO] 2024-10-22 07:22:22 - common.py -   File "/usr/local/lib/python3.12/site-packages/ragas/testset/generator.py", line 274, in generate
[INFO] 2024-10-22 07:22:22 - common.py -     raise ExceptionInRunner()
[INFO] 2024-10-22 07:22:22 - common.py - ragas.exceptions.ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
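
The traceback above ends in `RecursionError: maximum recursion depth exceeded`, but only the last few frames are visible, so it is hard to tell which calls are actually repeating. As a debugging aid only (not a fix), here is a small sketch that runs a callable and, if it raises `RecursionError`, counts which (file, function) pairs dominate the traceback. `summarize_recursion` is a hypothetical helper name; it only uses the standard library and assumes the failing call can be reproduced outside the ragas runner thread.

```python
import collections
import traceback


def summarize_recursion(func, *args, top=5, **kwargs):
    """Call func(*args, **kwargs); on RecursionError, return the `top`
    most frequent (filename, function name) pairs in the traceback
    as a list of ((filename, name), count) tuples. Returns [] otherwise."""
    try:
        func(*args, **kwargs)
    except RecursionError as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        counts = collections.Counter((f.filename, f.name) for f in frames)
        return counts.most_common(top)
    return []


if __name__ == "__main__":
    def loop(n=0):
        # Deliberately unbounded recursion to demonstrate the helper.
        return loop(n + 1)

    for (filename, name), count in summarize_recursion(loop):
        print(f"{count:6d}  {name}  ({filename})")
```

If the most frequent frames turn out to be the `to_json` / `dumps` calls in `langchain_core`, that would point at a self-referencing object being serialized rather than at the RAGAS generator itself, but that is a guess until someone runs it.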