assafelovic / gpt-researcher

LLM based autonomous agent that conducts local and web research on any topic and generates a comprehensive report with citations.
https://gptr.dev
Apache License 2.0

Web search -- `RecursionError: maximum recursion depth exceeded in comparison` #990

Open franckess opened 6 days ago

franckess commented 6 days ago

When I run the tool using Web, I get the error below.

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 242, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)  # type: ignore[func-returns-value]
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/middleware/errors.py", line 152, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/middleware/cors.py", line 77, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/routing.py", line 362, in handle
    await self.app(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/routing.py", line 95, in app
    await wrap_app_handling_exceptions(app, session)(scope, receive, send)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/starlette/routing.py", line 93, in app
    await func(session)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/fastapi/routing.py", line 383, in app
    await dependant.call(**solved_result.values)
  File "/Users/matt/Agentic-RAG-PoC/backend/server/server.py", line 113, in websocket_endpoint
    await handle_websocket_communication(websocket, manager)
  File "/Users/matt/Agentic-RAG-PoC/backend/server/server_utils.py", line 124, in handle_websocket_communication
    await handle_start_command(websocket, data, manager)
  File "/Users/matt/Agentic-RAG-PoC/backend/server/server_utils.py", line 28, in handle_start_command
    report = await manager.start_streaming(
  File "/Users/matt/Agentic-RAG-PoC/backend/server/websocket_manager.py", line 66, in start_streaming
    report = await run_agent(task, report_type, report_source, source_urls, tone, websocket, headers = headers, config_path = config_path)
  File "/Users/matt/Agentic-RAG-PoC/backend/server/websocket_manager.py", line 108, in run_agent
    report = await researcher.run()
  File "/Users/matt/Agentic-RAG-PoC/backend/report_type/basic_report/basic_report.py", line 41, in run
    await researcher.conduct_research()
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/agent.py", line 96, in conduct_research
    self.context = await self.research_conductor.conduct_research()
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/skills/researcher.py", line 74, in conduct_research
    self.researcher.context = await self.__get_context_by_search(self.researcher.query)
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/skills/researcher.py", line 162, in __get_context_by_search
    context = await asyncio.gather(
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/skills/researcher.py", line 223, in __process_sub_query
    content = await self.researcher.context_manager.get_similar_content_by_query(sub_query, scraped_data)
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/skills/context_manager.py", line 26, in get_similar_content_by_query
    return await context_compressor.async_get_context(
  File "/Users/matt/Agentic-RAG-PoC/gpt_researcher/context/compression.py", line 85, in async_get_context
    relevant_docs = await asyncio.to_thread(compressed_docs.invoke, query)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain_core/retrievers.py", line 254, in invoke
    raise e
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain_core/retrievers.py", line 247, in invoke
    result = self._get_relevant_documents(
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain/retrievers/contextual_compression.py", line 48, in _get_relevant_documents
    compressed_docs = self.base_compressor.compress_documents(
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain/retrievers/document_compressors/base.py", line 45, in compress_documents
    documents = _transformer.transform_documents(documents)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain_text_splitters/base.py", line 218, in transform_documents
    return self.split_documents(list(documents))
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain_text_splitters/base.py", line 96, in split_documents
    return self.create_documents(texts, metadatas=metadatas)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/site-packages/langchain_text_splitters/base.py", line 80, in create_documents
    metadata = copy.deepcopy(_metadatas[i])
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 271, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/local/anaconda3/envs/poc/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)

ElishaKay commented 5 days ago

Welcome @franckess

The path of least resistance to run on any machine is the Docker Quickstart.

For running as part of your Python project:

Perhaps the virtual environment path will work.

franckess commented 4 days ago

Hi @ElishaKay,

cc @assafelovic

Thank you for the prompt reply.

I got it working with the my-docs folder, but if I use Web or Hybrid, I get this error.

After some digging, I found that the main issue comes from this step:

relevant_docs = await asyncio.to_thread(compressed_docs.invoke, query)
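
For context on the traceback: the last frames alternate between `_deepcopy_dict` and `_reconstruct` in `copy.py`, which indicates that `copy.deepcopy` keeps descending through nested objects inside the document metadata until it reaches the interpreter's recursion limit. The sketch below reproduces the same class of failure with a plain nested dict. The names `nested_dict` and `deepcopy_overflows` and the sample metadata are hypothetical and are not taken from gpt-researcher; the real metadata may be nested in a different way.

```python
import copy
import sys


def nested_dict(depth: int) -> dict:
    """Build a dict nested `depth` levels deep, iteratively (no recursion)."""
    root: dict = {}
    node = root
    for _ in range(depth):
        node["child"] = {}
        node = node["child"]
    return root


def deepcopy_overflows(obj) -> bool:
    """Return True if copy.deepcopy(obj) raises RecursionError."""
    try:
        copy.deepcopy(obj)
    except RecursionError:
        return True
    return False


# Hypothetical metadata: one value is nested as deeply as the recursion limit,
# so deepcopy needs more stack frames than the interpreter allows.
deep_metadata = {
    "source": "https://example.com",
    "raw": nested_dict(sys.getrecursionlimit()),
}
# Hypothetical metadata holding only plain strings.
shallow_metadata = {"source": "https://example.com", "title": "Example"}

print(deepcopy_overflows(deep_metadata))
print(deepcopy_overflows(shallow_metadata))
```

If the real metadata behaves like `deep_metadata`, keeping only plain strings and numbers in it would avoid the deep copy chain without touching the recursion limit. That is an assumption about the cause, not something the traceback alone confirms.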

My initial workaround was to implement this:

import sys

# With many documents, raise the interpreter's recursion limit if needed (never lower it)
if len(self.documents) > 1000:
    # Round the document count up to the nearest multiple of 100
    required_stack_depth = -(-len(self.documents) // 100) * 100
    sys.setrecursionlimit(max(sys.getrecursionlimit(), required_stack_depth))
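
A caveat with this workaround is that `sys.setrecursionlimit` changes the limit for the whole process and leaves it changed. A minimal sketch of a narrower variant follows, assuming it is acceptable to raise the limit only around the failing call; the helper name `recursion_limit` is hypothetical and not part of gpt-researcher or LangChain.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(minimum: int):
    """Temporarily raise the recursion limit to at least `minimum`, then restore it."""
    previous = sys.getrecursionlimit()
    # Never lower the limit: keep the current value if it is already higher.
    sys.setrecursionlimit(max(previous, minimum))
    try:
        yield
    finally:
        # Restore the earlier limit even if the wrapped code raised.
        sys.setrecursionlimit(previous)


if __name__ == "__main__":
    before = sys.getrecursionlimit()
    with recursion_limit(before + 500):
        print(sys.getrecursionlimit() - before)
    print(sys.getrecursionlimit() == before)
```

The limit is still interpreter-wide while the block is active, so other threads see the raised value too, and a very high limit can crash the interpreter with a C stack overflow instead of raising `RecursionError`. This treats the symptom rather than the cause.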