assafelovic / gpt-researcher

LLM based autonomous agent that conducts local and web research on any topic and generates a comprehensive report with citations.
https://gptr.dev
Apache License 2.0
14.74k stars · 1.97k forks

openai BadRequest error #751

Open · Ismael opened 2 months ago

Ismael commented 2 months ago
gpt-researcher_1  | Traceback (most recent call last):
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 244, in run_asgi
gpt-researcher_1  |     result = await self.app(self.scope, self.asgi_receive, self.asgi_send)  # type: ignore[func-returns-value]
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
gpt-researcher_1  |     return await self.app(scope, receive, send)
gpt-researcher_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
gpt-researcher_1  |     await super().__call__(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
gpt-researcher_1  |     await self.middleware_stack(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 151, in __call__
gpt-researcher_1  |     await self.app(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 77, in __call__
gpt-researcher_1  |     await self.app(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
gpt-researcher_1  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
gpt-researcher_1  |     raise exc
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
gpt-researcher_1  |     await app(scope, receive, sender)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
gpt-researcher_1  |     await self.middleware_stack(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
gpt-researcher_1  |     await route.handle(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 373, in handle
gpt-researcher_1  |     await self.app(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 96, in app
gpt-researcher_1  |     await wrap_app_handling_exceptions(app, session)(scope, receive, send)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
gpt-researcher_1  |     raise exc
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
gpt-researcher_1  |     await app(scope, receive, sender)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 94, in app
gpt-researcher_1  |     await func(session)
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 348, in app
gpt-researcher_1  |     await dependant.call(**values)
gpt-researcher_1  |   File "/usr/src/app/backend/server.py", line 89, in websocket_endpoint
gpt-researcher_1  |     report = await manager.start_streaming(
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/backend/websocket_manager.py", line 60, in start_streaming
gpt-researcher_1  |     report = await run_agent(task, report_type, report_source, source_urls, tone, websocket, headers)
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/backend/websocket_manager.py", line 97, in run_agent
gpt-researcher_1  |     report = await researcher.run()
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/backend/report_type/basic_report/basic_report.py", line 41, in run
gpt-researcher_1  |     await researcher.conduct_research()
gpt-researcher_1  |   File "/usr/src/app/gpt_researcher/master/agent.py", line 147, in conduct_research
gpt-researcher_1  |     self.context = await self.__get_context_by_search(self.query)
gpt-researcher_1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/gpt_researcher/master/agent.py", line 271, in __get_context_by_search
gpt-researcher_1  |     context = await asyncio.gather(
gpt-researcher_1  |               ^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/gpt_researcher/master/agent.py", line 300, in __process_sub_query
gpt-researcher_1  |     content = await self.__get_similar_content_by_query(sub_query, scraped_data)
gpt-researcher_1  |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/gpt_researcher/master/agent.py", line 398, in __get_similar_content_by_query
gpt-researcher_1  |     return await context_compressor.async_get_context(
gpt-researcher_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/src/app/gpt_researcher/context/compression.py", line 56, in async_get_context
gpt-researcher_1  |     relevant_docs = await asyncio.to_thread(compressed_docs.invoke, query)
gpt-researcher_1  |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/asyncio/threads.py", line 25, in to_thread
gpt-researcher_1  |     return await loop.run_in_executor(None, func_call)
gpt-researcher_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
gpt-researcher_1  |     result = self.fn(*self.args, **self.kwargs)
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 251, in invoke
gpt-researcher_1  |     raise e
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 244, in invoke
gpt-researcher_1  |     result = self._get_relevant_documents(
gpt-researcher_1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain/retrievers/contextual_compression.py", line 46, in _get_relevant_documents
gpt-researcher_1  |     compressed_docs = self.base_compressor.compress_documents(
gpt-researcher_1  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain/retrievers/document_compressors/base.py", line 37, in compress_documents
gpt-researcher_1  |     documents = _transformer.compress_documents(
gpt-researcher_1  |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain/retrievers/document_compressors/embeddings_filter.py", line 72, in compress_documents
gpt-researcher_1  |     embedded_documents = _get_embeddings_from_stateful_docs(
gpt-researcher_1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain_community/document_transformers/embeddings_redundant_filter.py", line 71, in _get_embeddings_from_stateful_docs
gpt-researcher_1  |     embedded_documents = embeddings.embed_documents(
gpt-researcher_1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain_openai/embeddings/base.py", line 558, in embed_documents
gpt-researcher_1  |     return self._get_len_safe_embeddings(texts, engine=engine)
gpt-researcher_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/langchain_openai/embeddings/base.py", line 456, in _get_len_safe_embeddings
gpt-researcher_1  |     response = self.client.create(
gpt-researcher_1  |                ^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/openai/resources/embeddings.py", line 114, in create
gpt-researcher_1  |     return self._post(
gpt-researcher_1  |            ^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
gpt-researcher_1  |     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
gpt-researcher_1  |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 936, in request
gpt-researcher_1  |     return self._request(
gpt-researcher_1  |            ^^^^^^^^^^^^^^
gpt-researcher_1  |   File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1040, in _request
gpt-researcher_1  |     raise self._make_status_error_from_response(err.response) from None
gpt-researcher_1  | openai.BadRequestError: Error code: 400 - {'error': {'message': "'input' : input must be a string or an array of strings", 'type': 'invalid_request_error'}}

I'm using docker-compose with this env:

 OPENAI_BASE_URL: https://api.groq.com/openai/v1
 LLM_PROVIDER: openai
 FAST_LLM_MODEL: llama-3.1-70b-versatile
 SMART_LLM_MODEL: llama-3.1-70b-versatile 
 OPENAI_API_KEY: <api key here>
 TAVILY_API_KEY: <api key here>

I did try LLM_PROVIDER: groq, but that failed with an error saying the OpenAI key wasn't set.
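For what it's worth, the 400 appears to come from the embeddings request rather than the chat model: gpt-researcher defaults to OpenAI embeddings for context compression, and Groq's OpenAI-compatible endpoint doesn't serve /embeddings. A sanity check for this kind of env combination might look like the sketch below (the helper and its warning text are made up for illustration; it is not part of the repo):

```python
def check_env(env: dict) -> list:
    """Return warnings for env combinations likely to break embeddings.

    Hypothetical helper for illustration only, not part of gpt-researcher.
    """
    warnings = []
    base_url = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    # Groq's OpenAI-compatible API serves chat completions but, as of this
    # issue, no /embeddings endpoint -- so the default OpenAI embeddings
    # client gets a 400 back.
    if "groq.com" in base_url and "EMBEDDING_PROVIDER" not in env:
        warnings.append(
            "OPENAI_BASE_URL points at Groq but EMBEDDING_PROVIDER is unset; "
            "the default OpenAI embeddings call will fail."
        )
    if env.get("LLM_PROVIDER") == "groq" and "GROQ_API_KEY" not in env:
        warnings.append("LLM_PROVIDER=groq usually needs GROQ_API_KEY set.")
    return warnings


env = {
    "OPENAI_BASE_URL": "https://api.groq.com/openai/v1",
    "LLM_PROVIDER": "openai",
    "OPENAI_API_KEY": "sk-...",
}
print(len(check_env(env)))  # -> 1 (Groq base URL without EMBEDDING_PROVIDER)
```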

ElishaKay commented 2 months ago

Welcome @Ismael

Which LLM would you like to use? You can have a look at the relevant keys to set within the gpt_researcher/llm_provider folder.

You'll want to replace the : with = in your .env (docker-compose uses KEY: value syntax, but a .env file uses KEY=value).

For example, a minimal .env is:

TAVILY_API_KEY=_____________
OPENAI_API_KEY=_____________

Replace _____ with the relevant keys

Ismael commented 2 months ago

I'd like to use Groq + Llama 3. I think the issue is that it's trying to use embeddings, and embeddings aren't available on Groq.
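That reading matches the traceback: the failure is inside LangChain's EmbeddingsFilter, which embeds the scraped documents and the sub-query and keeps only the documents similar enough to the query. So an embeddings call happens even though the chat LLM lives on Groq. Conceptually it does something like this (a toy sketch with made-up vectors, not the library code):

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def filter_docs(query_vec, doc_vecs, threshold=0.5):
    # Keep the indices of documents whose embedding is similar enough to
    # the query embedding. Producing these vectors is the step that forces
    # an embeddings provider, independent of the chat LLM.
    return [i for i, v in enumerate(doc_vecs) if cosine(query_vec, v) >= threshold]


print(filter_docs([1.0, 0.0], [[1.0, 0.1], [0.0, 1.0]]))  # -> [0]
```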

ElishaKay commented 2 months ago

Have a look at the env examples here.

Search the codebase for the env variables related to alternative embedding providers.

I.e. what are the alternatives for:

EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=all-minilm:22m
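Putting that together, a .env along these lines should route chat completions to Groq while embeddings come from a local Ollama instance. This is a sketch: the GROQ_API_KEY and OLLAMA_BASE_URL variable names are my reading of the repo's env examples, so double-check them against the codebase before relying on this.

```shell
# Chat completions via Groq (assumed variable names; verify against the repo)
LLM_PROVIDER=groq
GROQ_API_KEY=_____________
FAST_LLM_MODEL=llama-3.1-70b-versatile
SMART_LLM_MODEL=llama-3.1-70b-versatile

# Embeddings via a local Ollama instance instead of OpenAI
EMBEDDING_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=all-minilm:22m

TAVILY_API_KEY=_____________
```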