assafelovic / gpt-researcher

GPT based autonomous agent that does online comprehensive research on any given topic
https://gptr.dev
MIT License

Error when running with docker #137

Closed lkyhfx closed 9 months ago

lkyhfx commented 11 months ago
  File "/usr/src/app/agent/run.py", line 50, in run_agent
2023-08-17T07:20:35.442545533Z     await assistant.conduct_research()
2023-08-17T07:20:35.442547293Z   File "/usr/src/app/agent/research_agent.py", line 143, in conduct_research
2023-08-17T07:20:35.442549203Z     research_result = await self.run_search_summary(query)
2023-08-17T07:20:35.442550993Z                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-08-17T07:20:35.442552853Z   File "/usr/src/app/agent/research_agent.py", line 128, in run_search_summary
2023-08-17T07:20:35.442554863Z     os.makedirs(os.path.dirname(f"./outputs/{self.directory_name}/research-{query}.txt"), exist_ok=True)
2023-08-17T07:20:35.442556793Z   File "<frozen os>", line 225, in makedirs
2023-08-17T07:20:35.442558993Z PermissionError: [Errno 13] Permission denied: './outputs/89a73f7e-705a-4ee6-9583-5f78845a2fcb

I tried to use gpt-researcher with Docker and ran a simple query, but encountered this error.

TC23345 commented 11 months ago

I have the same exact error! Hmmmm

This is ChatGPT's response:

The error message suggests that the directory /usr/src/app/outputs does not exist when trying to change ownership.

From the Dockerfile you've provided, it seems you've removed the line to create the outputs directory. The chown command is trying to change the ownership of a directory that doesn't exist, thus resulting in an error.

To fix this, you need to re-add the line to create the outputs directory before you try to change its ownership.

I dunno if that's a very accurate solution or not

I think it comes down to windows hating us

assafelovic commented 10 months ago

Hey @TC23345, this is apparently a known issue on Windows. Docker is trying to create a directory without permissions from the root user. Can you try running a simple mkdir command in a terminal and see if the directory persists?
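The check suggested above can also be scripted. The following is a minimal sketch, not part of gpt-researcher: `can_create_subdir` is a hypothetical helper that tries to create and then remove a throwaway directory under a given parent, which is roughly the operation that fails under `./outputs` in the traceback.

```python
import os
import uuid


def can_create_subdir(parent: str) -> bool:
    """Return True if the current user can create a directory under `parent`.

    Hypothetical helper for diagnosis only; it is not part of gpt-researcher.
    """
    probe = os.path.join(parent, f"probe-{uuid.uuid4()}")
    try:
        os.mkdir(probe)
    except OSError:
        # Covers PermissionError (Errno 13) as well as a missing parent directory.
        return False
    os.rmdir(probe)  # clean up the throwaway directory
    return True


if __name__ == "__main__":
    # Inside the container this would be run from /usr/src/app.
    print("outputs writable:", can_create_subdir("./outputs"))
```

A result of `False` for `./outputs` inside the container would point at directory ownership or mount permissions rather than at the application code.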

alexisdal commented 10 months ago

@assafelovic

I have exactly the same issue, running on Docker 4.2.0. I upgraded to 4.22.1 but reproduced the same behavior.

I think it occurs when gpt-researcher is trying to output the result.

In the log window I see:

c:\Users\alex\Desktop\gpt-researcher>docker-compose up
Recreating gpt-researcher_gpt-researcher_1 ... done
Attaching to gpt-researcher_gpt-researcher_1
gpt-researcher_1 | INFO: Started server process [1]
gpt-researcher_1 | INFO: Waiting for application startup.
gpt-researcher_1 | INFO: Application startup complete.
gpt-researcher_1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
gpt-researcher_1 | INFO: 172.21.0.1:42406 - "GET / HTTP/1.1" 200 OK
gpt-researcher_1 | INFO: ('172.21.0.1', 42408) - "WebSocket /ws" [accepted]
gpt-researcher_1 | ERROR: Exception in ASGI application
...
gpt-researcher_1 | File "/usr/src/app/agent/research_agent.py", line 128, in run_search_summary
gpt-researcher_1 | os.makedirs(os.path.dirname(f"./outputs/{self.directory_name}/research-{query}.txt"), exist_ok=True)
gpt-researcher_1 | File "<frozen os>", line 225, in makedirs
gpt-researcher_1 | PermissionError: [Errno 13] Permission denied: './outputs/227403b7-d6f2-4468-a4d1-ae8c17c1a406'

I did as instructed and opened a bash window from Docker, and I reproduced the same behavior there:

$ ls
CODE_OF_CONDUCT.md  Dockerfile  README.md    actions  client  docker-compose.yml  main.py  processing        venv
CONTRIBUTING.md     LICENSE     __pycache__  agent    config  js                  outputs  requirements.txt
$ cd outputs
$ ls
11016334-9ca4-4bf7-9748-5417d603363a  690c5c52-50fb-4cf3-8ee7-ea165e97b778  adad9a83-b544-402b-9b6d-b83bd0df87d8
1c519e4c-22f6-4dc2-864f-58c018fd77b9  8aa7f8c7-080d-437f-b9dc-eccd047d5b57
5792f061-56f7-49c3-9416-cdbcf20f4332  a9da62ce-028f-4f12-af38-639dcb0ba6ef
$ mkdir testdir
mkdir: cannot create directory ‘testdir’: Permission denied
$

For the record, I tried the docker-compose up command from a normal cmd window and also from a cmd window started as Windows administrator, with the same result.

Here is some additional context about why I'm trying with docker in the first place...

I tried gpt-researcher about 2 weeks ago and was quite pleased with it. I was running it natively on Windows, without Docker. A few days ago, however, I started getting "chrome crashed" errors in the gpt-researcher log instead of nice reports. The README says many people have chrome-driver issues and suggests downgrading Chrome. However, my Chrome updates automatically and I intend to keep it that way. Over the past 2 weeks I think Chrome silently upgraded twice (now running 116.0.5845.111 (Official Build) (64-bit)), so I'm not comfortable downgrading, let alone sticking to a specific version.

I did try to see whether I could upgrade the chrome-driver version, or at least identify a Chrome version very close to mine. Looking into the code dependencies, gpt-researcher\venv\Lib\site-packages\webdriver_manager\drivers\chrome.py looks for versions in https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json (line 67), and in this JSON I noticed 116.0.5845.96, so a .96 and not a .111. I'm OK with this slight downgrade for now. When I looked into how to perform the downgrade, it seemed that Google does not really allow it (to be confirmed). The first result seems to rely on 3rd-party software, and there is no way I'm using 3rd-party software for Chrome upgrades/downgrades.
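The version comparison described above can be sketched in a few lines. This is only an illustration: `closest_known_good` is a hypothetical helper that works on a plain list of version strings, and apart from 116.0.5845.96 (quoted above) the sample entries are made-up placeholders, not real entries from the JSON file.

```python
from typing import List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '116.0.5845.111' into (116, 0, 5845, 111) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def closest_known_good(known: List[str], installed: str) -> Optional[str]:
    """Return the highest known version that shares major.minor.build with
    `installed` and is not newer than it, or None if there is none."""
    target = parse_version(installed)
    candidates = [
        v for v in known
        if parse_version(v)[:3] == target[:3] and parse_version(v) <= target
    ]
    return max(candidates, key=parse_version, default=None)


known_versions = [
    "116.0.5845.0",   # placeholder for illustration
    "116.0.5845.96",  # the entry mentioned above
    "117.0.5900.0",   # placeholder for illustration
]
print(closest_known_good(known_versions, "116.0.5845.111"))
```

Comparing parsed tuples rather than raw strings avoids the trap where "96" sorts after "111" alphabetically.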

So I edited the config/config.py file and changed self.selenium_web_browser = os.getenv("USE_WEB_BROWSER", "chrome") to self.selenium_web_browser = os.getenv("USE_WEB_BROWSER", "firefox") (line 23).
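Because the quoted line reads the browser through `os.getenv` with a default, the same switch can probably be made without editing `config/config.py`, by setting `USE_WEB_BROWSER` in the environment. A minimal sketch of that lookup, assuming nothing about the rest of the config class (`selected_browser` is a hypothetical stand-in for the quoted line):

```python
import os


def selected_browser(default: str = "chrome") -> str:
    """Mirror the lookup quoted from config/config.py (hypothetical stand-in)."""
    return os.getenv("USE_WEB_BROWSER", default)


# With the variable unset, the default applies.
os.environ.pop("USE_WEB_BROWSER", None)
print(selected_browser())

# Setting the variable overrides the default without touching the source file.
os.environ["USE_WEB_BROWSER"] = "firefox"
print(selected_browser())
```

With Docker Compose the variable would have to be passed into the container, for example through an `environment:` entry; whether the project's compose file already forwards it is not shown in this thread.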

Now the reports can be written again, but web scraping does not seem to work. I always get errors in the log like

An error occurred while processing the url https://www.businesscoot.com/en/study/the-accounting-software-market-france: Message: Error: NS_ERROR_UNKNOWN_HOST Stacktrace: [...]

or

Scraping url https://www.frenchbusinessadvice.com/accounting-guides/ with question Daily activities of an accountant and a chartered accountant in France
An error occurred while processing the url https://www.ifac.org/about-ifac/membership/profile/france: Message: 'geckodriver.exe' executable may have wrong permissions.

So I tried "edge" (running Version 116.0.1938.62 (Official build) (64-bit))

This time there is no error message in the log, but the report starts by saying: "Due to the unavailability of the requested URLs, I am unable to process the direct information from the listed pages. However, based on general knowledge about xxx, I can provide an analysis." So I think it did not work either, because the initial reports I got from gpt-researcher were a lot more elaborate, cited sources, etc.

I have not tried to download and install Opera to see what happens. I would prefer that we fix the Docker file permission error instead. I like the idea of having everything tidy inside a container. It's probably something simple in either the Dockerfile or the docker-compose.yml file regarding permissions for filesystem sharing, or something like that.

I just hope this more detailed feedback helps solve the issue 🤞

lkyhfx commented 10 months ago

Hi @assafelovic, I am using WSL2 on Windows 11. I have to run the Docker container as the root user to get around the permission error. After adding user: root to docker-compose.yml, gpt-researcher runs as expected. I am not very familiar with Docker, but when I enter the gpt-researcher container's shell and try to create a file inside outputs, I hit the same permission error. Perhaps something is wrong in the user permission configuration?
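One way to see whether this is a user/ownership mismatch is to compare the user the process runs as with the owner and mode of `outputs`. A minimal sketch, assuming a Linux container (`os.getuid` is not available on native Windows); `describe_access` is a hypothetical helper, not part of gpt-researcher:

```python
import os
import stat
from typing import Dict


def describe_access(path: str) -> Dict[str, object]:
    """Collect the facts that decide whether this process may write to `path`."""
    info = os.stat(path)
    return {
        "process_uid": os.getuid(),           # user the process runs as
        "owner_uid": info.st_uid,             # user that owns the directory
        "mode": stat.filemode(info.st_mode),  # for example 'drwxr-xr-x'
        "writable": os.access(path, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # "." stands in for the outputs directory so the sketch runs anywhere.
    for key, value in describe_access(".").items():
        print(f"{key}: {value}")
```

If `process_uid` differs from `owner_uid` and the mode has no write bit for group or others, creating a directory there fails with Errno 13, which would be consistent with `user: root` making the error go away.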

assafelovic commented 10 months ago

The error is due to a permissions issue when creating the outputs directory. The directory does not initially exist and is generated at runtime. I just added a proposed fix to the latest version (#158). Can you please pull and try again?

If that doesn't work, I suggest installing the regular way (see README) and manually adding the outputs folder to better debug where the issue is coming from. Lmk if this solves it!

@lkyhfx @alexisdal @TC23345
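For anyone debugging this locally, the runtime directory creation can be wrapped so that a permission failure does not abort the whole run. The following is a minimal sketch under stated assumptions: `ensure_output_dir` is a hypothetical helper, the fallback location is an arbitrary choice, and this is not the change made in #158, whose contents are not shown in this thread.

```python
import os
import tempfile


def ensure_output_dir(base: str, name: str) -> str:
    """Create `base/name` and return its path.

    Falls back to a directory under the system temp location if `base`
    is not writable. Hypothetical helper, not part of gpt-researcher.
    """
    target = os.path.join(base, name)
    try:
        os.makedirs(target, exist_ok=True)
        return target
    except PermissionError:
        # Same error class as in the traceback (Errno 13).
        fallback = os.path.join(tempfile.gettempdir(), "gpt-researcher-outputs", name)
        os.makedirs(fallback, exist_ok=True)
        return fallback
```

A fallback like this only hides the symptom; the reports would land outside the mounted `outputs` folder, so fixing ownership of that folder remains the cleaner solution.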

alexisdal commented 10 months ago

Hi @assafelovic

I was under the impression that #158 had been merged into master, so I deleted my entire gpt-researcher directory and ran the git clone and docker-compose up commands again as instructed in the README.

I hid my OpenAI key for obvious reasons, but I can see that the run seems to work until the result needs to be written to 'outputs'.

Check the output below.

Not sure if it's relevant, but I'm using cygwin64 and bash because the default terminal in Windows (cmd) is terrible to work with.

$ rm -rf gpt-researcher

$ git clone https://github.com/assafelovic/gpt-researcher.git
Cloning into 'gpt-researcher'...
remote: Enumerating objects: 784, done.
remote: Counting objects: 100% (463/463), done.
remote: Compressing objects: 100% (189/189), done.
remote: Total 784 (delta 317), reused 357 (delta 266), pack-reused 321
Receiving objects: 100% (784/784), 1.54 MiB | 8.31 MiB/s, done.
Resolving deltas: 100% (439/439), done.

$ cd gpt-researcher/

$ export OPENAI_API_KEY="************"

$ docker-compose up
[+] Running 2/2
 ✔ Network gpt-researcher_default             Created                                                                                                                                                                0.1s
 ✔ Container gpt-researcher-gpt-researcher-1  Created                                                                                                                                                                0.1s
Attaching to gpt-researcher-gpt-researcher-1
gpt-researcher-gpt-researcher-1  | INFO:     Started server process [1]
gpt-researcher-gpt-researcher-1  | INFO:     Waiting for application startup.
gpt-researcher-gpt-researcher-1  | INFO:     Application startup complete.
gpt-researcher-gpt-researcher-1  | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
gpt-researcher-gpt-researcher-1  | INFO:     172.21.0.1:59976 - "GET / HTTP/1.1" 200 OK
gpt-researcher-gpt-researcher-1  | INFO:     ('172.21.0.1', 59978) - "WebSocket /ws" [accepted]
gpt-researcher-gpt-researcher-1  | ERROR:    Exception in ASGI application
gpt-researcher-gpt-researcher-1  | Traceback (most recent call last):
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/websockets/wsproto_impl.py", line 234, in run_asgi
gpt-researcher-gpt-researcher-1  |     result = await self.app(self.scope, self.receive, self.send)
gpt-researcher-gpt-researcher-1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
gpt-researcher-gpt-researcher-1  |     return await self.app(scope, receive, send)
gpt-researcher-gpt-researcher-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 292, in __call__
gpt-researcher-gpt-researcher-1  |     await super().__call__(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
gpt-researcher-gpt-researcher-1  |     await self.middleware_stack(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 149, in __call__
gpt-researcher-gpt-researcher-1  |     await self.app(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
gpt-researcher-gpt-researcher-1  |     raise exc
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
gpt-researcher-gpt-researcher-1  |     await self.app(scope, receive, sender)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
gpt-researcher-gpt-researcher-1  |     raise e
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
gpt-researcher-gpt-researcher-1  |     await self.app(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
gpt-researcher-gpt-researcher-1  |     await route.handle(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 341, in handle
gpt-researcher-gpt-researcher-1  |     await self.app(scope, receive, send)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 82, in app
gpt-researcher-gpt-researcher-1  |     await func(session)
gpt-researcher-gpt-researcher-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 324, in app
gpt-researcher-gpt-researcher-1  |     await dependant.call(**values)
gpt-researcher-gpt-researcher-1  |   File "/usr/src/app/main.py", line 60, in websocket_endpoint
gpt-researcher-gpt-researcher-1  |     await manager.start_streaming(task, report_type, agent, agent_role_prompt, websocket)
gpt-researcher-gpt-researcher-1  |   File "/usr/src/app/agent/run.py", line 38, in start_streaming
gpt-researcher-gpt-researcher-1  |     report, path = await run_agent(task, report_type, agent, agent_role_prompt, websocket)
gpt-researcher-gpt-researcher-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher-gpt-researcher-1  |   File "/usr/src/app/agent/run.py", line 50, in run_agent
gpt-researcher-gpt-researcher-1  |     await assistant.conduct_research()
gpt-researcher-gpt-researcher-1  |   File "/usr/src/app/agent/research_agent.py", line 143, in conduct_research
gpt-researcher-gpt-researcher-1  |     research_result = await self.run_search_summary(query)
gpt-researcher-gpt-researcher-1  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gpt-researcher-gpt-researcher-1  |   File "/usr/src/app/agent/research_agent.py", line 128, in run_search_summary
gpt-researcher-gpt-researcher-1  |     os.makedirs(os.path.dirname(f"./outputs/{self.directory_name}/research-{query}.txt"), exist_ok=True)
gpt-researcher-gpt-researcher-1  |   File "<frozen os>", line 225, in makedirs
gpt-researcher-gpt-researcher-1  | PermissionError: [Errno 13] Permission denied: './outputs/cb77dbe5-ad40-4a22-8ef0-cec2464b4fbb'
Gracefully stopping... (press Ctrl+C again to force)
Aborting on container exit...
[+] Stopping 0/1
[+] Stopping 1/1-researcher-gpt-researcher-1  Stopping                                                                                                                                                               1.0s
[+] Killing 1/1t-researcher-gpt-researcher-1  Killing                                                                                                                                                                0.8s
 ✔ Container gpt-researcher-gpt-researcher-1  Killed                                                                                                                                                                 0.8s
time="2023-09-07T17:22:30+02:00" level=error msg="got 3 SIGTERM/SIGINTs, forcing shutdown"

$ git log | head -n 10
commit b89fdaa0aee4e710af594cd82ca88792d1b88e0a
Author: Assaf Elovic <assaf.elovic@gmail.com>
Date:   Thu Sep 7 15:54:20 2023 +0300

    Update README.md

    Removed troubleshooting issues from main README and instead added guidance.

commit 9083223e79fed2201e0ea6313f533f7dc765376e
Merge: 85821b1 0f9f962

$

assafelovic commented 9 months ago

This seems like a local permissions issue, in other words an environment problem on your end. Closing for now, unless others report the same problem.

Dipghosh-tpu commented 1 month ago

I am running GPT researcher with the below pip install

pip install gpt-researcher

and the below python code

from gpt_researcher import GPTResearcher

query = "why is the latest research on Tabular GANs and what are the scopes on improvements on tabular GANs?"
researcher = GPTResearcher(query=query, report_type="research_report")

# Conduct research on the given query
research_result = await researcher.conduct_research()

# Write the report
report = await researcher.write_report()

Getting this error :-

Traceback (most recent call last):

  File c:\Users\dipgh\anaconda3\lib\site-packages\IPython\core\interactiveshell.py:3369 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Input In [2] in <cell line: 2>
    from gpt_researcher import GPTResearcher

  File c:\Users\dipgh\anaconda3\lib\site-packages\gpt_researcher\__init__.py:1 in <module>
    from .master import GPTResearcher

  File c:\Users\dipgh\anaconda3\lib\site-packages\gpt_researcher\master\__init__.py:1 in <module>
    from .agent import GPTResearcher

  File c:\Users\dipgh\anaconda3\lib\site-packages\gpt_researcher\master\agent.py:7 in <module>
    from gpt_researcher.master.functions import *

  File c:\Users\dipgh\anaconda3\lib\site-packages\gpt_researcher\master\functions.py:21
    match retriever:
          ^
SyntaxError: invalid syntax

I get the same error when running the command uvicorn main:app --reload after cloning the repository, creating a new environment with the necessary packages, and setting the environment variables for the OpenAI and Tavily API keys.

Can someone pls help?
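A `SyntaxError` on a line starting with `match` is what an interpreter older than Python 3.10 reports, because the `match` statement was added in 3.10. That is a likely cause here, though the thread does not state which Python version the Anaconda environment uses. A minimal check, written so that it also runs on older interpreters (`supports_match` is a hypothetical helper):

```python
import sys
from typing import Tuple


def supports_match(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if this (major, minor) version understands `match` statements."""
    return version >= (3, 10)


print("Python", ".".join(map(str, sys.version_info[:3])))
print("match statement supported:", supports_match())
```

If this prints `False`, creating the environment with Python 3.10 or newer and reinstalling the package there would be the first thing to try.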