Closed JessChud closed 4 months ago
[eval_gpt3.5_0125_preview]
test=1
parsed as
{'eval_gpt3': {'5_0125_preview': {'test': 1}}}
The dot is the issue. Remove the dot and check.
Thanks. I made the recommended change, renaming the section to eval_gpt35_0125_preview, and here's the error trace I'm getting now:
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
10:19:16 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview
10:19:16 - opendevin.core.config:ERROR: config.py:438 - Config file not found: [Errno 2] No such file or directory: 'config.toml'
10:19:16 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=True, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='a5b4c9b586bc4f8fab1d120354beb167', debug=False, enable_auto_lint=False
10:19:16 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4
10:19:16 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo', 'max_iterations': 50, 'eval_output_dir': 'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4', 'start_time': '2024-05-30 10:19:16', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'}
10:19:16 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances.
10:19:16 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl
10:19:16 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances.
10:19:16 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo, max iterations 50.
10:19:16 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1
0%| | 0/1 [00:00<?, ?it/s]
10:19:16 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation.
10:19:16 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True
10:19:33 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202.
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell
100%|██████████| 1/1 [00:23<00:00, 23.26s/it]
ERROR:concurrent.futures:exception calling callback for <Future at 0x130eebed0 state=finished raised EOF>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 234, in process_instance
sandbox = SWEBenchSSHBox.get_box_for_instance(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 96, in get_box_for_instance
sandbox = cls(
^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 61, in __init__
exit_code, output = self.execute('source /swe_util/swe_entry.sh', timeout=600)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 440, in execute
success = self.ssh.prompt(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/pxssh.py", line 506, in prompt
i = self.expect([self.PROMPT, TIMEOUT], timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/spawnbase.py", line 354, in expect
return self.expect_list(compiled_pattern_list,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/spawnbase.py", line 383, in expect_list
return exp.expect_loop(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/expect.py", line 179, in expect_loop
return self.eof(e)
^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/expect.py", line 122, in eof
raise exc
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pxssh.pxssh object at 0x134866910>
command: /usr/bin/ssh
args: [b'/usr/bin/ssh', b'-q', b'-p', b'61976', b'-l', b'opendevin', b'localhost']
buffer (last 100 chars): ''
before (last 100 chars): "Error: This script is intended to be run by the 'root' user only.\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 25042
child_fd: 35
closed: False
timeout: 220
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\[PEXPECT\][\$\#] ')
1: TIMEOUT
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'pexpect.exceptions.EOF'>: End Of File (EOF). Empty string style platform.
<pexpect.pxssh.pxssh object at 0x134866910>
command: /usr/bin/ssh
args: [b'/usr/bin/ssh', b'-q', b'-p', b'61976', b'-l', b'opendevin', b'localhost']
buffer (last 100 chars): ''
before (last 100 chars): "Error: This script is intended to be run by the 'root' user only.\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 25042
child_fd: 35
closed: False
timeout: 220
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\[PEXPECT\][\$\#] ')
1: TIMEOUT
100%|██████████| 1/1 [00:26<00:00, 26.25s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
run_as_devin=True,
Why is it True now?
Good question, I'm not sure. I've set it again in the terminal to run_as_devin=False, and the config file also says run_as_devin=False.
run_as_devin=False
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
10:36:38 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview
10:36:38 - opendevin.core.config:ERROR: config.py:438 - Config file not found: [Errno 2] No such file or directory: 'config.toml'
10:36:38 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=True, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='c2b8a6c3613e464b888e5e8f84cd92e0', debug=False, enable_auto_lint=False
I think it might not be finding the config file and is falling back to the default parameters.
Environment variables need to be set in all caps: RUN_AS_DEVIN=true
10:36:38 - opendevin.core.config:ERROR: config.py:438 - Config file not found: [Errno 2] No such file or directory: 'config.toml'
Yes. Where is 'config.toml' located?
In the root of the opendevin directory. I previously had it inside evaluation/swe_bench and the outcome was the same.
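The error message quotes a bare relative path ('config.toml'), which suggests the file is being looked up relative to the directory the script is launched from. A quick way to check that, as a sketch only: locate_config is a hypothetical helper written for this thread, not part of OpenDevin, and it assumes the lookup really is relative to the current working directory.

```python
from pathlib import Path
from typing import Optional


def locate_config(name: str = "config.toml") -> Optional[Path]:
    """Return the path to `name` if it is a file in the current working directory.

    Hypothetical helper: it assumes the loader opens the bare relative path,
    which is what the 'No such file or directory' message implies.
    """
    candidate = Path.cwd() / name
    return candidate if candidate.is_file() else None


found = locate_config()
if found is None:
    print(f"config.toml not found in {Path.cwd()}")
else:
    print(f"config.toml found at {found}")
```

Run it from the same directory you launch run_infer.sh from; if it reports "not found", the file is sitting somewhere the script will not see it.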
export RUN_AS_DEVIN=true
Here is the error trace now, and the config file is in the OpenDevin/opendevin directory.
JessicaComputer:OpenDevin Jessica$ export RUN_AS_DEVIN=true
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
11:00:37 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview
11:00:37 - opendevin.core.config:ERROR: config.py:438 - Config file not found: [Errno 2] No such file or directory: 'config.toml'
11:00:37 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=True, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='1d769369774348a98a71cdbb82c403fd', debug=False, enable_auto_lint=False
11:00:37 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4
11:00:37 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo', 'max_iterations': 50, 'eval_output_dir': 'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4', 'start_time': '2024-05-30 11:00:37', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'}
11:00:37 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances.
11:00:37 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl
11:00:37 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances.
11:00:37 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo, max iterations 50.
11:00:37 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1
0%| | 0/1 [00:00<?, ?it/s]
11:00:37 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation.
11:00:37 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True
11:00:52 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202.
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell
100%|██████████| 1/1 [00:20<00:00, 20.57s/it]
ERROR:concurrent.futures:exception calling callback for <Future at 0x1347343d0 state=finished raised EOF>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 234, in process_instance
sandbox = SWEBenchSSHBox.get_box_for_instance(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 96, in get_box_for_instance
sandbox = cls(
^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 61, in __init__
exit_code, output = self.execute('source /swe_util/swe_entry.sh', timeout=600)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 440, in execute
success = self.ssh.prompt(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/pxssh.py", line 506, in prompt
i = self.expect([self.PROMPT, TIMEOUT], timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/spawnbase.py", line 354, in expect
return self.expect_list(compiled_pattern_list,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/spawnbase.py", line 383, in expect_list
return exp.expect_loop(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/expect.py", line 179, in expect_loop
return self.eof(e)
^^^^^^^^^^^
File "/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/pexpect/expect.py", line 122, in eof
raise exc
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pxssh.pxssh object at 0x130d3f710>
command: /usr/bin/ssh
args: [b'/usr/bin/ssh', b'-q', b'-p', b'62323', b'-l', b'opendevin', b'localhost']
buffer (last 100 chars): ''
before (last 100 chars): "Error: This script is intended to be run by the 'root' user only.\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 29667
child_fd: 35
closed: False
timeout: 220
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\[PEXPECT\][\$\#] ')
1: TIMEOUT
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pxssh.pxssh object at 0x130d3f710>
command: /usr/bin/ssh
args: [b'/usr/bin/ssh', b'-q', b'-p', b'62323', b'-l', b'opendevin', b'localhost']
buffer (last 100 chars): ''
before (last 100 chars): "Error: This script is intended to be run by the 'root' user only.\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 29667
child_fd: 35
closed: False
timeout: 220
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\[PEXPECT\][\$\#] ')
1: TIMEOUT
ERROR:root: File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'pexpect.exceptions.EOF'>: End Of File (EOF). Empty string style platform.
<pexpect.pxssh.pxssh object at 0x130d3f710>
command: /usr/bin/ssh
args: [b'/usr/bin/ssh', b'-q', b'-p', b'62323', b'-l', b'opendevin', b'localhost']
buffer (last 100 chars): ''
before (last 100 chars): "Error: This script is intended to be run by the 'root' user only.\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 29667
child_fd: 35
closed: False
timeout: 220
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile('\[PEXPECT\][\$\#] ')
1: TIMEOUT
100%|██████████| 1/1 [00:23<00:00, 23.60s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
Sorry, I meant: export RUN_AS_DEVIN=false
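As a rough illustration of why the spelling of the variable matters: on macOS and Linux, environment variable names are case-sensitive, and the value always arrives as a string that the program has to interpret. The sketch below uses a hypothetical env_flag helper and an assumed set of accepted "true" spellings; it is not OpenDevin's actual parsing code, which this thread does not show.

```python
import os

# Assumed spellings that count as "true"; the real loader may accept others.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment (hypothetical helper).

    Returns `default` when the variable is unset; otherwise the value is
    compared case-insensitively against TRUE_VALUES.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


# Equivalent of `export RUN_AS_DEVIN=false` in the shell.
os.environ["RUN_AS_DEVIN"] = "false"
print(env_flag("RUN_AS_DEVIN", default=True))  # False
```

With this kind of lookup, a variable exported under a different spelling is simply never read, and the default stays in effect.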
Here is the error trace now... I still suspect not finding the config file might be contributing to the issue. Would love to know your thoughts though. Thanks!
JessicaComputer:OpenDevin Jessica$ export RUN_AS_DEVIN=false
JessicaComputer:OpenDevin Jessica$ clear
JessicaComputer:OpenDevin Jessica$ export RUN_AS_DEVIN=false
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
11:05:07 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview
11:05:07 - opendevin.core.config:ERROR: config.py:438 - Config file not found: [Errno 2] No such file or directory: 'config.toml'
11:05:07 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=False, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='b066b8e5d1834fd0b448d6459f50efcf', debug=False, enable_auto_lint=False
11:05:07 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4
11:05:07 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo', 'max_iterations': 50, 'eval_output_dir': 'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4', 'start_time': '2024-05-30 11:05:07', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'}
11:05:07 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances.
11:05:07 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl
11:05:07 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances.
11:05:07 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo, max iterations 50.
11:05:07 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1
0%| | 0/1 [00:00<?, ?it/s]
11:05:07 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation.
11:05:07 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True
11:05:23 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202.
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell
100%|██████████| 1/1 [00:43<00:00, 43.67s/it]
ERROR:concurrent.futures:exception calling callback for <Future at 0x12fef8690 state=finished raised ValueError>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 259, in process_instance
state: State = asyncio.run(
^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/core/main.py", line 67, in main
raise ValueError(f'Invalid toml file, cannot read {args.llm_config}')
ValueError: Invalid toml file, cannot read eval_gpt35_0125_preview
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
ValueError: Invalid toml file, cannot read eval_gpt35_0125_preview
ERROR:root: File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'ValueError'>: Invalid toml file, cannot read eval_gpt35_0125_preview
100%|██████████| 1/1 [00:47<00:00, 47.41s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
Run: ls config.toml
Thank you, I was able to fix the issue of the config being in the wrong directory. Now I am getting this error trace.
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
11:19:15 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview
11:19:15 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo-0125', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0.0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:latest', run_as_devin=False, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='cf972a2d8a474e0a8d631f37103423a2', debug=False, enable_auto_lint=True
11:19:15 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4
11:19:15 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo-0125', 'max_iterations': 50, 'eval_output_dir': 'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4', 'start_time': '2024-05-30 11:19:15', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'}
11:19:15 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances.
11:19:15 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl
11:19:15 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo-0125, max iterations 50.
11:19:15 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1
0%| | 0/1 [00:00<?, ?it/s]
11:19:15 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation.
11:19:15 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True
11:19:34 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202.
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell
100%|██████████| 1/1 [00:57<00:00, 57.56s/it]
ERROR:concurrent.futures:exception calling callback for <Future at 0x12e32fd10 state=finished raised BrowserException>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 259, in process_instance
state: State = asyncio.run(
^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/core/main.py", line 92, in main
runtime = ServerRuntime(event_stream=event_stream, sandbox=sandbox)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/server/runtime.py", line 35, in __init__
super().__init__(event_stream, sid, sandbox)
File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/runtime.py", line 74, in __init__
self.browser = BrowserEnv()
^^^^^^^^^^^^
File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/browser/browser_env.py", line 41, in __init__
raise BrowserException('Failed to start browser environment.')
opendevin.runtime.browser.browser_env.BrowserException: Failed to start browser environment.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'opendevin.runtime.browser.browser_env.BrowserException'>: Failed to start browser environment.
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:13<00:00, 73.44s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
Please run the following command to download new browsers: playwright install
Thank you, I have attempted to install it with these instructions, https://playwright.dev/docs/intro#installing-playwright, and it appears to have been installed correctly. But, when I run this command, evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1, I still get the same exact error trace as above with the instruction to install playwright.
Could you reopen the terminal and run again?
Thank you I think this helped!!
I am now getting this problem:
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
Traceback (most recent call last):
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 17, in
Do you think the environment setup is the issue? I suspect the issue is just how the files are structured, or the script not seeing this module.
Yes, because the package has not been installed, and the script runs through poetry. You need to run poetry install
or maybe set PYTHONPATH=`pwd`:$PYTHONPATH
and check
The same structure works for me.
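A quick way to tell whether the import problem is an install/PYTHONPATH issue is to ask Python directly whether it can find the package. This is only a rough sketch: the truncated traceback above does not show which module failed, so the name opendevin is an assumption (taken from the project being installed), and is_importable is a made-up helper name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "opendevin" is an assumed module name; swap in the one from your traceback.
    name = sys.argv[1] if len(sys.argv) > 1 else "opendevin"
    print(f"{name} importable: {is_importable(name)}")
    if not is_importable(name):
        print("sys.path searched:")
        for entry in sys.path:
            print("  ", entry)
```

Run it with the same interpreter the eval script uses (for example through poetry run python). If it prints False, installing the package or adding the repository root to PYTHONPATH should change the result.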
Got it, will do. Thank you. Do you have a suggestion for how to resolve this? I believe all my versions that are described at the top of the page are correct. Thank you.
JessicaComputer:OpenDevin Jessica$ make build
Building project...
Checking dependencies...
Checking system...
macOS detected.
Checking Python installation...
Python 3.11.7 is already installed.
Checking npm installation...
npm 10.7.0 is already installed.
Checking Node.js installation...
Node.js 22.2.0 is already installed.
Checking Docker installation...
Docker version 26.1.1, build 4cf5afa is already installed.
Checking Poetry installation...
Poetry (version 1.8.3) is already installed.
Dependencies checked successfully.
Pulling Docker image...
Using default tag: latest
latest: Pulling from opendevin/sandbox
Digest: sha256:4bd05c581692e26a448bbc6771a21bb27002cb0e6bcf5034d0db91bb8704d0f0
Status: Image is up to date for ghcr.io/opendevin/sandbox:latest
ghcr.io/opendevin/sandbox:latest
What's Next?
View a summary of image vulnerabilities and recommendations → docker scout quickview ghcr.io/opendevin/sandbox
Docker image pulled successfully.
Installing Python dependencies...
/bin/bash: chroma-hnswlib: command not found
Installing ...
Requirement already satisfied: chroma-hnswlib in /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages (0.7.3)
Requirement already satisfied: numpy in /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages (from chroma-hnswlib) (1.26.4)
Installing dependencies from lock file
No dependencies to install or update
Installing the current project: opendevin (0.1.0)
[Errno 13] Permission denied: '/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA'
make[1]: [install-python-dependencies] Error 1
make: [build] Error 2
JessicaComputer:OpenDevin Jessica$
[Errno 13] Permission denied: '/Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA'
Verify the permissions of the file:
ls -l /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
Ensure no processes are locking the file:
lsof /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
Hi, thanks! The outputs are:
JessicaComputer:OpenDevin Jessica$ ls -l /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
-rw-r--r-- 1 root staff 8195 May 30 13:49 /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
and
JessicaComputer:OpenDevin Jessica$ lsof /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
JessicaComputer:OpenDevin Jessica$
What do you suggest as next steps to getting this working?
The file is owned by root with permissions set to rw-r--r--, meaning only the root user has write permissions. No processes are locking the file, so the issue is purely permission-related.
Change the ownership of the file so that the current user (Jessica) can write to it.
sudo chown jessica /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
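If you want to confirm the ownership reasoning above without reading ls -l output by eye, something like the following may help. It is a minimal sketch for POSIX systems (macOS, Linux); describe_access is a hypothetical helper name, not part of OpenDevin or Poetry.

```python
import os
import pwd
import stat


def describe_access(path: str) -> dict:
    """Report the owner, the ls-style mode string, and whether we may write to path."""
    info = os.stat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # the uid has no entry in the user database
    return {
        "owner": owner,
        "mode": stat.filemode(info.st_mode),   # e.g. "-rw-r--r--"
        "writable": os.access(path, os.W_OK),  # evaluated for the current user
    }


if __name__ == "__main__":
    import sys

    for target in sys.argv[1:]:
        print(target, describe_access(target))
```

Pass the METADATA path as an argument. An owner of root together with writable False for your own user would match the Permission denied error, and after the chown the same check should report your user as owner.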
Thank you! Have done this, and am now back at this problem:
JessicaComputer:OpenDevin Jessica$ sudo chown jessica /Users/Jessica/Library/Caches/pypoetry/virtualenvs/opendevin-Ilj8wfey-py3.11/lib/python3.11/site-packages/opendevin-0.1.0.dist-info/METADATA
Password:
JessicaComputer:OpenDevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1
AGENT: CodeActAgent
AGENT_VERSION: v1.4
MODEL_CONFIG: eval_gpt35_0125_preview
EVAL_LIMIT: 1
Traceback (most recent call last):
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 17, in
During setup I had the following issue:
JessicaComputer:OpenDevin Jessica$ make run Running the app... Starting backend server... Waiting for the backend to start... INFO: Started server process [87671] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:3000 (Press CTRL+C to quit) Connection to localhost port 3000 [tcp/hbci] succeeded! Backend started successfully. Starting frontend with npm...
opendevin-frontend@0.1.0 start npm run make-i18n && vite --port 3001
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [run] Error 1 JessicaComputer:OpenDevin Jessica$ sudo chown jessica /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar JessicaComputer:OpenDevin Jessica$ sudo chown jessica /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar JessicaComputer:OpenDevin Jessica$ make run Running the app... Starting backend server... Waiting for the backend to start... Connection to localhost port 3000 [tcp/hbci] succeeded! Backend started successfully. Starting frontend with npm...
opendevin-frontend@0.1.0 start npm run make-i18n && vite --port 3001
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [run] Error 1 JessicaComputer:OpenDevin Jessica$ INFO: Started server process [88162] INFO: Waiting for application startup. INFO: Application startup complete. ERROR: [Errno 48] error while attempting to bind on address ('127.0.0.1', 3000): address already in use INFO: Waiting for application shutdown. INFO: Application shutdown complete.
ERROR: [Errno 48] error while attempting to bind on address ('127.0.0.1', 3000): address already in use
To resolve this port issue, run sudo kill -9 $(sudo lsof -t -i:3000)
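Before killing anything, it can be worth confirming that something is actually listening on the port. A small sketch, assuming the backend address 127.0.0.1:3000 from the log above; port_in_use is a made-up helper name.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP listener already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 3000 in use:", port_in_use(3000))
```

If this prints True while no backend should be running, a stale server from an earlier make run is probably still alive, and lsof -i :3000 will show which process holds the port.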
rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
Change permission for this too.
JessicaComputer:OpenDevin Jessica$ sudo chown jessica /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar JessicaComputer:OpenDevin Jessica$ make run Running the app... Starting backend server... Waiting for the backend to start... INFO: Started server process [91182] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:3000 (Press CTRL+C to quit) Connection to localhost port 3000 [tcp/hbci] succeeded! Backend started successfully. Starting frontend with npm...
opendevin-frontend@0.1.0 start npm run make-i18n && vite --port 3001
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [run] Error 1
sudo chown -R jessica:staff /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar
It will change the ownership of its contents recursively:
JessicaComputer:OpenDevin Jessica$ sudo chown -R jessica:staff /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar JessicaComputer:OpenDevin Jessica$ make run Running the app... Starting backend server... Waiting for the backend to start... Connection to localhost port 3000 [tcp/hbci] succeeded! Backend started successfully. Starting frontend with npm...
opendevin-frontend@0.1.0 start npm run make-i18n && vite --port 3001
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [run] Error 1 JessicaComputer:OpenDevin Jessica$ INFO: Started server process [92156] INFO: Waiting for application startup. INFO: Application startup complete. ERROR: [Errno 48] error while attempting to bind on address ('127.0.0.1', 3000): address already in use INFO: Waiting for application shutdown. INFO: Application shutdown complete. sudo kill -9 $(sudo lsof -t -i:3000) JessicaComputer:OpenDevin Jessica$
To change permissions
sudo chmod -R u+rwx /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar
and run make start-frontend
JessicaComputer:OpenDevin Jessica$ sudo chown -R jessica:staff /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar JessicaComputer:OpenDevin Jessica$ make run Running the app... Starting backend server... Waiting for the backend to start... Connection to localhost port 3000 [tcp/hbci] succeeded! Backend started successfully. Starting frontend with npm...
opendevin-frontend@0.1.0 start npm run make-i18n && vite --port 3001
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [run] Error 1 JessicaComputer:OpenDevin Jessica$ INFO: Started server process [93816] INFO: Waiting for application startup. INFO: Application startup complete. ERROR: [Errno 48] error while attempting to bind on address ('127.0.0.1', 3000): address already in use INFO: Waiting for application shutdown. INFO: Application shutdown complete.
JessicaComputer:OpenDevin Jessica$ sudo chmod -R u+rwx /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar Password: JessicaComputer:OpenDevin Jessica$ make start-frontend Starting frontend...
opendevin-frontend@0.1.0 start npm run make-i18n && vite
opendevin-frontend@0.1.0 make-i18n node scripts/make-i18n-translations.cjs
node:internal/fs/rimraf:202 throw err; ^
Error: EACCES: permission denied, rmdir '/Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar'
at rmdirSync (node:fs:1217:11)
at _rmdirSync (node:internal/fs/rimraf:235:5)
at rimrafSync (node:internal/fs/rimraf:193:7)
at node:internal/fs/rimraf:253:9
at Array.forEach (
Node.js v22.2.0 make: *** [start-frontend] Error 1 JessicaComputer:OpenDevin Jessica$
To check the ownership and permissions:
ls -lR /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar
Thanks! It says this:
Jessicas-Computer:opendevin Jessica$ ls -lR /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar
total 8
-rwxr--r-- 1 Jessica staff 1644 May 30 13:52 translation.json
Check if any process is locking the file by running lsof {path}
Here's the output, let me know if I'm not running it correctly:
Jessicas-Computer:opendevin Jessica$ lsof /Users/Jessica/Downloads/OpenDevin/frontend/public/locales/ar
Jessicas-Computer:opendevin Jessica$
Could you check make start-frontend again? Also, the backend is installed; see your previous issue: https://github.com/OpenDevin/OpenDevin/issues/2140#issuecomment-2139354503
I TRIED TO RUN THE COMMAND AGAIN FOR RUNNING EVAL AND AM NOW EXPERIENCING THIS ISSUE:
Jessicas-Computer:opendevin Jessica$ sudo evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1 AGENT: CodeActAgent AGENT_VERSION: v1.4 MODEL_CONFIG: eval_gpt35_0125_preview EVAL_LIMIT: 1 00:09:45 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview 00:09:45 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo-0125', api_key='**', base_url=None, api_version=None, embedding_model='openai', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/OpenDevin', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/OpenDevin', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=True, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=0, sandbox_timeout=120, github_token='**', jwt_secret='15f96ff117dd42fbb90014fc18b779fd', debug=False, enable_auto_lint=False 00:09:45 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4 00:09:45 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo-0125', 'max_iterations': 50, 'eval_output_dir': 
'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4', 'start_time': '2024-06-02 00:09:45', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'} 00:09:45 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances. 00:09:45 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl 00:09:45 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances. 00:09:45 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo-0125, max iterations 50. 00:09:45 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1 0%| | 0/1 [00:00<?, ?it/s]00:09:45 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation. 00:09:45 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True 00:09:57 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202. 
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/logs/instance_djangodjango-15202.log" to see live logs in a seperate shell 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [00:13<00:00, 13.53s/it]ERROR:concurrent.futures:exception calling callback for <Future at 0x138c43250 state=finished raised Exception> concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker r = call_item.fn(*call_item.args, **call_item.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 234, in process_instance sandbox = SWEBenchSSHBox.get_box_for_instance( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 96, in get_box_for_instance sandbox = cls( ^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 41, in init super().init(container_image, timeout, sid) File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 255, in init__ self.setup_user() File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 315, in setup_user raise Exception(f'Failed to create opendevin user in sandbox: {logs}') Exception: Failed to create opendevin user in sandbox: b'useradd: UID 0 is not unique\n' """
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
Exception: Failed to create opendevin user in sandbox: b'useradd: UID 0 is not unique\n'
ERROR:root: File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'Exception'>: Failed to create opendevin user in sandbox: b'useradd: UID 0 is not unique\n'
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:16<00:00, 16.20s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
WHEN I RUN THE FRONTEND AND CLICK THE http://localhost:3001/ I GET THIS:
WHAT WOULD YOU SUGGEST TO GET THIS WORKING?
@JessChud please note, in the last log:
run_as_devin=True
Can you set this to false? There's a comment above about it, export it or set in config.toml. As far as I know, it needs to be false (so it runs as root) for evals.
Tried to, but for some reason it isn't updating...
Jessicas-Computer:opendevin Jessica$ export RUN_AS_DEVIN=false Jessicas-Computer:opendevin Jessica$ sudo evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1 AGENT: CodeActAgent AGENT_VERSION: v1.4 MODEL_CONFIG: eval_gpt35_0125_preview EVAL_LIMIT: 1 01:09:41 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview 01:09:41 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo-0125', api_key='**', base_url=None, api_version=None, embedding_model='openai', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/opendevin', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/opendevin', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:main', run_as_devin=True, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=0, sandbox_timeout=120, github_token='**', jwt_secret='61c3514eff094811b7d26c3f3c3b25df', debug=False, enable_auto_lint=False 01:09:41 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4 01:09:41 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo-0125', 'max_iterations': 50, 'eval_output_dir': 
'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4', 'start_time': '2024-06-02 01:09:41', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'} 01:09:41 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances. 01:09:41 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl 01:09:41 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances. 01:09:41 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo-0125, max iterations 50. 01:09:41 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1 0%| | 0/1 [00:00<?, ?it/s]01:09:41 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation. 01:09:41 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True 01:09:55 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202. 
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/logs/instance_djangodjango-15202.log" to see live logs in a seperate shell 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [00:16<00:00, 16.69s/it]ERROR:concurrent.futures:exception calling callback for <Future at 0x1360c54d0 state=finished raised Exception> concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker r = call_item.fn(*call_item.args, **call_item.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 234, in process_instance sandbox = SWEBenchSSHBox.get_box_for_instance( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 96, in get_box_for_instance sandbox = cls( ^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/swe_env_box.py", line 41, in init super().init(container_image, timeout, sid) File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 255, in init__ self.setup_user() File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/docker/ssh_box.py", line 315, in setup_user raise Exception(f'Failed to create opendevin user in sandbox: {logs}') Exception: Failed to create opendevin user in sandbox: b'useradd: UID 0 is not unique\n' """
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
ERROR:root:<class 'Exception'>: Failed to create opendevin user in sandbox: b'useradd: UID 0 is not unique\n'
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:20<00:00, 20.52s/it]
Exception ignored in: <function _ExecutorManagerThread.init.
Is it in config.toml? Although... it is a bit odd
Second thing, @JessChud please take a look at this too: in config.toml, make sure to set persist_sandbox=false
[core]
...
persist_sandbox=false
run_as_devin=false
...
Thank you so much! Now I'm experiencing this problem:
Jessicas-Computer:opendevin Jessica$ sudo evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1 AGENT: CodeActAgent AGENT_VERSION: v1.4 MODEL_CONFIG: eval_gpt35_0125_preview EVAL_LIMIT: 1 01:25:00 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview 01:25:00 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo-0125', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0.0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:latest', run_as_devin=False, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=0, sandbox_timeout=120, github_token='**', jwt_secret='1d51ec3264a84251b8185a6227a4b43f', debug=False, enable_auto_lint=True 01:25:00 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4 01:25:00 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo-0125', 'max_iterations': 50, 'eval_output_dir': 
'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4', 'start_time': '2024-06-02 01:25:00', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'} 01:25:00 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances. 01:25:00 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl 01:25:00 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances. 01:25:00 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo-0125, max iterations 50. 01:25:00 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1 0%| | 0/1 [00:00<?, ?it/s]01:25:00 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation. 01:25:00 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True 01:25:15 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance djangodjango-15202. 
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [00:50<00:00, 50.29s/it]ERROR:concurrent.futures:exception calling callback for <Future at 0x13746c9d0 state=finished raised BrowserException> concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker r = call_item.fn(*call_item.args, **call_item.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 259, in process_instance state: State = asyncio.run( ^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run return runner.run(main) ^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete return future.result() ^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/core/main.py", line 92, in main runtime = ServerRuntime(event_stream=event_stream, sandbox=sandbox) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/server/runtime.py", line 35, in init super().init(event_stream, sid, sandbox) File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/runtime.py", line 74, in init self.browser = 
BrowserEnv() ^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/browser/browser_env.py", line 41, in __init__ raise BrowserException('Failed to start browser environment.') opendevin.runtime.browser.browser_env.BrowserException: Failed to start browser environment. """
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
Why are you running as sudo? Only the sandbox needs root permission.
Here it is without sudo:
Jessicas-Computer:opendevin Jessica$ evaluation/swe_bench/scripts/run_infer.sh eval_gpt35_0125_preview CodeActAgent 1 AGENT: CodeActAgent AGENT_VERSION: v1.4 MODEL_CONFIG: eval_gpt35_0125_preview EVAL_LIMIT: 1 08:42:21 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt35_0125_preview 08:42:21 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo-0125', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0.0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:latest', run_as_devin=False, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='aeb9f928d61447e0b00feac3976e45e7', debug=False, enable_auto_lint=True 08:42:21 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4 08:42:21 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo-0125', 'max_iterations': 50, 'eval_output_dir': 
'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4', 'start_time': '2024-06-02 08:42:21', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'} 08:42:21 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances. 08:42:21 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl 08:42:21 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances. 08:42:21 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo-0125, max iterations 50. 08:42:21 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1 0%| | 0/1 [00:00<?, ?it/s]08:42:21 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation. 08:42:21 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True 08:42:35 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance djangodjango-15202. 
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo-0125_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [01:00<00:00, 60.67s/it]ERROR:concurrent.futures:exception calling callback for <Future at 0x13acdcb50 state=finished raised BrowserException> concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker r = call_item.fn(*call_item.args, **call_item.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 259, in process_instance state: State = asyncio.run( ^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run return runner.run(main) ^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete return future.result() ^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/core/main.py", line 92, in main runtime = ServerRuntime(event_stream=event_stream, sandbox=sandbox) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/server/runtime.py", line 35, in init super().init(event_stream, sid, sandbox) File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/runtime.py", line 74, in init self.browser = 
BrowserEnv() ^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/runtime/browser/browser_env.py", line 41, in __init__ raise BrowserException('Failed to start browser environment.') opendevin.runtime.browser.browser_env.BrowserException: Failed to start browser environment. """
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
Sorry I don't quite know which part of the thread you're referring to -- do I what?
I don't need it to browse the internet; I just want to run gpt-3.5-turbo-0125 inference on SWE-bench and then run eval on the results.
@JessChud Could you please pull the latest main? I notice you are running an older version.
Is there an existing issue for the same bug?
Describe the bug
In my config file I have this (I changed it from the default gpt4 setting):
[eval_gpt3.5_0125_preview]
model = "gpt-3.5-turbo-0125"
I run the inference command: /Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/scripts/run_infer.sh eval_gpt3.5_0125_preview CodeActAgent 1
and get the following output. Was wondering if you could help resolve. Thanks!
JessicaComputer:swe_bench Jessica$ /Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/scripts/run_infer.sh eval_gpt3.5_0125_preview CodeActAgent 1 AGENT: CodeActAgent AGENT_VERSION: v1.4 MODEL_CONFIG: eval_gpt3.5_0125_preview EVAL_LIMIT: 1 09:34:30 - opendevin.core.config:INFO: config.py:431 - Loading llm config from eval_gpt3.5_0125_preview 09:34:30 - opendevin:INFO: run_infer.py:330 - Config for evaluation: AppConfig(llm=LLMConfig(model='gpt-3.5-turbo', api_key='**', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='**', aws_secret_access_key='**', aws_region_name=None, num_retries=5, retry_min_wait=3, retry_max_wait=60, timeout=None, max_chars=5000000, temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None), agent=AgentConfig(name='CodeActAgent', memory_enabled=False, memory_max_threads=2), runtime='server', file_store='memory', file_store_path='/tmp/file_store', workspace_base='/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/workspace', workspace_mount_path='/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/workspace', workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', sandbox_container_image='ghcr.io/opendevin/sandbox:latest', run_as_devin=False, max_iterations=100, e2b_api_key='**', sandbox_type='ssh', use_host_network=False, ssh_hostname='localhost', disable_color=False, sandbox_user_id=502, sandbox_timeout=120, github_token='**', jwt_secret='f5e1d9d83dd94b1098b59118fcf43d93', debug=False, enable_auto_lint=True 09:34:30 - opendevin:INFO: run_infer.py:353 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4 09:34:30 - opendevin:INFO: run_infer.py:366 - Metadata: {'agent_class': 'CodeActAgent', 'model_name': 'gpt-3.5-turbo', 'max_iterations': 50, 'eval_output_dir': 
'evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4', 'start_time': '2024-05-30 09:34:30', 'git_commit': '6ff50ed369163592041fdda5a7e9702ce79a17cc'} 09:34:30 - opendevin:INFO: run_infer.py:374 - Limiting evaluation to first 1 instances. 09:34:30 - opendevin:INFO: run_infer.py:378 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl 09:34:30 - opendevin:WARNING: run_infer.py:385 - Output file evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/output.jsonl already exists. Loaded 0 finished instances. 09:34:30 - opendevin:INFO: run_infer.py:390 - Evaluation started with Agent CodeActAgent, model gpt-3.5-turbo, max iterations 50. 09:34:30 - opendevin:INFO: run_infer.py:406 - Finished instances: 0, Remaining instances: 1 0%| | 0/1 [00:00<?, ?it/s]09:34:30 - opendevin:INFO: run_infer.py:427 - Using 8 workers for evaluation. 09:34:30 - opendevin:INFO: run_infer.py:431 - Skipping workspace mount: True 09:34:46 - opendevin:INFO: run_infer.py:214 - Starting evaluation for instance django__django-15202. 
Hint: run "tail -f evaluation/evaluation_outputs/outputs/swe_bench/CodeActAgent/gpt-3.5-turbo_maxiter_50_N_v1.4/logs/instance_django__django-15202.log" to see live logs in a seperate shell 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [01:00<00:00, 60.04s/it]ERROR:concurrent.futures:exception calling callback for <Future at 0x139ffc050 state=finished raised ValueError> concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker r = call_item.fn(*call_item.args, **call_item.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 259, in process_instance state: State = asyncio.run( ^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run return runner.run(main) ^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run return self._loop.run_until_complete(task) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete return future.result() ^^^^^^^^^^^^^^^ File "/Users/Jessica/Downloads/OpenDevin/opendevin/core/main.py", line 67, in main raise ValueError(f'Invalid toml file, cannot read {args.llm_config}') ValueError: Invalid toml file, cannot read eval_gpt3.5_0125_preview """
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
future.result()
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
ValueError: Invalid toml file, cannot read eval_gpt3.5_0125_preview
ERROR:root: File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 340, in _invoke_callbacks
callback(self)
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 416, in update_progress
output = future.result()
^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
File "/Users/Jessica/Downloads/OpenDevin/evaluation/swe_bench/run_infer.py", line 452, in
future.result()
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in get_result
raise self._exception
ERROR:root:<class 'ValueError'>: Invalid toml file, cannot read eval_gpt3.5_0125_preview 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1/1 [01:03<00:00, 63.63s/it] Exception ignored in: <function _ExecutorManagerThread.init..weakref_cb at 0x139fb1800>
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 308, in weakref_cb
AttributeError: 'NoneType' object has no attribute 'util'
Current OpenDevin version
Installation and Configuration
Model and Agent
No response
Operating System
No response
Reproduction Steps
No response
Logs, Errors, Screenshots, and Additional Context
No response