OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0
50.54k stars 4.4k forks source link

Claude 3 - Crash after code execution #1056

Open JeremyAlain opened 3 months ago

JeremyAlain commented 3 months ago

Describe the bug

I am running open interpreter with the new claude 3 model on a google colab.. Here are the parameters.

interpreter.llm.model = "claude-3-opus-20240229" interpreter.llm.context_window = 200000 interpreter.llm.max_tokens = 4000

The model runs fine until it starts executing code. For instance it writes some code to simulate an unbiased coinflip, it executes it, the result of the code (i.e. the output of the python function) actually gets printed out, but then the whole program just blocks and does not continue.

If run the same thing but with gpt-4 it works flawlessly.

Reproduce

  1. open a google colab
  2. Set the following parameter:

interpreter.llm.model = "claude-3-opus-20240229" interpreter.llm.context_window = 200000 interpreter.llm.max_tokens = 4000

  1. interpreter.chat()

  2. type "Run a simulation of 1000 unbiased coinflips and report the result."

Expected behavior

Run code, and then report the result to the user who can then pass new comands.

Screenshots

No response

Open Interpreter version

0.2.0

Python version

3.10.12

Operating System name and version

google colab

Additional context

No response

vaibhavard commented 3 months ago

Similar issue:

{"type":"error","error":{"type":"invalid_request_error","message":"messages: all messages must have non-empty content
except for the optional final assistant message"}}
jonathanhawleypeters commented 3 months ago

Yeah, when Claude3 works with open-interpreter it's amazing, but errors like the following are all too frequent. I can't tell exactly what's going on, but in the middle of the stacktrace, there's:

litellm.llms.anthropic.AnthropicError:
{"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}

This is strange, because open-interpreter has successfully started generating output with Claude's help.

Further down the trace, there's this:

litellm.exceptions.BadRequestError: AnthropicException -
{"type":"error","error":{"type":"invalid_request_error","message":"messages: all messages must have
non-empty content except for the optional final assistant message"}}
Anyway, here's a full example stack trace, hopefully it's of some use. ```error [0307/154004.466361:ERROR:trust_store_mac.cc(750)] Error parsing certificate: ERROR: Failed parsing extensions 33409 bytes written to file /Users/***/Library/Application Support/Open Interpreter/3561784246.png Python Version: 3.11.0 Pip Version: 24.0 Open-interpreter Version: cmd:Interpreter, pkg: 0.2.0 OS Version and Architecture: macOS-14.3-arm64-arm-64bit CPU Info: arm RAM Info: 16.00 GB, used: 8.39, free: 0.32 # Interpreter Info Vision: False Model: claude-3-sonnet-20240229 Function calling: None Context window: None Max tokens: None Auto run: True API base: None Offline: False Curl output: Not local # Messages System Message: You are Open Interpreter, a world-class programmer that can complete any goal by executing code. First, write a plan. **Always recap the plan between each code block** (you have extreme short-term memory loss, so you need to recap the plan between each message block to retain it). When you execute code, it will be executed **on the user's machine**. The user has given you **full and complete permission** to execute any code necessary to complete the task. Execute the code. If you want to send data between programming languages, save the data to a txt or json. You can access the internet. Run **any code** to achieve the goal, and if at first you don't succeed, try again and again. You can install new packages. When a user refers to a filename, they're likely referring to an existing file in the directory you're currently executing code in. Write messages to the user in Markdown. In general, try to **make plans** with as few steps as possible. As for actually executing code to carry out that plan, for *stateful* languages (like python, javascript, shell, but NOT for html which starts from 0 every time) **it's critical not to try to do everything in one code block.** You should try something, print information about it, then continue from there in tiny, informed steps. You will never get it on the first try, and attempting it in one go will often lead to errors you cant see. You are capable of **any** task. {'role': 'user', 'type': 'message', 'content': "let's build a notetaking app"} {'role': 'assistant', 'type': 'message', 'content': "Here's a plan to build a simple note-taking app using HTML, Bulma CSS, and JavaScript:\n\n1. **Plan**: Create a user interface with an input field to enter notes, a button to add notes, and a container to display existing notes.\n\n2. **Set up HTML structure**: Create a centered box with Bulma CSS class... create a new list item with the note text, and append it to the unordered list.\n - Clear the input field after adding a note.\n\n5. **Load existing notes (optional)**: Retrieve previously saved notes from localStorage (or any other storage mechanism) and add them to the unordered list on page load."} {'role': 'user', 'type': 'message', 'content': "let's do it!"} {'role': 'assistant', 'type': 'code', 'format': 'html', 'content': '\n\n< lang="en">\n\n \n \n Note Taking App\n \n