OpenInterpreter / open-interpreter

A natural language interface for computers
http://openinterpreter.com/
GNU Affero General Public License v3.0
56.06k stars, 4.85k forks

Please add support for Claude Computer Use API #1490

Closed: Emasoft closed this issue 1 month ago

Emasoft commented 1 month ago

Is your feature request related to a problem? Please describe.

Currently, Open-Interpreter does not understand graphical UIs well, so letting it operate the computer's UI on our behalf is still problematic.

Describe the solution you'd like

Please add support for Claude Computer Use API:

https://docs.anthropic.com/en/docs/build-with-claude/computer-use

Essentially, you send the API a screenshot of the current desktop in real time, and Claude answers with the next operations to perform (move the mouse to given coordinates, press a button, click a link, type a URL, drag, etc.) so that Open-Interpreter can carry out those actions autonomously.
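To make the loop above concrete, here is a rough sketch of what I mean. It does not call the real Anthropic API or touch a real screen: `Action`, `FakeDesktop`, `scripted_model`, and `run_agent_loop` are all hypothetical names I made up for illustration, and the action shape is an assumption, not the actual Computer Use schema.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step proposed by the model (hypothetical shape, not the real API schema)."""
    kind: str                      # "mouse_move", "click", "type", or "done"
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop so the sketch runs without GUI access."""
    cursor: Tuple[int, int] = (0, 0)
    typed: str = ""
    clicks: List[Tuple[int, int]] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real implementation would capture pixels; here the state is encoded as text.
        return f"cursor={self.cursor};typed={self.typed}".encode()

    def apply(self, action: Action) -> None:
        if action.kind == "mouse_move":
            self.cursor = (action.x, action.y)
        elif action.kind == "click":
            self.clicks.append(self.cursor)
        elif action.kind == "type":
            self.typed += action.text or ""
        else:
            raise ValueError(f"unknown action: {action.kind}")


def scripted_model(script: List[Action]) -> Callable[[bytes], Action]:
    """Hypothetical stand-in for the API call: replays a fixed list of actions."""
    remaining = iter(script)

    def ask(_screenshot: bytes) -> Action:
        # Once the script is exhausted, report that the task is finished.
        return next(remaining, Action("done"))

    return ask


def run_agent_loop(desktop: FakeDesktop,
                   ask_model: Callable[[bytes], Action],
                   max_steps: int = 10) -> int:
    """Send a screenshot, apply the returned action, repeat until 'done'.

    Returns the number of actions applied to the desktop.
    """
    for step in range(max_steps):
        action = ask_model(desktop.screenshot())
        if action.kind == "done":
            return step
        desktop.apply(action)
    return max_steps


if __name__ == "__main__":
    desktop = FakeDesktop()
    steps = run_agent_loop(desktop, scripted_model([
        Action("mouse_move", x=120, y=40),
        Action("click"),
        Action("type", text="https://openinterpreter.com/"),
    ]))
    print(steps, desktop.cursor, desktop.clicks, desktop.typed)
```

In a real integration, `scripted_model` would be replaced by a call to the Computer Use API and `FakeDesktop` by actual screen capture and input control. The `max_steps` cap is there so a misbehaving model cannot loop forever.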

Image: IMG_6595 (source: https://x.com/alexalbert__/status/1848743043429810361)

Describe alternatives you've considered

No response

Additional context

Some video demos of the Claude Computer Use API are here: https://x.com/rowancheung/status/1848743700702130474

MikeBirdTech commented 1 month ago

Done! Please upgrade to 0.4.0 and run with interpreter --os

Enjoy!