OthersideAI / self-operating-computer

A framework to enable multimodal models to operate a computer.
https://www.hyperwriteai.com/self-operating-computer
MIT License
8.68k stars, 1.15k forks

add claude support #180

Closed roywei closed 6 months ago

roywei commented 6 months ago

What does this PR do?

Adds support for the Claude 3 APIs. Tested with Claude 3 Opus, and it passed evaluation. A few things I had to work around:

  1. JSON mode is not very reliable. I had to add the JSON instruction to the system prompt and to each user message, and also add a try/except so the model can rework its output. Claude likes to add a short explanation at the end of the response.
  2. Downsized the screenshot due to the 5MB vision API limit.
  3. Limited the OCR text, since Claude tends to output long text to click on.
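
For readers hitting the same JSON issue in item 1, here is a minimal sketch of one way to pull the JSON out of a reply that has a short explanation appended. It is not the code from this PR; `extract_json` is a hypothetical helper, and the operation format in the example string is made up for illustration. It relies only on `json.JSONDecoder.raw_decode` from the standard library, which parses one JSON value and ignores whatever follows it.

```python
import json


def extract_json(text: str):
    """Return the first JSON object or array found in `text`.

    Hypothetical helper: scans for an opening bracket and lets the
    decoder parse one value from there, ignoring any trailing prose.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char not in "[{":
            continue
        try:
            # raw_decode returns (value, end_index); text after end_index is ignored.
            value, _end = decoder.raw_decode(text, index)
            return value
        except json.JSONDecodeError:
            # This bracket did not start valid JSON; keep scanning.
            continue
    raise ValueError("no JSON object or array found in model output")


# Example reply shape (invented for illustration, not taken from the PR).
reply = '[{"operation": "click", "text": "Submit"}]\n\nI clicked the Submit button.'
operations = extract_json(reply)
```

If `extract_json` raises `ValueError`, the caller can still fall back to asking the model to redo its output, as described in item 1.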


joshbickett commented 6 months ago

Thanks for this PR. I was just thinking we should add Claude.

I'll review this week

joshbickett commented 6 months ago

This PR looks good. The only thing I notice that may not work as expected is the gpt_4_fallback function, which may not work with the message structure for Claude. I'll take a closer look tomorrow and approve if all is good.

roywei commented 6 months ago

Nice catch! Updated, and the fallback seems to be working.

[Image: Screenshot 2024-03-18 at 10 21 49 PM]

joshbickett commented 6 months ago

@roywei PR looks good now. Made a few slight adjustments and merged!

Thanks again for this PR. I'll let the community know about it on Twitter and mention you.

joshbickett commented 6 months ago

I've been wanting to post about it, but I think Anthropic has crashed :( lol

anthropic.InternalServerError: Error code: 529 - {'type': 'error', 'error': {'type': 'overloaded_error', 'message': 'Overloaded'}}

joshbickett commented 6 months ago

@roywei With some more testing, I still see the screenshot hit the size limit on Anthropic's API. We may need to compress it further when it exceeds the threshold. I'll add that tomorrow morning, or if you want to work on it, feel free to open a new PR.
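
A rough sketch of the "compress it more when it surpasses the threshold" idea, for anyone picking this up. Nothing here comes from the repository: `compress_to_fit`, `pillow_jpeg_encoder`, the scale and quality steps, and `MAX_BYTES` (taken from the 5MB figure in the PR description) are all assumptions. The thread does not say whether the limit applies to the raw or the base64-encoded bytes, so the sketch checks the base64 length to be conservative, since base64 inflates data by roughly a third.

```python
import base64
import io

# Assumed threshold, based on the 5MB figure mentioned in the PR description.
MAX_BYTES = 5 * 1024 * 1024


def encoded_size(raw: bytes) -> int:
    """Size of the payload after base64 encoding."""
    return len(base64.b64encode(raw))


def compress_to_fit(encode, max_bytes=MAX_BYTES,
                    scales=(1.0, 0.75, 0.5, 0.35, 0.25),
                    qualities=(85, 70, 55)):
    """Try progressively smaller scale/quality settings until the image fits.

    `encode(scale, quality)` must return the encoded image bytes.
    """
    for scale in scales:
        for quality in qualities:
            data = encode(scale, quality)
            if encoded_size(data) <= max_bytes:
                return data
    raise ValueError("could not compress the screenshot below the size limit")


def pillow_jpeg_encoder(path):
    """Build an `encode` callable for a screenshot file, using Pillow."""
    from PIL import Image  # imported lazily so the loop above has no Pillow dependency

    image = Image.open(path).convert("RGB")  # JPEG has no alpha channel

    def encode(scale, quality):
        size = (max(1, int(image.width * scale)), max(1, int(image.height * scale)))
        buffer = io.BytesIO()
        image.resize(size).save(buffer, format="JPEG", quality=quality)
        return buffer.getvalue()

    return encode
```

Usage would look something like `compress_to_fit(pillow_jpeg_encoder("screenshot.png"))`, where the file name is a placeholder. Keeping the loop separate from the encoder makes it easy to swap in a different format or step sizes if JPEG artifacts hurt OCR or click accuracy.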