Open abrichr opened 2 weeks ago
Feature request
(e.g. like https://github.com/nicholasoxford/computer-use-mac-demo or https://github.com/ashbuilds/computer-use, i.e. based on https://docs.anthropic.com/en/docs/build-with-claude/computer-use)
We will be using ell, which requires building from this PR: https://github.com/OpenAdaptAI/OpenAdapt/pull/888
In https://github.com/OpenAdaptAI/OpenAdapt/issues/882 we are implementing a new strategy that simply uses Anthropic's model on the backend to directly predict coordinates. However, I think we also want to extend their reference implementation (https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo/computer_use_demo) to embed actions recorded by OpenAdapt, e.g. via a tool.
I'm not sure exactly how this should work. I think the first step is to understand their code well enough to suggest an approach.
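To make the "with a tool" idea a bit more concrete, here is a rough sketch of one possible shape. It is only an assumption for discussion, not taken from the reference implementation or from OpenAdapt: `RecordedAction`, `get_recorded_actions`, and `max_actions` are hypothetical names. The tool definition follows the general `name` / `description` / `input_schema` layout used for custom tools in the Anthropic Messages API, and the handler just serializes recorded actions so they could be returned to the model as a tool result.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecordedAction:
    """Hypothetical, simplified stand-in for an action recorded by OpenAdapt."""

    name: str  # e.g. "click" or "type"
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


# Hypothetical custom tool definition (name/description/input_schema layout).
RECORDED_ACTIONS_TOOL = {
    "name": "get_recorded_actions",
    "description": (
        "Return the actions from a prior user recording, in order, "
        "so they can be used as a demonstration of the task."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "max_actions": {
                "type": "integer",
                "description": "Maximum number of actions to return.",
            },
        },
        "required": [],
    },
}


def handle_get_recorded_actions(recording, tool_input):
    """Serialize up to `max_actions` recorded actions as a JSON string."""
    max_actions = tool_input.get("max_actions", len(recording))
    return json.dumps([asdict(action) for action in recording[:max_actions]])


if __name__ == "__main__":
    demo = [
        RecordedAction("click", x=100, y=200),
        RecordedAction("type", text="hello"),
    ]
    print(handle_get_recorded_actions(demo, {"max_actions": 1}))
```

In this sketch the recording is exposed on demand rather than pasted into the prompt, so the model decides when to consult it. Whether that is better than embedding the actions directly in the system prompt is an open question.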
Motivation
New paradigm
https://huggingface.co/spaces/orby-osu/UGround