haritheja-e / robot-utility-models

Robot Utility Models are trained on a diverse set of environments and objects, and then can be deployed in novel environments with novel objects without any further data or training.
https://robotutilitymodels.com
MIT License

How does OpenAIClient work? #4

Closed: lazywking closed this issue 1 month ago

lazywking commented 1 month ago

I couldn't find anywhere that OpenAIClient is called. How does OpenAIClient work in the project?

haritheja-e commented 1 month ago

Hi! Thank you for your interest in our work! We didn’t include the code for running with GPT auto-retrying in our release because we hadn’t made it compatible with the UI (we tried to keep the UI minimal for ease of use), and it requires command-line use. I just pushed this code to a new branch, gpt_retry, if you’d like to check it out. OpenAIClient is called through the _query_gpt function, which is called here, and it can be run from the command line by entering the instruction e + ↵ when prompted after running run.py.

How it works:

  1. At the end of a fixed time interval, we send every other image captured during the run to GPT-4o (from either the wrist camera or the head camera on our Hello Robot Stretch, depending on the task) along with a task-specific prompt. The response from GPT indicates whether the task succeeded or failed. All prompts can be found here.
  2. If GPT-4o classifies the run as a failure, the run is retried from a new initial position.
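
For readers who want a rough picture of the two steps above, here is a minimal sketch of the control flow only. It is not the code on the gpt_retry branch: `subsample_frames`, `run_with_retry`, `run_episode`, `classify_success`, and the `max_attempts` cap are all hypothetical names and assumptions introduced for illustration. The GPT-4o call is represented by a plain callable so the sketch runs without network access.

```python
from typing import Callable, List, Sequence


def subsample_frames(frames: Sequence[bytes], step: int = 2) -> List[bytes]:
    """Keep every `step`-th frame (every other frame when step == 2)."""
    return list(frames[::step])


def run_with_retry(
    run_episode: Callable[[int], Sequence[bytes]],
    classify_success: Callable[[List[bytes]], bool],
    max_attempts: int = 3,  # assumption: the thread does not state a retry cap
) -> int:
    """Run the task until the classifier reports success.

    `run_episode(attempt)` is a hypothetical stand-in for executing the policy
    for a fixed time interval from a new initial position and returning the
    captured images. `classify_success(images)` stands in for the GPT-4o query
    with a task-specific prompt.

    Returns the 1-based attempt number that succeeded, or 0 if none did.
    """
    for attempt in range(1, max_attempts + 1):
        frames = run_episode(attempt)
        # Step 1: send only every other captured image to the classifier.
        if classify_success(subsample_frames(frames)):
            return attempt
        # Step 2: a failure verdict falls through to the next attempt.
    return 0
```

In a real setup, `classify_success` would wrap the vision-model request and parse its answer into a boolean, and `run_episode` would handle repositioning the robot before each attempt.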

If you’d be interested in having this code compatible with the UI, let us know. Thanks!