AmberSahdev / Open-Interface

Control Any Computer Using LLMs
GNU General Public License v3.0
747 stars 58 forks source link
assistant assistant-computer-control automation gpt gpt4 gpt4v gpt4vision linux llm machine-learning macos openai pyautogui pyinstaller python self-driving self-driving-software windows

Open Interface

Open Interface Logo

Full Autopilot for All Computers Using LLMs

Open Interface

Self-Driving Software for All Your Computers

[![macOS](https://img.shields.io/badge/mac%20os-000000?style=for-the-badge&logo=apple&logoColor=white)](https://github.com/AmberSahdev/Open-Interface?tab=readme-ov-file#install) [![Linux](https://img.shields.io/badge/Linux-FCC624?style=for-the-badge&logo=linux&logoColor=black)](https://github.com/AmberSahdev/Open-Interface?tab=readme-ov-file#install) [![Windows](https://img.shields.io/badge/Windows-0078D6?style=for-the-badge&logo=windows&logoColor=white)](https://github.com/AmberSahdev/Open-Interface?tab=readme-ov-file#install)
[![Github All Releases](https://img.shields.io/github/downloads/AmberSahdev/Open-Interface/total.svg)]((https://github.com/AmberSahdev/Open-Interface/releases/latest)) ![GitHub code size in bytes](https://img.shields.io/github/languages/code-size/AmberSahdev/Open-Interface) ![GitHub Repo stars](https://img.shields.io/github/stars/AmberSahdev/Open-Interface) ![GitHub](https://img.shields.io/github/license/AmberSahdev/Open-Interface) [![GitHub Latest Release)](https://img.shields.io/github/v/release/AmberSahdev/Open-Interface)](https://github.com/AmberSahdev/Open-Interface/releases/latest)

Demo 💻

["Make me a meal plan in Google Docs"]
Make Meal Plan Demo
More Demos


Install 💽

MacOS Logo MacOS
  • Download the MacOS binary from the latest release.
  • Unzip the file and move Open Interface to the Applications Folder.

Apple Silicon M-Series Macs
  • Open Interface will ask you for Accessibility access to operate your keyboard and mouse for you, and Screen Recording access to take screenshots to assess its progress.
  • In case it doesn't, manually add these permission via System Settings -> Privacy and Security

Intel Macs
  • Launch the app from the Applications folder.
    You might face the standard Mac "Open Interface cannot be opened" error.


    In that case, press "Cancel".
    Then go to System Preferences -> Security and Privacy -> Open Anyway.

       

  • Open Interface will also need Accessibility access to operate your keyboard and mouse for you, and Screen Recording access to take screenshots to assess its progress.


  • Lastly, checkout the Setup section to connect Open Interface to LLMs (OpenAI GPT-4V)
Linux Logo Linux
  • Linux binary has been tested on Ubuntu 20.04 so far.
  • Download the Linux zip file from the latest release.
  • Extract the executable and run it from the Terminal via
    ./Open\ Interface
  • Checkout the Setup section to connect Open Interface to LLMs (OpenAI GPT-4V)
Linux Logo Windows
  • Windows binary has been tested on Windows 10.
  • Download the Windows zip file from the latest release.
  • Unzip the folder, move the exe to the desired location, double click to open, and voila.
  • Checkout the Setup section to connect Open Interface to LLMs (OpenAI GPT-4V)

Setup 🛠️

Set up the OpenAI API key - Get your OpenAI API key - Open Interface needs access to GPT-4V to perform user requests. GPT-4V keys can be downloaded from your [OpenAI account](https://platform.openai.com/). - [Follow the steps here](https://help.openai.com/en/articles/8264644-what-is-prepaid-billing) to add balance to your OpenAI account. To unlock GPT-4V a minimum payment of $5 is needed. - [More info](https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4) - Save the API key in Open Interface settings - In Open Interface, go to the Settings menu on the top right and enter the key you received from OpenAI into the text field like so:

Set API key in settings

- After setting the API key for the first time you'll need to restart the app.
Optional: Setup a Custom LLM - Open Interface supports using other OpenAI API style LLMs (such as Llava) as a backend and can be configured easily in the Advanced Settings window. - Enter the custom base url and model name in the Advanced Settings window and the API key in the Settings window as needed.
Set API key in settings

- If your LLM does not support an OpenAI style API, you can use a library like [this](https://github.com/BerriAI/litellm) to convert it to one. - You will need to restart the app after these changes.

Stuff It’s Bad At (For Now) 😬

Future 🔮

(with better models trained on video walkthroughs like Youtube tutorials)

Notes 📝


System Diagram 🖼️

+----------------------------------------------------+
| App                                                |
|                                                    |
|    +-------+                                       |
|    |  GUI  |                                       |
|    +-------+                                       |
|        ^                                           |
|        |                                           |
|        v                                           |
|  +-----------+  (Screenshot + Goal)  +-----------+ |
|  |           | --------------------> |           | |
|  |    Core   |                       |    LLM    | |
|  |           | <-------------------- |  (GPT-4V) | |
|  +-----------+    (Instructions)     +-----------+ |
|        |                                           |
|        v                                           |
|  +-------------+                                   |
|  | Interpreter |                                   |
|  +-------------+                                   |
|        |                                           |
|        v                                           |
|  +-------------+                                   |
|  |   Executer  |                                   |
|  +-------------+                                   |
+----------------------------------------------------+

Star History ⭐️

Star History

Links 🔗

GitHub Repo stars