Note: This project is currently in alpha and prone to breaking changes.
This project is a browser extension for creating web-based automations on the fly. Capture different actions automatically for later replay or create fine-tuned steps in an intuitive, no-code way.
Because this project is still early on, we have not released it on the various extension platforms. Load the extension manually instead. Use pnpm to build the project and manage dependencies. Install and run the following:
$ pnpm install
$ pnpm build
This will produce a dist/chrome-mv3
directory containing the unpacked
extension. Navigate to Chrome > Manage Extensions > Load unpacked
and select
this directory. Consider pinning the newly installed extension.
The side panel contains four top-level tabs. We touch on each in turn.
This is the primary workhorse of the application. Here you can provide a sequence of steps that can be replayed automatically later on. As of now, we have the following steps available:
Extracting
Triggering this step converts your cursor to a crosshair and shows outlines around elements you hover over. On each click, the plugin will record which element you targeted. Name each capture you perform in this way for interpolating into subsequent steps.
Navigating
Very simply navigates to the specified URL. You can use interpolated values in the required field. For example, you can specify a URL like:
https://github.com/{PROFILE}
if you have a PROFILE
interpolation available to you.
OpenAI
Allows sending a custom chat completion request to OpenAI. The response is always formatted as a JSON object, though you are free to specify what the keys of the object are. These key/value pairs will be available for interpolating into subsequent steps.
Furthermore, you can use any interpolated values already available in both the system and user prompt.
Recording
Allows interacting with the webpage as normal, but captures clicks and key presses automatically.
A repository of all previously saved scripts. From here you can edit, delete, or execute (i.e. run) the script.
After running a script from your library, this tab will populate with details on each step as they execute. It will also surface any errors encountered during.
One particular step available in the builder tab is the OpenAI
step. This
sends a custom chat completion request to OpenAI, but
requires an API key. You can configure the API key from here.
We use the WXT framework to give room for cross-platform functionality, though efforts are currently focused on Chrome and Chromium-based browsers. As such, organization of files mirror WXT's recommendations:
assets
components/ui
components/icons
entrypoints/background.ts
entrypoints/*.content
entrypoints/sidepanel
public
utils
Formatting depends on prettier. A pre-commit
hook is
included in .githooks
that can be used to format all *.jsx?
and *.tsx?
files prior to commit. Install via:
$ git config --local core.hooksPath .githooks/