gregsadetsky / sagittarius

A GPT-4/Gemini Voice/Video Exploration Tool

http://sagittarius.greg.technology/


Sagittarius

What is this? A GPT-4/Gemini Voice/Video Exploration Tool!

Do you have an API key for either OpenAI or Gemini? You can use the tool online at http://sagittarius.greg.technology/ -- no need to install anything.

See below for more context:

how to build

- clone this repo and cd into it
- duplicate .env.example and name the copy .env
- fill out the VITE_OPENAI_KEY= value with your OpenAI API key. you must have access to the gpt-4-vision-preview model
  - you can also try out the Gemini API if you have a key -- fill out VITE_GEMINI_KEY in the same .env
- then, run:
  npm install
  npm run dev
- the demo will be running at http://localhost:5173
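
As a rough illustration of how the two .env values above could be consumed, here is a minimal sketch. Vite exposes variables prefixed with VITE_ to client code through `import.meta.env`; the `ApiKeys` type and the `readApiKeys` helper are hypothetical names used for illustration and are not taken from this repo.

```typescript
// Hypothetical helper, not part of the repo: collects the two keys from .env.
interface ApiKeys {
  openai: string | undefined;
  gemini: string | undefined;
}

// `env` would be `import.meta.env` in a Vite app. It is passed in as a plain
// object here so the function can also be exercised outside the browser.
function readApiKeys(env: Record<string, string | undefined>): ApiKeys {
  // Treat missing, empty, or whitespace-only values as "not set".
  const clean = (value: string | undefined): string | undefined => {
    const trimmed = (value ?? "").trim();
    return trimmed.length > 0 ? trimmed : undefined;
  };

  const keys: ApiKeys = {
    openai: clean(env["VITE_OPENAI_KEY"]),
    gemini: clean(env["VITE_GEMINI_KEY"]),
  };

  // At least one of the two providers needs a key for the demo to be useful.
  if (keys.openai === undefined && keys.gemini === undefined) {
    throw new Error("set VITE_OPENAI_KEY or VITE_GEMINI_KEY in .env");
  }
  return keys;
}
```

Inside the app this would be called as `readApiKeys(import.meta.env)`. Keep in mind that anything prefixed with VITE_ is bundled into the client code, so this setup is only suitable for local use with your own keys.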

note: the in-browser speech recognition works best in Google Chrome
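
One likely reason for the Chrome note: the Web Speech API's recognition interface is exposed in Chrome under the prefixed name `webkitSpeechRecognition`, and other browsers may not provide it at all. The sketch below shows feature detection plus setting a dictation language. The `createRecognition` helper and the `RecognitionLike` type are hypothetical and are not the repo's actual code; only `lang`, `continuous`, and `interimResults` are real properties of the browser API.

```typescript
// Minimal shape of the Web Speech API properties used in this sketch.
interface RecognitionLike {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
}
type RecognitionCtor = new () => RecognitionLike;

interface RecognitionScope {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Hypothetical helper: `scope` would be `window` in a browser.
// Returns null when the browser offers no speech recognition.
function createRecognition(scope: RecognitionScope, lang: string): RecognitionLike | null {
  // Prefer the standard name, fall back to the prefixed one used by Chrome.
  const Ctor = scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
  if (!Ctor) {
    return null;
  }
  const recognition = new Ctor();
  recognition.lang = lang;           // BCP 47 tag, e.g. "en-US" or "fr-FR"
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // deliver partial transcripts while speaking
  return recognition;
}
```

In a browser this could be called as `createRecognition(window as any, "en-US")`; a `null` result is a reasonable cue to show a "try Google Chrome" hint.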

TODO

[x] allow input of API keys as <input> on the page
[x] deploy frontend to site i.e. sagittarius.greg.technology via vite+github actions
[x] enable streaming output..!
[x] make new video with 3) streaming output / comparison
[x] enable selection of dictation language
[ ] make new video with 1) uses of repo in the wild / forks 2) UI improvements
[ ] add allcontributors bot
[ ] add dependabot