Sagittarius
What is this? A GPT-4/Gemini Voice/Video Exploration Tool!
Do you have an API key from either OpenAI or Gemini? You can use the tool online! No need to install anything.
See below for more context:
how to build
- clone this repo, cd into it
- duplicate .env.example and name the copy .env
- fill out the VITE_OPENAI_KEY= value with your OpenAI API key. you must have access to the gpt-4-vision-preview model
- you can also try out the Gemini API if you have a key: fill out VITE_GEMINI_KEY in the same .env
- then, run:
npm install
npm run dev
- the demo will be running at http://localhost:5173
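Vite only exposes variables prefixed with VITE_ to client code, through import.meta.env, which is why both keys above carry that prefix. Here is a minimal sketch of how the two values might be read. The helper name readApiKeys and its return shape are hypothetical and not taken from this repo.

```typescript
// Hypothetical helper, for illustration only; not part of this repo.
type ApiKeys = { openai?: string; gemini?: string };

// `env` is expected to look like Vite's import.meta.env. It is passed in
// as a parameter so the function can also run outside of a Vite build.
function readApiKeys(env: Record<string, string | undefined>): ApiKeys {
  // Treat empty or whitespace-only values (e.g. an unfilled
  // `VITE_OPENAI_KEY=` line) as missing.
  const clean = (value: string | undefined): string | undefined => {
    const trimmed = value?.trim();
    return trimmed ? trimmed : undefined;
  };
  return {
    openai: clean(env.VITE_OPENAI_KEY),
    gemini: clean(env.VITE_GEMINI_KEY),
  };
}

// Inside a Vite app this would be called as: readApiKeys(import.meta.env)
// The key value below is a placeholder, not a real key.
const keys = readApiKeys({ VITE_OPENAI_KEY: "sk-example", VITE_GEMINI_KEY: "" });
console.log("OpenAI key set:", keys.openai !== undefined);
console.log("Gemini key set:", keys.gemini !== undefined);
```

Returning undefined for a blank value lets the calling code offer only the providers that actually have a key filled in.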
note: the in-browser speech recognition works best in Google Chrome
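The note above probably comes down to browser support for the Web Speech API, which Chrome exposes under the prefixed name webkitSpeechRecognition. Here is a minimal feature-detection sketch, assuming the demo relies on that API. The helper createRecognizer and the RecognitionLike interface are hypothetical names, not code from this repo.

```typescript
// Minimal shape of the parts of the Web Speech API used in this sketch.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
}
type RecognitionCtor = new () => RecognitionLike;

// Hypothetical helper, for illustration only. `scope` is normally `window`;
// it is a parameter so the function can be exercised outside a browser.
function createRecognizer(
  scope: Record<string, unknown>,
  lang: string
): RecognitionLike | null {
  // Prefer the standard name, then fall back to the prefixed one.
  const Ctor = (scope.SpeechRecognition ?? scope.webkitSpeechRecognition) as
    | RecognitionCtor
    | undefined;
  if (typeof Ctor !== "function") return null; // speech recognition unavailable

  const recognizer = new Ctor();
  recognizer.lang = lang; // BCP 47 tag for the dictation language, e.g. "en-US"
  recognizer.interimResults = true; // deliver partial transcripts while speaking
  return recognizer;
}

// In a browser this might be called as:
//   createRecognizer(window as unknown as Record<string, unknown>, "en-US")
```

A null return means the browser offers neither constructor, so the page can fall back to typed input.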
TODO
- [x] allow input of API keys as <input> on the page
- [x] deploy frontend to site, i.e. sagittarius.greg.technology, via Vite + GitHub Actions
- [x] enable streaming output..!
- [x] make new video with 3) streaming output / comparison
- [x] enable selection of dictation language
- [ ] make new video with 1) uses of repo in the wild / forks 2) UI improvements
- [ ] add allcontributors bot
- [ ] add dependabot