rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
Deepgram streaming API, or whisper tiny/faster model, short chunk duration, and SSE for the transcript. We shouldn't need WebSockets since we can use Tauri for handling any user interactions.
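rough sketch of what the SSE side could look like, one event per transcribed chunk. This only shows the `text/event-stream` wire format; the names (`format_sse`, `transcript_events`, the `transcript` event name, the `index`/`start`/`text` fields) are made up for illustration and not from the existing codebase, and the actual whisper/Deepgram call is left out:

```python
import json
from typing import Iterable, Iterator


def format_sse(data: dict, event: str = "transcript") -> str:
    """Serialize one message in the text/event-stream wire format.

    The event name "transcript" is a hypothetical choice for this sketch.
    """
    # json.dumps emits no raw newlines by default, so one data: line is enough.
    payload = json.dumps(data)
    # A blank line terminates each SSE message.
    return f"event: {event}\ndata: {payload}\n\n"


def transcript_events(chunks: Iterable[str], chunk_seconds: float = 1.0) -> Iterator[str]:
    """Yield one SSE message per transcribed audio chunk.

    `chunks` stands in for text coming back from the speech-to-text engine;
    `chunk_seconds` is the assumed fixed chunk duration.
    """
    for index, text in enumerate(chunks):
        yield format_sse({
            "index": index,                  # position of the chunk in the stream
            "start": index * chunk_seconds,  # approximate start time in seconds
            "text": text,
        })
```

any HTTP handler that responds with `Content-Type: text/event-stream` and writes these strings as they are produced would do; shorter chunks mean lower caption latency but more requests to the model.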
two paths:
just open a WebSocket API that uses whisper or deepgram with new audio code (not based on the existing architecture, or more likely copy-pasted from it)
might use more resources? or maybe not
on the UI side it's just another window that shows up on top of your app and streams the captions
cc @EzraEllette for opinion