Support a more feature-rich transcription API

Speaker diarization would be pretty cool, so this could be something we look into. I think there is a fork of OpenAI's Whisper API (which we currently use) that could be used to do this, but it may only work in Python. If so, we might have to do some backend work, which I am scared of. On the other hand, we may be able to get it running locally instead of through an API call, which would be pretty cool. There is also Google Cloud Speech-to-Text, which has diarization, but it would require a separate API key, and I'm not sure whether it supports translation as well as Whisper does.
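To make the idea a bit more concrete, here is a rough sketch of how diarization output might be combined with a transcript, whichever backend we end up using. It assumes we can get timestamped transcript segments from one source and timestamped speaker turns from another, and it labels each segment with the speaker whose turn overlaps it the most. The names `Segment`, `SpeakerTurn`, and `label_segments` are made up for illustration; nothing here calls Whisper, the fork, or Google Cloud, and the real output formats may well differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed chunk of audio (hypothetical shape), times in seconds."""
    start: float
    end: float
    text: str


@dataclass
class SpeakerTurn:
    """A span of audio attributed to one speaker (hypothetical shape)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return how many seconds two time ranges share (0.0 if they don't touch)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments: list[Segment], turns: list[SpeakerTurn]) -> list[tuple[str, str]]:
    """Pair each segment's text with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in segments:
        best_speaker = "unknown"  # used when no turn overlaps the segment
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_overlap = shared
                best_speaker = turn.speaker
        labeled.append((best_speaker, seg.text))
    return labeled


if __name__ == "__main__":
    segments = [Segment(0.0, 2.0, "Hi"), Segment(2.0, 5.0, "Hello there")]
    turns = [SpeakerTurn(0.0, 2.2, "A"), SpeakerTurn(2.2, 5.0, "B")]
    for speaker, text in label_segments(segments, turns):
        print(f"{speaker}: {text}")
```

If the backend we pick already returns speaker labels per word or per segment, this matching step would not be needed at all.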

Carsonthemonkey / GIST
