For some languages with poor or no automatic transcription models (e.g. Tibetan), it would be super helpful to have a mode where automatic transcription is not run at all. Instead, the audio could be split into chunks and we could offer a UI for transcribing each chunk manually.
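To make the chunking idea concrete, here is a rough sketch of what the manual mode could produce: fixed-length segments with empty text, ready to be filled in by hand. All names here (`Segment`, `manual_segments`, the 10-second default) are hypothetical and not taken from the existing codebase; splitting on silence instead of fixed lengths would probably work better in practice.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One chunk of audio awaiting manual transcription (hypothetical type)."""
    start: float    # offset into the recording, in seconds
    end: float      # offset into the recording, in seconds
    text: str = ""  # left empty; to be typed in by the user


def manual_segments(total_seconds: float, chunk_seconds: float = 10.0) -> list[Segment]:
    """Split a recording into fixed-length segments without running any model."""
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds must be > 0")
    count = math.ceil(total_seconds / chunk_seconds)
    segments = []
    for i in range(count):
        start = i * chunk_seconds
        # The last segment is shortened so it never runs past the end of the audio.
        end = min(start + chunk_seconds, total_seconds)
        segments.append(Segment(start=start, end=end))
    return segments


if __name__ == "__main__":
    for seg in manual_segments(25.0, chunk_seconds=10.0):
        print(f"{seg.start:6.1f}s to {seg.end:6.1f}s  text={seg.text!r}")
```

The UI would then only need to play back each segment's time range and store whatever the user types into `text`.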