Closed: nhan000 closed this issue 1 year ago
For Whisper there's just one model exposed. The transcription quality depends on the quality of the audio recording.
Per the OpenAI documentation you can use the large model (see the "Models" page in the OpenAI API docs).
Edit: I checked my OpenAI usage, and apparently it does use the large model. No idea why the quality of the transcript is that much worse than the results from running it locally.
Hey @nhan000, it should be Whisper v2 large, which they call "whisper-1" in their API. This should be the largest/best model, and it's the only one they have available via their API. I'd be curious about the differences in output between running it locally and in Logseq on the same files.
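For reference, the transcription endpoint takes a single `model` field, and "whisper-1" is the only accepted value right now, so there isn't much for a plugin to expose. A minimal sketch of how a client might assemble such a request (the endpoint and field names are taken from OpenAI's audio transcription docs; the helper function name and file path are made up for illustration):

```python
def build_transcription_request(file_path, api_key,
                                model="whisper-1",
                                temperature=0.0,
                                language=None):
    """Assemble the pieces of a POST to the OpenAI transcription endpoint.

    This only builds the URL, headers, and form fields; actually sending
    the multipart request (with the audio file attached) is left out.
    """
    fields = {"model": model, "temperature": str(temperature)}
    if language:
        # Optional ISO-639-1 language hint; can improve accuracy on
        # non-English recordings.
        fields["language"] = language
    headers = {"Authorization": f"Bearer {api_key}"}
    url = "https://api.openai.com/v1/audio/transcriptions"
    return url, headers, fields

url, headers, fields = build_transcription_request("lecture.m4a", "sk-...")
print(fields["model"])  # whisper-1
```

Even though the model can't be swapped, parameters like `temperature` and `language` can be tuned, which might partly explain quality differences versus a local run with different defaults.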
Hi, I have a question.
Which Whisper model is used to transcribe the audio? It's really fast, but the results are terrible for regular audio recordings (lecture recordings, for example).
Could you add an option for people to choose between different models? Thanks a lot!