Const-me / Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
Mozilla Public License 2.0

Application Audio Capture #29

Open theFroh opened 1 year ago

theFroh commented 1 year ago

Hello! Thanks for putting together this project; performance on my (non-Ti) 1080 seems remarkably good!

One thing I'd love to see is the ability to capture streaming audio from an application rather than from a microphone, something like https://github.com/bozbez/win-capture-audio.

Cheers!

bigdoug2005 commented 1 year ago

Being able to capture live audio from a microphone and an audio stream concurrently would be an awesome way to quickly capture virtual meeting notes.

emcodem commented 1 year ago

In my mind, such functionality has no place in a library like this. This library is speech-to-text. Other libraries route audio from an application to a device. Yet other libraries decode arbitrary audio files, capture from unusual sources, or apply smart filtering. None of this belongs in a speech-to-text library, if you ask me (well, OK, it's not like anyone asked me :D). A speech-to-text library should concentrate on its own task and just let others interface with it in a generic way, which this library already does.

You can just use the existing live interface, stream.exe, together with one of the many existing tools, such as VB-Audio Virtual Cable, to accomplish your specific "capture from application" task.