sveinbjornt / hear

Command line speech recognition and transcription for macOS
https://sveinbjorn.org/hear
BSD 3-Clause "New" or "Revised" License

In-computer audio input #31

Open cepsong opened 4 months ago

cepsong commented 4 months ago

Is there any way to take audio input from inside the computer (from any application), perhaps via a virtual audio driver? For now, hear can only take input from the default microphone. Thanks. @sveinbjornt @MrYakobo @adisidev

sveinbjornt commented 4 months ago

You should be able to change your default audio input source in System Settings to a virtual audio driver and everything should work. The code in hear just uses the default input source.

cepsong commented 4 months ago

> You should be able to change your default audio input source in System Settings to a virtual audio driver and everything should work. The code in hear just uses the default input source.

Thank you @sveinbjornt. My use case needs two inputs at once: my default external microphone picks up my own voice, while a virtual audio driver carries what I hear (a video, or other speakers in a virtual meeting), and only the latter should be transcribed, not my own speech. Could hear be modified to take input from a non-default device, such as any virtual audio driver? Thanks so much.

sveinbjornt commented 4 months ago

Why not do this post hoc? Does this need to be (near) live? If you have both audio channels written to disk there is no problem with post hoc processing using input files.

cepsong commented 4 months ago

> Why not do this post hoc? Does this need to be (near) live? If you have both audio channels written to disk there is no problem with post hoc processing using input files.

Thank you @sveinbjornt. Yes, it has to be live: we are developing real-time ASR for meetings in which the user is speaking, at the same time, in a different language to another program or channel. The virtual audio driver therefore needs to send its output to hear as input, while the default microphone serves as the user's input for the other program or channel. Can this be done? Thanks.

sveinbjornt commented 4 months ago

Should be simple enough to modify the hear source code to do what you want. Maybe I'll implement a flag to specify audio input device at a future date. Keeping this open.
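
For illustration only, the lookup that such a flag would need can be sketched as below. This is not code from hear, which currently just uses the default input source; the `AudioDevice` record, the `pick_input_device` helper, and the device names are all hypothetical, and a real implementation would obtain the device list from the macOS audio APIs rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AudioDevice:
    """Hypothetical record for one audio device reported by the system."""
    name: str
    has_input: bool
    is_default_input: bool = False


def pick_input_device(devices: Sequence[AudioDevice],
                      requested: Optional[str] = None) -> AudioDevice:
    """Return the input device named `requested`, or the default input if None."""
    # Only devices that can actually capture audio are candidates.
    inputs = [d for d in devices if d.has_input]

    if requested is None:
        # No device requested: fall back to the system default input.
        for d in inputs:
            if d.is_default_input:
                return d
        raise LookupError("no default input device found")

    # Match the requested name case-insensitively, ignoring outer whitespace.
    wanted = requested.strip().casefold()
    for d in inputs:
        if d.name.casefold() == wanted:
            return d
    raise LookupError(f"no input device named {requested!r}")


# Made-up device list for illustration.
DEVICES = [
    AudioDevice("External Microphone", has_input=True, is_default_input=True),
    AudioDevice("Virtual Loopback", has_input=True),
    AudioDevice("Built-in Speakers", has_input=False),
]
```

With a list like this, `pick_input_device(DEVICES)` selects the default microphone, while `pick_input_device(DEVICES, "Virtual Loopback")` selects the virtual driver, leaving the microphone free for another program. Output-only devices are rejected with a `LookupError`.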

cepsong commented 4 months ago

> Should be simple enough to modify the hear source code to do what you want. Maybe I'll implement a flag to specify audio input device at a future date. Keeping this open.

That is highly appreciated, @sveinbjornt. As a side note, would it be possible to modify and extend this to iOS, since the services are all provided by Apple?

sveinbjornt commented 4 months ago

Well, hear uses a macOS API (which is probably also available on iOS), but people don't run command line programs on iOS, so that's probably a no-go.