ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

reroute any audio output to input device with ffmpeg on linux #961

Open hbf731eF opened 1 year ago

hbf731eF commented 1 year ago

To use sound output as an audio input for FFmpeg, you can use the pulse input device that FFmpeg provides. This requires a running PulseAudio sound server (pavucontrol can be used to inspect its devices).

pactl list short sources
ffmpeg -f pulse <input_options> -i <input_device>
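The two commands above can be combined into a small helper. This is only a sketch: `first_monitor_source` and `record_cmd` are hypothetical names, the parser assumes the tab-separated layout printed by `pactl list short sources` (index, name, driver, sample spec, state), and the 16 kHz mono 16-bit WAV settings are an assumption about what whisper.cpp wants as input.

```shell
# Print the name of the first PulseAudio monitor source found on stdin.
# Expects the tab-separated output of `pactl list short sources`.
first_monitor_source() {
    awk -F '\t' '$2 ~ /\.monitor$/ { print $2; exit }'
}

# Hypothetical helper: print (not run) an ffmpeg command that records the
# given PulseAudio source ($1) to a file ($2) as 16 kHz mono 16-bit WAV.
record_cmd() {
    printf 'ffmpeg -f pulse -i %s -ac 1 -ar 16000 -c:a pcm_s16le %s\n' "$1" "$2"
}
```

Something like `src=$(pactl list short sources | first_monitor_source)` followed by `record_cmd "$src" /tmp/out.wav` prints the command to review before running it. If several outputs exist, the first monitor may not be the one you are playing through, so check the list by hand.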

examples/livestream.sh can then be used with any application on Linux that produces sound output, with the following changes:

# continuous stream in native fmt (this file will grow forever!)
-fmt=aac # the audio format extension of the stream (TODO: auto detect)
+fmt=wav

-ffmpeg -loglevel quiet -y -re -probesize 32 -i $url -c copy /tmp/whisper-live0.${fmt} &
+ffmpeg -loglevel quiet -y -re -probesize 32 -f pulse -i 0 -c copy /tmp/whisper-live0.${fmt} &

StuartIanNaylor commented 1 year ago

You can do it with ALSA by just using a loopback (occasionally it's not enabled in the kernel, but it usually is):

sudo modprobe snd-aloop

With a loopback you just play to one side (sink) and it becomes available on the other side (source). No need for PulseAudio, just the base ALSA audio.
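As a rough illustration of the two sides, here is a sketch that assumes the snd-aloop card keeps its default id `Loopback` and that audio played to device 0 shows up for capture on device 1 of the same subdevice. The function names are hypothetical.

```shell
# Print the ALSA device to play into (the "sink" side of the loopback).
# $1 is the subdevice number, defaulting to 0.
loopback_playback_device() {
    printf 'hw:Loopback,0,%s\n' "${1:-0}"
}

# Print the matching ALSA device to record from (the "source" side).
loopback_capture_device() {
    printf 'hw:Loopback,1,%s\n' "${1:-0}"
}
```

After `sudo modprobe snd-aloop`, an application would play to `$(loopback_playback_device)` while something like `ffmpeg -f alsa -i "$(loopback_capture_device)" out.wav` records from the other side. `aplay -l` shows whether the card exists and what it is called on your system.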

schnz commented 1 year ago

If PulseAudio is used, there is also a convenient way to record from any output device (see https://superuser.com/questions/1536203/how-to-redirect-sound-sink-to-source).

PulseAudio automatically creates a monitor for every output device (sink). The monitor is a recording device (source) that plays back anything that goes through the output device, but other applications usually do not recognize it as an input (e.g. in the stream example of whisper.cpp, the monitor devices are not recognized, and as far as I can tell, SDL2 doesn't provide a way to access them). As a workaround, you can simply remap that source to a different name, and voilà: other applications now see it as an input source.

Simply find the monitor device that you want to record from (i.e. the monitor of the output device that you are using to play your sound, e.g., headphones, HDMI, ...) via pactl list sources. Take note of its name and then create the remapped device like this:

pactl load-module module-remap-source master=<name of the monitor device> source_name=virt_mic source_properties=device.description=VirtualMic
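A small wrapper can fill in the placeholder. This is a sketch only: `remap_monitor_cmd` is a hypothetical helper that prints the command instead of running it, and it reuses the source name as the device description to avoid quoting issues.

```shell
# Print (not run) the pactl command that remaps a monitor source ($1)
# to a new source name ($2, defaulting to virt_mic).
remap_monitor_cmd() {
    monitor=$1
    name=${2:-virt_mic}
    printf 'pactl load-module module-remap-source master=%s source_name=%s source_properties=device.description=%s\n' \
        "$monitor" "$name" "$name"
}
```

Running the printed command makes `pactl` print the index of the loaded module. The remap can be undone later with `pactl unload-module module-remap-source`.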

Again, check the superuser.com thread for more information.

StuartIanNaylor commented 1 year ago

The same works with https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#module-loopback, which is just the PulseAudio version of a loopback to a sink. That way you don't have to have a hardware playback device involved as with the ALSA one above.