Open twilwa opened 1 month ago
@twilwa can you try to do
screenpipe --list-audio-devices
and then
play some video with voice and then
screenpipe --audio-device "<the name of it>" --disable-vision
just to confirm audio output does not work
Maybe a new issue is needed, or perhaps this is expected behavior -- when building from source, screenpipe isn't located in /usr/local/bin or added to $PATH. I can run the binary okay from the installation location, just as a heads up. Audio devices output:
anon@pop-os:~/repos/screenpipe/target/release$ ./screenpipe --list-audio-devices
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dmix.c:999:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dmix.c:999:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:999:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:972:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dmix.c:999:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dsnoop.c:540:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
available audio devices:
jack (input)
pipewire (input)
pulse (input)
default (input)
hw:CARD=Device,DEV=0 (input)
plughw:CARD=Device,DEV=0 (input)
sysdefault:CARD=Device (input)
front:CARD=Device,DEV=0 (input)
surround40:CARD=Device,DEV=0 (input)
iec958:CARD=Device,DEV=0 (input)
dsnoop:CARD=Device,DEV=0 (input)
hw:CARD=Generic,DEV=0 (input)
hw:CARD=Generic,DEV=2 (input)
plughw:CARD=Generic,DEV=0 (input)
plughw:CARD=Generic,DEV=2 (input)
sysdefault:CARD=Generic (input)
front:CARD=Generic,DEV=0 (input)
surround40:CARD=Generic,DEV=0 (input)
surround51:CARD=Generic,DEV=0 (input)
surround71:CARD=Generic,DEV=0 (input)
dsnoop:CARD=Generic,DEV=0 (input)
dsnoop:CARD=Generic,DEV=2 (input)
hw:CARD=V11,DEV=0 (input)
plughw:CARD=V11,DEV=0 (input)
sysdefault:CARD=V11 (input)
front:CARD=V11,DEV=0 (input)
dsnoop:CARD=V11,DEV=0 (input)
jack (output)
pipewire (output)
pulse (output)
default (output)
hw:CARD=HDMI,DEV=3 (output)
hw:CARD=HDMI,DEV=7 (output)
hw:CARD=HDMI,DEV=8 (output)
hw:CARD=HDMI,DEV=9 (output)
hw:CARD=HDMI,DEV=10 (output)
hw:CARD=HDMI,DEV=11 (output)
plughw:CARD=HDMI,DEV=3 (output)
plughw:CARD=HDMI,DEV=7 (output)
plughw:CARD=HDMI,DEV=8 (output)
plughw:CARD=HDMI,DEV=9 (output)
plughw:CARD=HDMI,DEV=10 (output)
plughw:CARD=HDMI,DEV=11 (output)
hdmi:CARD=HDMI,DEV=0 (output)
hdmi:CARD=HDMI,DEV=1 (output)
hdmi:CARD=HDMI,DEV=2 (output)
hdmi:CARD=HDMI,DEV=3 (output)
hdmi:CARD=HDMI,DEV=4 (output)
hdmi:CARD=HDMI,DEV=5 (output)
dmix:CARD=HDMI,DEV=3 (output)
dmix:CARD=HDMI,DEV=7 (output)
dmix:CARD=HDMI,DEV=8 (output)
dmix:CARD=HDMI,DEV=9 (output)
dmix:CARD=HDMI,DEV=10 (output)
dmix:CARD=HDMI,DEV=11 (output)
hw:CARD=Device,DEV=0 (output)
plughw:CARD=Device,DEV=0 (output)
sysdefault:CARD=Device (output)
front:CARD=Device,DEV=0 (output)
surround40:CARD=Device,DEV=0 (output)
iec958:CARD=Device,DEV=0 (output)
dmix:CARD=Device,DEV=0 (output)
hw:CARD=Generic,DEV=0 (output)
hw:CARD=Generic,DEV=1 (output)
plughw:CARD=Generic,DEV=0 (output)
plughw:CARD=Generic,DEV=1 (output)
sysdefault:CARD=Generic (output)
front:CARD=Generic,DEV=0 (output)
surround40:CARD=Generic,DEV=0 (output)
surround51:CARD=Generic,DEV=0 (output)
surround71:CARD=Generic,DEV=0 (output)
iec958:CARD=Generic,DEV=0 (output)
dmix:CARD=Generic,DEV=0 (output)
dmix:CARD=Generic,DEV=1 (output)
surround21:CARD=Device,DEV=0 (output)
surround41:CARD=Device,DEV=0 (output)
surround50:CARD=Device,DEV=0 (output)
surround51:CARD=Device,DEV=0 (output)
surround71:CARD=Device,DEV=0 (output)
The same behavior occurs when selecting device 'pipewire (output)' or 'pipewire (input)' -- they both capture mic input, not loopback audio. Any particular device you'd be keen to test?
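For anyone scripting tests against the list above: each entry is a backend name followed by an "(input)" or "(output)" suffix, and names like pipewire appear once with each suffix. A minimal sketch of splitting such a label, assuming that format holds for every entry. `DeviceType` and `parse_device_label` are hypothetical names for illustration, not necessarily what screenpipe uses internally.

```rust
/// Direction suffix printed after each name by `--list-audio-devices`.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceType {
    Input,
    Output,
}

/// Splits a label such as "pipewire (output)" into its name and direction.
/// Returns `None` when the label has no recognized suffix.
pub fn parse_device_label(label: &str) -> Option<(&str, DeviceType)> {
    let label = label.trim();
    if let Some(name) = label.strip_suffix("(input)") {
        Some((name.trim_end(), DeviceType::Input))
    } else if let Some(name) = label.strip_suffix("(output)") {
        Some((name.trim_end(), DeviceType::Output))
    } else {
        None
    }
}
```

With this, "pipewire (input)" and "pipewire (output)" resolve to the same backend name and differ only in direction, which is consistent with both selections behaving the same way in the tests described above.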
@twilwa
indeed for path: #421
can you try replacing this part of the code
with just
let config = cpal_audio_device.default_input_config()?
and build and test again with output device, i'm wondering if the behaviour is different on linux maybe
My hunch might be that the
2024-10-09T02:25:39.770707Z INFO screenpipe_audio::stt: device: pipewire (input), resampling from 44100 Hz to 16000 Hz
could be causing some issues -- pipewire (and presumably pipewire-pulse) come with a default setting of, i believe, 44100 Hz.
possibly useful information, pw-metadata -n settings:
anon@pop-os:~/repos/screenpipe/target/release$ pw-metadata -n settings
Found "settings" metadata 31
update: id:0 key:'log.level' value:'2' type:''
update: id:0 key:'clock.rate' value:'48000' type:''
update: id:0 key:'clock.allowed-rates' value:'[ 44100, 48000, 88200, 96000, 176400, 192000, 352800, 384000 ]' type:''
update: id:0 key:'clock.quantum' value:'1024' type:''
update: id:0 key:'clock.min-quantum' value:'32' type:''
update: id:0 key:'clock.max-quantum' value:'2048' type:''
update: id:0 key:'clock.force-quantum' value:'0' type:''
update: id:0 key:'clock.force-rate' value:'0' type:''
pw-metadata settings:
anon@pop-os:~/repos/screenpipe/target/release$ pw-metadata settings
Found "default" metadata 36
update: id:0 key:'default.configured.audio.sink' value:'{"name":"alsa_output.usb-0c76_USB_PnP_Audio_Device-00.analog-stereo"}' type:'Spa:String:JSON'
update: id:0 key:'default.configured.audio.source' value:'{"name":"alsa_input.usb-0c76_USB_PnP_Audio_Device-00.mono-fallback"}' type:'Spa:String:JSON'
update: id:0 key:'default.audio.sink' value:'{"name":"alsa_output.usb-0c76_USB_PnP_Audio_Device-00.analog-stereo"}' type:'Spa:String:JSON'
update: id:0 key:'default.audio.source' value:'{"name":"alsa_input.usb-0c76_USB_PnP_Audio_Device-00.mono-fallback"}' type:'Spa:String:JSON'
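Note that the settings dump above reports clock.rate as 48000, while the screenpipe log line reports 44100 Hz for the device. A small sketch for pulling that value out of pasted `pw-metadata -n settings` output, assuming every entry follows the `update: id:0 key:'...' value:'...' type:'...'` layout shown above. The function names are hypothetical.

```rust
/// Returns the text between the first `start` marker and the next `end` marker.
fn between<'a>(s: &'a str, start: &str, end: &str) -> Option<&'a str> {
    let from = s.find(start)? + start.len();
    let len = s[from..].find(end)?;
    Some(&s[from..from + len])
}

/// Extracts `(key, value)` from one line of the form
/// `update: id:0 key:'clock.rate' value:'48000' type:''`.
/// Lines without that shape (such as the "Found ..." header) yield `None`.
pub fn parse_metadata_line(line: &str) -> Option<(&str, &str)> {
    let key = between(line, "key:'", "'")?;
    let value = between(line, "value:'", "' type:")?;
    Some((key, value))
}

/// Finds `clock.rate` in the full command output and parses it as Hz.
pub fn clock_rate(output: &str) -> Option<u32> {
    output
        .lines()
        .filter_map(parse_metadata_line)
        .find(|(key, _)| *key == "clock.rate")
        .and_then(|(_, value)| value.parse().ok())
}
```

This only reads the graph clock setting. It does not say which rate an individual ALSA device label reports to cpal, so the 44100 Hz in the log may still be accurate for that device.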
@twilwa what is happening exactly? no transcriptions or? if you play something in audio output do you hear it when listening to the .mp4 saved to disk?
it's supposed to work with 96 kHz, 48 kHz, 44.1 kHz, etc.
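To make the rate point concrete, here is the arithmetic behind a "resampling from 44100 Hz to 16000 Hz" log line. This is only a length calculation under the assumption of plain rate conversion, not screenpipe's actual resampler, and `resampled_len` is a hypothetical helper.

```rust
/// Number of output samples produced when `input_len` samples at `from_hz`
/// are converted to `to_hz`, rounded up so a partial trailing sample is kept.
pub fn resampled_len(input_len: usize, from_hz: u32, to_hz: u32) -> usize {
    assert!(from_hz > 0, "source rate must be non-zero");
    // Widen to u64 so long recordings at high rates do not overflow on 32-bit targets.
    let numerator = input_len as u64 * to_hz as u64;
    let denominator = from_hz as u64;
    ((numerator + denominator - 1) / denominator) as usize
}
```

For a 30-second chunk, 44100 Hz, 48000 Hz and 96000 Hz inputs all come out at 480000 samples at 16000 Hz, so the source rate by itself should not change what the transcriber receives, as long as the captured stream is the right one.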
It records my microphone input but never the loopback audio output. Will try the code change after this restart.
My obs just started segfaulting after I changed a few settings, but it's been acting up today. Planning to install JACK so I have something a little more robust to work with -- might resolve the screenpipe issues as well. That said I think most distros don't come with it, so pipewire/pulse would probably be best in terms of making sure they generally work on common distros (Ubuntu, Debian, etc.)
can you do
cargo build --release # can add --features mkl
./target/release/screenpipe --audio-device "pipewire (output)"
# or
./target/release/screenpipe --audio-device "jack (output)"
# and play some voice audio
Prefer that with or without the change described in https://github.com/mediar-ai/screenpipe/issues/450#issuecomment-2401147071 , or doesn't matter?
atm building without the code change, behavior is the same on pipewire (output) so far
with jack (after installing actual jack rather than pipewire-pulse emulating jack, which i believe is the default; cadence does pick up the audio levels, so i'm presuming the jack server is live):
2024-10-09T03:29:45.705238Z INFO screenpipe_audio::stt: device: jack (output), resampling from 48000 Hz to 16000 Hz
2024-10-09T03:29:45.707298Z INFO screenpipe_audio::stt: device: jack (output), total audio frames processed: 0, frames that include speech: 0, speech duration: 0ms, speech ratio: NaN, min required ratio: 0.02
2024-10-09T03:29:45.720278Z INFO screenpipe_audio::core: Recording jack (output) for 30 seconds
cannot connect system:capture_2 to alsa-jack.jackC.38170.11:in_001
2024-10-09T03:29:45.721176Z ERROR screenpipe_audio::core: Failed to build input stream: A backend-specific error has occurred: ALSA function 'snd_pcm_hw_params' failed with error 'I/O error (5)'
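The "speech ratio: NaN" in the log above is what a floating-point division gives when zero frames were processed (0.0 / 0.0). A sketch of that behavior and one way to guard it, assuming the ratio is computed as speech frames divided by total frames. The function names are hypothetical and this is not taken from screenpipe's source.

```rust
/// Unguarded ratio: yields NaN when both counts are zero.
pub fn speech_ratio(speech_frames: u64, total_frames: u64) -> f32 {
    speech_frames as f32 / total_frames as f32
}

/// Guarded ratio: `None` signals "no audio was captured at all",
/// which is a different condition from "audio captured, no speech".
pub fn speech_ratio_checked(speech_frames: u64, total_frames: u64) -> Option<f32> {
    if total_frames == 0 {
        None
    } else {
        Some(speech_frames as f32 / total_frames as f32)
    }
}

/// Threshold check as it would behave on the raw ratio.
/// Any comparison with NaN is false, so an empty capture silently fails it.
pub fn passes_threshold(ratio: f32, min_ratio: f32) -> bool {
    ratio >= min_ratio
}
```

Read this way, the NaN is a symptom rather than a cause: the stream never delivered frames, which matches the "Failed to build input stream" error right after it.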
As a side note, when doing some obs troubleshooting, running it with 'sudo' prevents the segfault. Could try the same with screenpipe.
Tested a little, no change in behavior (for jack, no audio files are output at all) but sudo does change the error message:
2024-10-09T03:36:56.814768Z ERROR screenpipe_server::core: Error in record_and_transcribe for device jack (output) (iteration 1): Audio device not found, stopping thread
when running with sudo -- the server-based audio devices don't show up, so I tried sysdefault:CARD=Generic (output), which doesn't error, but doesn't pick anything up.
After confirming with aplay -L that the device is the one associated with my current output audio:
2024-10-09T05:16:59.334042Z ERROR screenpipe_server::core: Error in record_and_transcribe for device iec958:CARD=Device,DEV=0 (output) (iteration 1): Audio device not found, stopping thread
the device does appear in the list-audio-devices list
let config = cpal_audio_device.default_input_config()
tested rebuilding the new UI / rebuilding the binary with this modification as requested here: https://github.com/mediar-ai/screenpipe/issues/450#issuecomment-2401147071, and played around for a while with my system audio packages, making sure everything was installed and updated, poked around a few audio forums. i will mention that audio does work very differently on linux than it does on mac or windows -- my setup (and ubuntu default, now) runs pipewire-pulse, which is a pipewire server that emulates pulseaudio as far as my understanding goes. Then there's JACK, which isn't a default, but is used in a lot of audio routing programs, which needs to play nice with ALSA (same goes for pulse/pipewire, if I'm not mistaken.)
In looking into it a bit, it seems like the default audio configuration for Ubuntu is pipewire-pulse with wireplumber as a config manager. Supporting this by default is likely the best option; JACK and ALSA could remain lower priority.
In running main.rs through o1-mini, something interesting popped up:
let config = cpal_audio_device.default_input_config();
(requested change in the comment -- did you mean to say output? I reverted after testing the build, in any case.)
Issue: This line retrieves the default input configuration regardless of whether the device type is Input or Output. If DeviceType::Output is selected, it should retrieve the default output configuration.
b. Stream Building in record_and_transcribe
The record_and_transcribe function builds an input stream regardless of the device type:
let stream = match config.sample_format() {
    cpal::SampleFormat::I8 => cpal_audio_device.build_input_stream(
        // ...
    ),
    // Other sample formats...
};
Issue: When DeviceType::Output is selected, the program should build an output stream instead of an input stream. Additionally, capturing output audio typically requires selecting a monitor device (e.g., "Monitor of
If we're building the wrong stream and not checking for monitor devices, this could explain why we're only able to capture microphone input and manually selecting hardware devices doesn't find anything -- although it seems odd to me if it's working normally on Mac or Windows (especially windows).
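On the monitor-device point: PulseAudio-compatible servers (pipewire-pulse included) conventionally expose a capture source per sink named `<sink name>.monitor`. A minimal sketch that derives that name from the default sink value in the pw-metadata output earlier in the thread. Both function names are hypothetical, and whether cpal's ALSA host can open a source by this name on a given system is an assumption the thread has not verified.

```rust
/// Pulls the `name` field out of a metadata value such as
/// `{"name":"alsa_output.usb-...analog-stereo"}`.
/// Deliberately a plain string scan to stay dependency-free; a real
/// implementation would likely use a JSON parser.
pub fn sink_name_from_json(value: &str) -> Option<&str> {
    let marker = "\"name\":\"";
    let from = value.find(marker)? + marker.len();
    let len = value[from..].find('"')?;
    Some(&value[from..from + len])
}

/// Conventional PulseAudio-style monitor source name for a sink.
pub fn monitor_source_for(sink_name: &str) -> String {
    format!("{sink_name}.monitor")
}
```

For the default sink reported above, this gives alsa_output.usb-0c76_USB_PnP_Audio_Device-00.analog-stereo.monitor. Running `pactl list short sources` should show whether a source with that name actually exists before wiring it into any capture code.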
Happy to continue tinkering, doubly so if you're up to put up a bounty if I can manage to get it up and running on my main PC + my Ubuntu alt? Never done rust before, but been meaning to learn for ages.
Was tinkering a bit with it again, today, curious if anyone else on Linux has found a workaround yet?
describe the bug:
Screenpipe will capture audio, but no matter the source (pipewire, jack, pulse, or any device option), the captured audio will always be microphone input rather than loopback device audio.
Secondarily, 'save and restart' in the Settings menu only closes the application and doesn't relaunch (possibly because built from source, but thought i'd mention)
to reproduce:
Enable any number of audio sources labeled (output) or (input) (tested by removing all input or output sources, same behavior).
expected behavior:
screenpipe creates two files, input and output; one records loopback audio and the other records the microphone.
system information:
additional context:
rtcqs output (i believe most of these are Pop!_OS defaults). I know Pop!_OS also uses pipewire-pulse as its primary audio driver. I looked into troubleshooting a bit and found this: https://www.reddit.com/r/pop_os/comments/13yjaky/improve_jackpipewire_performance_on_pop_os/
and ran rtcqs before implementing any changes:
My experience with audio on linux has been 'finicky at best', having attempted to rig whisper.cpp in WSL once or twice and I think barely-succeeded, but it's been a while since I've been on my windows partition. As such, let me know if there's any particular audio device information that would be helpful for setting up sane linux defaults. I can test this on my Ubuntu machine later as well if that's helpful.