mozilla / cubeb-pulse-rs

cubeb-pulse-rs delivers samples at a higher frequency than requested #93

Closed mutexlox-signal closed 2 months ago

mutexlox-signal commented 2 months ago

Using cubeb-rs with cubeb-pulse-rs as a backend, I have written some code (excerpted below) that requests a recording sample rate of 48 kHz with a latency of 480 frames (so the data callback should be invoked at 10 ms intervals, with 480 frames each time). The underlying library I'm using, WebRTC in Chromium, is inflexible about this and will trip debug assertions if an API call delivers anything other than exactly 480 samples (at a sample rate of 48,000 Hz).

On Windows and macOS, this more or less works fine with a bit of finagling (for example, on Windows, a latency of less than 22 ms is not well supported, so I have to request a higher latency initially and then batch), but cubeb-pulse-rs shows weirder behavior.

Specifically, each invocation of the data callback happens at roughly 5ms intervals, with roughly (but not exactly) half of the number of samples I would expect. In practice I see anywhere from 142 to 288 samples in one invocation.
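For reference, the arithmetic behind these numbers can be sketched as follows. At 48 kHz, 480 frames span 10 ms, so callbacks arriving roughly every 5 ms with about 240 frames each appear consistent with the same overall sample rate, just delivered in smaller pieces. `frames_to_millis` is a hypothetical helper name used only for illustration; it is not part of cubeb-rs.

```rust
/// Duration in milliseconds covered by `frames` at a sample rate of `rate_hz`.
/// Hypothetical helper for illustration only.
fn frames_to_millis(frames: u32, rate_hz: u32) -> f64 {
    f64::from(frames) * 1000.0 / f64::from(rate_hz)
}

fn main() {
    // The requested window: 480 frames at 48 kHz.
    println!("requested: {} ms", frames_to_millis(480, 48_000));
    // A typical callback observed with PulseAudio: about 240 frames.
    println!("typical:   {} ms", frames_to_millis(240, 48_000));
    // The smallest callback observed: 142 frames.
    println!("smallest:  {:.2} ms", frames_to_millis(142, 48_000));
}
```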

Is it expected that the latency requested will not always match the actual number of frames provided to the data callback / the latency at which the callback is invoked?

Code snippet:

const SAMPLE_FREQUENCY: u32 = 48_000;

// Windows hack: Need sample latency to be at least 22ms
#[cfg(target_os = "windows")]
const SAMPLE_LATENCY: u32 = SAMPLE_FREQUENCY / 10;
#[cfg(not(target_os = "windows"))]
const SAMPLE_LATENCY: u32 = SAMPLE_FREQUENCY / 100;

// WebRTC always expects to be provided 10ms of samples at a time.
const WEBRTC_WINDOW: usize = SAMPLE_FREQUENCY as usize / 100;

const STREAM_FORMAT: cubeb::SampleFormat = cubeb::SampleFormat::S16NE;
const NUM_CHANNELS: u32 = 1;

// ...

pub fn init_recording(&mut self) -> i32 {
        // ...
        let params = cubeb::StreamParamsBuilder::new()
            .format(STREAM_FORMAT)
            .rate(SAMPLE_FREQUENCY)
            .channels(NUM_CHANNELS)
            .layout(cubeb::ChannelLayout::MONO)
            .prefs(StreamPrefs::VOICE)
            .take();
        let mut builder = cubeb::StreamBuilder::<Frame>::new();
        let transport = Arc::clone(&self.audio_transport);
        builder
            .name("ringrtc input")
            .input(recording_device, &params)
            .latency(SAMPLE_LATENCY)
            .data_callback(move |input, _| {
                let binding = input.iter().map(|f| f.m).collect::<Vec<_>>();
                // Some devices deliver samples at a fraction of the expected frequency,
                // but without missing data. However, webrtc cannot handle data arriving in
                // chunks larger than 10 ms. So, re-chunk the data.
                let input_chunks = binding.chunks(WEBRTC_WINDOW);
                for chunk in input_chunks {
                    let mut chunk_vec = chunk.to_vec();
                    if chunk_vec.len() < WEBRTC_WINDOW {
                        // TEMPORARY HACK: Pad with silence.
                        error!("padding with silence; len was {}", chunk_vec.len());
                        chunk_vec.resize(WEBRTC_WINDOW, 0);
                    }
                    let (ret, _new_mic_level) = AudioDeviceModule::recorded_data_is_available(
                        Arc::clone(&transport),
                        chunk_vec,
                        NUM_CHANNELS,
                        SAMPLE_FREQUENCY,
                        // TODO(mutexlox): do we need different values here?
                        Duration::new(0, 0),
                        0,
                        0,
                        false,
                        None,
                    );
                    if ret < 0 {
                        error!("Failed to report recorded data: {}", ret);
                        return ret as isize;
                    }
                }
                input.len() as isize
            })
            .state_callback(|state| {
                warn!("recording state: {:?}", state);
            });
        // ...
    }
}

Log excerpt from the log line shown in that code snippet:

[2024-09-19 15:15:22.223 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 227
[2024-09-19 15:15:22.225 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 142
[2024-09-19 15:15:22.230 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.235 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.240 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.245 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.250 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.255 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.261 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.266 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.271 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.276 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.281 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.286 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.291 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.296 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.301 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.306 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.311 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.316 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.321 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.326 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.331 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.336 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 237
[2024-09-19 15:15:22.340 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.345 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 238
[2024-09-19 15:15:22.350 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 239
[2024-09-19 15:15:22.355 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.360 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.365 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.370 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.375 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.380 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.385 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240
[2024-09-19 15:15:22.389 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 144
[2024-09-19 15:15:22.394 ERROR Some("src/rust/src/webrtc/audio_device_module.rs"):Some(575)] padding with silence; len was 240

kinetiknz commented 2 months ago

This is expected; the latency parameter is pretty much only a hint. PulseAudio also tends to show the widest variance of the backends we support.

FWIW, it's good practice to use Context::min_latency() rather than hard code latency values.
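Since the callback size is only a hint, a caller that needs fixed 10 ms windows could buffer frames across callbacks instead of padding short ones with silence. The following is a minimal sketch under stated assumptions: mono `i16` samples, and a hypothetical `Rechunker` type with a `push` method, neither of which is part of cubeb-rs or the original code.

```rust
/// Accumulates sample slices of arbitrary length and hands out fixed-size windows.
/// `Rechunker` is a hypothetical helper, not part of cubeb-rs.
struct Rechunker {
    window: usize,
    pending: Vec<i16>,
}

impl Rechunker {
    fn new(window: usize) -> Self {
        assert!(window > 0, "window must be non-zero");
        Rechunker {
            window,
            // Reserve space up front to reduce reallocation in the callback.
            pending: Vec::with_capacity(window * 2),
        }
    }

    /// Appends `input`, then calls `emit` once per complete window.
    /// Leftover samples stay buffered until the next call.
    fn push<F: FnMut(&[i16])>(&mut self, input: &[i16], mut emit: F) {
        self.pending.extend_from_slice(input);
        let complete = self.pending.len() / self.window * self.window;
        for chunk in self.pending[..complete].chunks_exact(self.window) {
            emit(chunk);
        }
        self.pending.drain(..complete);
    }
}

fn main() {
    // 480 samples = 10 ms at 48 kHz, matching WEBRTC_WINDOW in the snippet above.
    let mut rechunker = Rechunker::new(480);
    let mut windows = 0;
    // Callback sizes taken from the first few log lines in the report.
    for len in [227usize, 142, 239, 237, 238] {
        let samples = vec![0i16; len];
        rechunker.push(&samples, |chunk| {
            assert_eq!(chunk.len(), 480);
            windows += 1;
        });
    }
    println!(
        "emitted {} windows, {} samples pending",
        windows,
        rechunker.pending.len()
    );
}
```

In the data callback, the `emit` closure would be the place to call `recorded_data_is_available` with each full window. The `Rechunker` would need to be owned by the `move` closure so that leftover samples persist between invocations.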

mutexlox-signal commented 1 month ago

Good to know, thank you!