RustAudio / cpal

Cross-platform audio I/O library in pure Rust
Apache License 2.0

What would it take to make WebAssembly audio inputs happen? #813

Open austintheriot opened 12 months ago

austintheriot commented 12 months ago

Hello! First off, thanks for your work on this library! cpal has been incredibly useful to me as a web/Rust dev.

Currently, there are // TODOs in place of all input-related code for the WebAudio host: https://github.com/RustAudio/cpal/blob/76b04065cf189c1e04a15816657d1108759c3665/src/host/webaudio/mod.rs#L449

I'm wondering: what would it take to make input available on the web? Fundamentally, there appears to be an API mismatch, in that cpal requires synchronous configuration, whereas the web requires async setup via getUserMedia(). Would it be possible to force a synchronous API here, where the input is basically just a plain buffer that you can write into via a media stream that is set up after the fact?
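To illustrate the idea, here is a minimal sketch (not cpal's API, just a hypothetical shape for the "plain buffer" approach): the synchronous side drains a shared buffer, and an async task spawned after getUserMedia() resolves pushes into it.

```rust
use std::sync::{Arc, Mutex};

// Shared sample buffer: the synchronous stream handle drains it, while an
// async task (spawned once getUserMedia() resolves) pushes into it.
#[derive(Clone, Default)]
struct InputBuffer(Arc<Mutex<Vec<f32>>>);

impl InputBuffer {
    // Synchronous side: take whatever samples have arrived so far.
    fn drain(&self) -> Vec<f32> {
        std::mem::take(&mut *self.0.lock().unwrap())
    }

    // Async side: append samples once the media stream is live.
    fn push(&self, samples: &[f32]) {
        self.0.lock().unwrap().extend_from_slice(samples);
    }
}

fn main() {
    let buf = InputBuffer::default();
    // On wasm this push would come from a wasm_bindgen_futures::spawn_local
    // task wiring getUserMedia() -> audio callback -> buf.push(..).
    buf.push(&[0.0, 0.5, -0.5]);
    assert_eq!(buf.drain(), vec![0.0, 0.5, -0.5]);
    assert!(buf.drain().is_empty()); // empty until the stream delivers more
}
```

The stream handle could be created immediately and simply yield silence (or nothing) until the async setup completes.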

I'm interested because I'm working on a cross-platform audio node library, and I'd like to support the web as a high-priority compilation target:

Code: https://github.com/austintheriot/resonix
Demo: https://austintheriot.github.io/resonix/

So far, I've been working with pre-recorded audio primarily, but I'd love to support microphone access directly.

Would it make more sense to create this type of API user-side rather than library-side? I suppose I could potentially write microphone data to a plain buffer and surface that as if it were an audio node, but I'm not sure if the web audio API gives raw audio data like that or not.

steckes commented 4 months ago

Also waiting for audio input :)

Tahinli commented 4 months ago

Waiting for audio input :/

oligamiq commented 1 month ago

This is code I created for my own use as a trial. I have confirmed that it works with egui, and I can retrieve the data as Vec<f32> using try_recv. I don't have time to implement it in cpal, but please use it as a reference. I haven't refined the details.

```rust
// MIT License
// Copyright (c) oligamiq 2024
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software to deal in the Software without restriction, subject to the
// following conditions:
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

use super::RawDataStreamLayer;
use egui::mutex::Mutex;
use std::sync::{Arc, OnceLock};
use wasm_bindgen::{prelude::Closure, JsCast as _};
use wasm_bindgen_futures::JsFuture;
use web_sys::BlobPropertyBag;

// https://zenn.dev/tetter/articles/web-realtime-audio-processing
// https://qiita.com/okaxaki/items/c807bdfe3e96d6ef7960

pub struct WebAudioStream(pub Arc<Mutex<Vec<f32>>>);

impl WebAudioStream {
    pub fn new() -> Self {
        Self(WEB_INPUT.get().unwrap().clone())
    }
}

impl RawDataStreamLayer for WebAudioStream {
    fn try_recv(&mut self) -> Option<Vec<f32>> {
        // samples are in the range -1.0 to 1.0
        let mut data = self.0.lock();
        if data.is_empty() {
            return None;
        }
        let data = data.drain(..).collect();
        Some(data)
    }

    fn sample_rate(&self) -> u32 {
        SAMPLE_RATE.get().unwrap().clone()
    }

    fn start(&mut self) {
        // Do nothing
    }
}

static WEB_INPUT: OnceLock<Arc<Mutex<Vec<f32>>>> = OnceLock::new();
static SAMPLE_RATE: OnceLock<u32> = OnceLock::new();

pub async fn init_on_web_struct() {
    let on_web = OnWebStruct::new().await;
    let vec = on_web.data.clone();
    WEB_INPUT.get_or_init(|| vec);
    let sample_rate = on_web.sample_rate.unwrap();
    SAMPLE_RATE.get_or_init(|| sample_rate as u32);
    core::mem::forget(on_web);
}

pub struct OnWebStruct {
    pub data: Arc<Mutex<Vec<f32>>>,
    sample_rate: Option<f32>,
    _audio_ctx: web_sys::AudioContext,
    _source: web_sys::MediaStreamAudioSourceNode,
    _media_devices: web_sys::MediaDevices,
    _stream: web_sys::MediaStream,
    _js_closure: Closure<dyn FnMut(wasm_bindgen::JsValue)>,
    _worklet_node: web_sys::AudioWorkletNode,
}

impl std::fmt::Debug for OnWebStruct {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "OnWebStruct")
    }
}

impl OnWebStruct {
    pub async fn new() -> Self {
        let audio_ctx = web_sys::AudioContext::new().unwrap();
        let sample_rate = audio_ctx.sample_rate();

        let media_devices = web_sys::window()
            .unwrap()
            .navigator()
            .media_devices()
            .unwrap();

        let constraints = web_sys::MediaStreamConstraints::new();
        let js_true = wasm_bindgen::JsValue::from(true);
        constraints.set_audio(&js_true);

        let stream = media_devices
            .get_user_media_with_constraints(&constraints)
            .unwrap();
        let stream = JsFuture::from(stream).await.unwrap();
        let stream = stream.dyn_into::<web_sys::MediaStream>().unwrap();

        let source = audio_ctx.create_media_stream_source(&stream).unwrap();

        // Explicitly call resume()
        JsFuture::from(audio_ctx.resume().unwrap()).await.unwrap();

        // process() receives Float32Array channel data; post the first
        // input's first channel's samples back to the main thread.
        // https://developer.mozilla.org/ja/docs/Web/API/AudioWorkletProcessor/process
        let processor_js_code = r#"
            class MyProcessor extends AudioWorkletProcessor {
                process(inputs, outputs, parameters) {
                    this.port.postMessage(Float32Array.from(inputs[0][0]));
                    return true;
                }
            }
            registerProcessor('my-processor', MyProcessor);
            console.log('MyProcessor is registered');
        "#;

        let blob_parts = js_sys::Array::new();
        blob_parts.push(&wasm_bindgen::JsValue::from_str(processor_js_code));

        let type_: BlobPropertyBag = BlobPropertyBag::new();
        type_.set_type("application/javascript");

        let blob = web_sys::Blob::new_with_str_sequence_and_options(&blob_parts, &type_).unwrap();
        let url = web_sys::Url::create_object_url_with_blob(&blob).unwrap();

        let processor = audio_ctx
            .audio_worklet()
            .expect("Failed to get audio worklet")
            .add_module(&url)
            .unwrap();
        JsFuture::from(processor).await.unwrap();
        web_sys::Url::revoke_object_url(&url).unwrap();

        let worklet_node = web_sys::AudioWorkletNode::new(&audio_ctx, "my-processor")
            .expect("Failed to create audio worklet node");

        source.connect_with_audio_node(&worklet_node).unwrap();

        let data = Arc::new(Mutex::new(Vec::new()));
        let data_clone = data.clone();

        // The worklet posts a Float32Array per render quantum.
        let js_closure = Closure::wrap(Box::new(move |msg: wasm_bindgen::JsValue| {
            let msg_event = msg.dyn_into::<web_sys::MessageEvent>().unwrap();
            let data = msg_event.data();
            let data: Vec<f32> = serde_wasm_bindgen::from_value(data).unwrap();
            let mut data_clone = data_clone.lock();
            data_clone.extend(data);
        }) as Box<dyn FnMut(wasm_bindgen::JsValue)>);
        let js_func = js_closure.as_ref().unchecked_ref();

        worklet_node
            .port()
            .expect("Failed to get port")
            .set_onmessage(Some(js_func));

        OnWebStruct {
            data,
            sample_rate: Some(sample_rate),
            _audio_ctx: audio_ctx,
            _source: source,
            _media_devices: media_devices,
            _stream: stream,
            _js_closure: js_closure,
            _worklet_node: worklet_node,
        }
    }
}

impl Drop for OnWebStruct {
    fn drop(&mut self) {
        let _ = self._audio_ctx.close();
    }
}
```

```toml
wasm-bindgen = "0.2.95"
serde-wasm-bindgen = "0.6"
wasm-bindgen-futures = "0.4"
web-sys = { version = "0.3.72", features = [
    "AudioContext",
    "MediaDevices",
    "MediaStreamConstraints",
    "MediaStream",
    "MediaStreamAudioSourceNode",
    "AudioWorklet",
    "AudioWorkletNode",
    "BaseAudioContext",
    "MessagePort",
    "MediaStreamAudioDestinationNode",
    "MessageEvent",
    "console",
    "Url",
    "BlobPropertyBag",
    "Blob",
] }
js-sys = { version = "0.3" }
```
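For anyone trying this, a minimal usage sketch (hypothetical app code, assuming the snippet above is compiled for wasm32 with the listed crates and that RawDataStreamLayer is in scope): init_on_web_struct() must complete before WebAudioStream::new() is called, because new() reads the WEB_INPUT OnceLock that init populates.

```rust
use wasm_bindgen_futures::spawn_local;

fn start_microphone() {
    // The permission prompt and worklet setup are async; run them on the
    // browser's task queue rather than blocking.
    spawn_local(async {
        init_on_web_struct().await; // populates WEB_INPUT and SAMPLE_RATE
    });
}

// Call from a UI/update loop (e.g. egui's update()). Note that
// WebAudioStream::new() panics if init has not finished yet.
fn poll_samples(stream: &mut WebAudioStream) {
    if let Some(samples) = stream.try_recv() {
        web_sys::console::log_1(
            &format!("{} samples @ {} Hz", samples.len(), stream.sample_rate()).into(),
        );
    }
}
```

The core::mem::forget(on_web) call is what keeps the AudioContext, nodes, and the onmessage closure alive for the lifetime of the page; dropping OnWebStruct would close the context and stop the capture.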