orottier / web-audio-api-rs

A Rust implementation of the Web Audio API, for use in non-browser contexts
https://docs.rs/web-audio-api/
MIT License
286 stars · 16 forks

Difficulty Specifying Media Types and Streaming in Rust MediaRecorder Implementation #506

Open Psionyc opened 4 months ago

Psionyc commented 4 months ago

Hello,

I'm transitioning from JavaScript to Rust for handling media streams, and I'm running into some challenges, particularly with the MediaRecorder implementation.

Background

In JavaScript, I was able to specify the codec and media type directly in the MediaRecorder constructor like this:

const mediaRecorder = new MediaRecorder(this.stream, { mimeType: 'audio/webm; codecs=opus' });

This made it clear what type of data was being sent out.

Issue in Rust

In Rust, however, the API does not seem to offer a way to specify the mime type directly when initializing the MediaRecorder. Here's how I'm currently setting it up:

let stream = get_user_media_sync(MediaStreamConstraints::AudioWithConstraints(constraints));
let media_recorder = MediaRecorder::new(&stream);

Without being able to specify this, it's hard to know what type of data is being sent out.

Attempt at Streaming Audio Data to a Frontend

To handle streaming, I replicated the setup I had in JavaScript, but I'm unsure if I'm handling the stream data correctly in Rust. Here is the Rust code snippet for streaming:

media_recorder.set_ondataavailable({
    let ws_stream = Arc::clone(&ws_stream_sync);
    move |e| {
        let recording_data = recording_data.clone();
        let ws_stream = ws_stream.clone();
        // recording_data.lock().unwrap().extend(e.blob.clone());
        let _ = ws_stream.lock().unwrap().send(Message::Binary(e.blob));
    }
});

For comparison, this is the JavaScript implementation that worked:

this.mediaRecorder.ondataavailable = event => {
    if (event.data.size > 0 && this.webSocket && this.webSocket.readyState === WebSocket.OPEN) {
        this.webSocket.send(event.data);
    }
};

Questions

  1. How can I specify the mime type for MediaRecorder in Rust similar to how it's done in JavaScript?
  2. Is there a better way to handle the data available event in Rust to ensure that the media data is streamed correctly to the frontend?

Any guidance or suggestions on how to handle these issues would be greatly appreciated!

Thank you.

orottier commented 4 months ago

Hey @Psionyc, great to hear you are trying things out!

Regarding the mime type, we only support WAV at the moment. I realize this may not be very clear because it is buried in the fine print of https://docs.rs/web-audio-api/latest/web_audio_api/media_recorder/struct.MediaRecorder.html#method.new. We aren't prioritizing new mime types right now, but we will get to them at a later stage. I'm open to PRs, though.
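Since the recorder currently emits WAV, one way to make the outgoing data type explicit on the receiving side is to sniff the RIFF/WAVE magic bytes of the first chunk. A minimal std-only sketch (the function name is mine, not part of this crate):

```rust
/// Returns true if `data` starts with a RIFF/WAVE header,
/// i.e. bytes 0..4 == "RIFF" and bytes 8..12 == "WAVE".
fn looks_like_wav(data: &[u8]) -> bool {
    data.len() >= 12 && &data[0..4] == b"RIFF" && &data[8..12] == b"WAVE"
}

fn main() {
    // First 12 bytes of a typical WAV file: "RIFF" + chunk size + "WAVE".
    assert!(looks_like_wav(b"RIFF\0\0\0\0WAVE"));
    // EBML magic (webm/mkv containers) is not RIFF/WAVE.
    assert!(!looks_like_wav(b"\x1A\x45\xDF\xA3"));
    println!("wav check ok");
}
```

Note that only the first blob from a recording carries the header; later chunks are raw continuation data, so the check is only meaningful on the first chunk of a stream.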

Regarding your 2nd question, your code looks reasonable, provided that ws_stream_sync is something like an Arc<Mutex<WebSocket>>. You haven't shown much code, though; what is the exact problem? Is the Rust code running server side, or is this supposed to be client-side WASM?
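One pattern worth considering if the websocket send ever blocks: hand each blob to a channel inside the callback, and do the actual network I/O on a dedicated thread, so the recorder's callback is never stalled by the network. This is only a sketch of the decoupling idea using std::sync::mpsc, with the real ws_stream.send replaced by a byte counter:

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns a thread that drains blobs from a channel and "sends" them
/// (a placeholder for ws_stream.send). Returns the channel sender and a
/// join handle that yields the total number of bytes forwarded.
fn spawn_blob_forwarder() -> (mpsc::Sender<Vec<u8>>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let handle = thread::spawn(move || {
        let mut total = 0;
        for blob in rx {
            // In the real code this would be:
            // let _ = ws_stream.send(Message::Binary(blob));
            total += blob.len();
        }
        total
    });
    (tx, handle)
}

fn main() {
    let (tx, handle) = spawn_blob_forwarder();

    // Inside set_ondataavailable you would only clone `tx` and call
    // tx.send(e.blob) — a cheap, non-blocking operation.
    tx.send(vec![0u8; 1024]).unwrap();
    tx.send(vec![0u8; 512]).unwrap();
    drop(tx); // dropping the last sender closes the channel

    assert_eq!(handle.join().unwrap(), 1536);
    println!("forwarded 1536 bytes");
}
```

This also removes the need for the Arc<Mutex<...>> around the websocket, since only the forwarder thread ever touches it.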

orottier commented 4 months ago

Also, you will likely run into the following bug: https://github.com/orottier/web-audio-api-rs/issues/404; have a look, because a workaround is provided there. This library aims to replicate the Web Audio API. As you noticed, we also provide shims of the MediaDevices API and the MediaRecorder API, but they are not meant to be inter-linked directly.

Psionyc commented 4 months ago

@orottier, I appreciate your feedback. For my tests, I used the webm format because it's commonly used in web environments. I'm exploring a streaming service that sends data over WebSockets. Unfortunately, I ran into the bug from #404, which resulted in an endless recording with poor audio quality. Based on your suggestions, I'll make adjustments and report back with the results and my code. Thank you!

Psionyc commented 4 months ago

Below is the code in its entirety, if you want to go through it 🤔

use std::sync::{Arc, Mutex};

use serde::{Deserialize, Serialize};

use tungstenite::{connect, Message};

use tauri::{Manager, Runtime};
use web_audio_api::{
    context::AudioContext,
    media_devices::{
        get_user_media_sync, MediaDeviceInfo, MediaDeviceInfoKind, MediaStreamConstraints,
        MediaTrackConstraints,
    },
    media_recorder::MediaRecorder,
};

pub struct AudioRecorder {
    is_recording: bool,
    is_streaming: bool,
    recording_data: Arc<Mutex<Vec<u8>>>,
    media_recorder: Arc<Mutex<Option<MediaRecorder>>>,
}

#[derive(Serialize, Deserialize)]
pub struct Device {
    name: String,
    id: String,
}

#[derive(Serialize, Deserialize)]
pub struct GetAudioDevices {
    input_devices: Vec<Device>,
    output_devices: Vec<Device>,
}

#[derive(Serialize, Deserialize, Clone)]
pub struct FinishedRecordingPayload {
    blob: Vec<u8>,
}

impl AudioRecorder {
    pub fn new() -> Self {
        Self {
            is_recording: false,
            is_streaming: false,
            recording_data: Arc::new(Mutex::new(Vec::new())),
            media_recorder: Arc::new(Mutex::new(None)),
        }
    }
    pub async fn start_streaming<R: Runtime>(
        &mut self,
        id: String,
        app: tauri::AppHandle<R>,
    ) -> Result<(), String> {
        self.is_recording = true;
        self.is_streaming = true;

        let audio_devices = web_audio_api::media_devices::enumerate_devices_sync();

        let mut audio_device: Option<MediaDeviceInfo> = None;

        println!("Beginning connection");

        let (ws_stream, _) =
            connect("ws://127.0.0.1:3000/test").expect("Failed to connect to websocket");

        let ws_stream_sync = Arc::new(Mutex::new(ws_stream));

        println!("Connected successfully");

        for device in audio_devices {
            if device.label() == id || device.device_id() == id {
                audio_device = Some(device);
                break;
            }
        }

        let mut constraints = MediaTrackConstraints::default();

        if let Some(device) = audio_device {
            constraints.device_id = Some(device.device_id().to_string());
        }

        let stream = get_user_media_sync(MediaStreamConstraints::AudioWithConstraints(constraints));

        let media_recorder = MediaRecorder::new(&stream);

        let _audio_ctx = AudioContext::default();

        // let gain = audio_ctx.create_gain();

        let recording_data = Arc::clone(&self.recording_data);

        media_recorder.set_ondataavailable({
            let ws_stream = Arc::clone(&ws_stream_sync);
            // The closure already owns its clones of the Arcs, so no
            // per-invocation clone is needed.
            move |e| {
                // recording_data.lock().unwrap().extend(e.blob.clone());
                let _ = ws_stream.lock().unwrap().send(Message::Binary(e.blob));
            }
        });

        println!("Data_available set");

        // let _recording_data = Arc::clone(&self.recording_data);

        // let ws_stream = Arc::clone(&ws_stream);

        media_recorder.set_onstop(move |_| {
            // let _ = app.emit_all(
            //     "finished-recording",
            //     FinishedRecordingPayload {
            //         blob: vec![1, 2, 3],
            //     },
            // );

            // ws_stream_sync is moved into this closure; no extra clone needed.
            let _ = ws_stream_sync.lock().unwrap().close(None);
            println!("Connection to server closed");
        });

        println!("Stop set");

        media_recorder.start();

        println!("Started media recorder");

        self.media_recorder.lock().unwrap().replace(media_recorder);

        Ok(())
    }

    pub fn get_audio_devices(&mut self) -> String {
        let audio_devices = web_audio_api::media_devices::enumerate_devices_sync();
        let mut output_audio_devices: Vec<Device> = Vec::new();
        let mut input_audio_devices: Vec<Device> = Vec::new();

        for audio_device in audio_devices {
            match audio_device.kind() {
                MediaDeviceInfoKind::AudioInput => input_audio_devices.push(Device {
                    name: audio_device.label().to_string(),
                    id: audio_device.device_id().to_string(),
                }),
                MediaDeviceInfoKind::AudioOutput => output_audio_devices.push(Device {
                    name: audio_device.label().to_string(),
                    id: audio_device.device_id().to_string(),
                }),
                _ => (),
            }
        }

        serde_json::to_string(&GetAudioDevices {
            input_devices: input_audio_devices,
            output_devices: output_audio_devices,
        })
        .expect("Failed to get devices")
    }

    pub fn stop_recording(&mut self) -> Result<(), String> {
        if self.is_recording {
            if let Some(media_recorder) = self.media_recorder.lock().unwrap().take() {
                media_recorder.stop();
            }
            // Reset the flag so a new recording can be started later.
            self.is_recording = false;
        }

        Ok(())
    }
}
orottier commented 4 months ago

The code looks fine, and if it works, it works!

I hope you can get away with the WAV samples for now. I will open an issue to support MediaRecorder options (mime type, codec, etc.) as described at https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder/MediaRecorder#options
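For reference, such an options API might look something like the following. This is purely a hypothetical sketch mirroring the option names from the MDN MediaRecorder constructor; neither the struct nor the constructor shown in the comment exists in web-audio-api-rs today.

```rust
/// Hypothetical sketch only — this type does NOT exist in web-audio-api-rs.
/// Field names mirror the options of the MDN MediaRecorder constructor.
#[derive(Debug, Clone, Default)]
struct MediaRecorderOptions {
    mime_type: Option<String>,          // e.g. "audio/webm; codecs=opus"
    audio_bits_per_second: Option<u32>, // target audio bitrate
    bits_per_second: Option<u32>,       // overall target bitrate
}

fn main() {
    let opts = MediaRecorderOptions {
        mime_type: Some("audio/webm; codecs=opus".to_string()),
        ..Default::default()
    };
    // A future constructor might accept these, e.g. (hypothetical):
    // let recorder = MediaRecorder::new_with_options(&stream, opts);
    assert_eq!(opts.mime_type.as_deref(), Some("audio/webm; codecs=opus"));
    println!("options sketch ok");
}
```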