
Add support for real time audio streaming #5422

Open harudagondi opened 2 years ago

harudagondi commented 2 years ago

What problem does this solve or what need does it fill?

During the development of my plugin (bevy_fundsp, also shameless self-promotion lol), I found that bevy_audio (and bevy_kira_audio, for that matter) is rather limited.

To play sounds, one must create an AudioSource, which stores the bytes of the sound data, then play it in a system using Res<Audio>.

This isn't feasible when using DSP (digital signal processing) libraries such as fundsp. DSP graphs can be played forever (see example here), so it is impossible to convert them into AudioSources. A workaround is to pick a definite length, render the graph into wave data bytes, and then convert those into an AudioSource.

This is very hacky, and it does not exploit the full capabilities of DSP libraries, especially fundsp.
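
To illustrate, here is a minimal sketch of that workaround. It assumes the hound crate for WAV encoding; the function name is hypothetical, and AudioSource's bytes field holds the encoded file as an Arc<[u8]>:

```rust
use std::io::Cursor;

use bevy::audio::AudioSource;

/// Render a fixed number of seconds from an (otherwise infinite) sample
/// iterator into WAV bytes, then wrap them in an AudioSource.
/// `graph` stands in for any DSP graph driven as an iterator of f32 samples.
fn render_to_audio_source(
    mut graph: impl Iterator<Item = f32>,
    sample_rate: u32,
    seconds: u32,
) -> AudioSource {
    let spec = hound::WavSpec {
        channels: 1,
        sample_rate,
        bits_per_sample: 32,
        sample_format: hound::SampleFormat::Float,
    };

    // Encode the rendered samples as an in-memory WAV file.
    let mut bytes = Cursor::new(Vec::new());
    let mut writer = hound::WavWriter::new(&mut bytes, spec).unwrap();
    for _ in 0..(sample_rate * seconds) {
        writer.write_sample(graph.next().unwrap_or(0.0)).unwrap();
    }
    writer.finalize().unwrap();

    AudioSource {
        bytes: bytes.into_inner().into(),
    }
}
```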

What solution would you like?

I don't know what the exact implementation would look like, but I would like:

Using this solution would probably need access to cpal directly.

What alternative(s) have you considered?

Bypass bevy_audio and directly use cpal. This is bad, because audio programming is very hard, and it is better for Bevy to provide its own solution.

harudagondi commented 2 years ago

It turns out this is possible using oddio. Specifically, a StreamingAudioSource can be implemented on top of oddio::Signal. Currently, there is no Bevy plugin that integrates oddio into Bevy.

x-52 commented 2 years ago

Res<Audio> can actually play anything that implements the Decodable trait, which is essentially a thin wrapper around the rodio::source::Source trait. Source extends Iterator (so Iterator is its supertrait), but it also lets you specify important metadata like the sample rate and the number of channels.
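
For illustration, here is a minimal sketch of what that looks like for an infinite, procedurally generated source. The type names are made up, and the exact trait bounds and module paths differ between Bevy versions:

```rust
use std::time::Duration;

use bevy::audio::Decodable;
use rodio::Source;

/// An endless sine wave, produced purely by iteration.
#[derive(Clone)]
struct SineDecoder {
    sample_rate: u32,
    frequency: f32,
    tick: u32,
}

impl Iterator for SineDecoder {
    type Item = f32;

    fn next(&mut self) -> Option<f32> {
        let t = self.tick as f32 / self.sample_rate as f32;
        self.tick = self.tick.wrapping_add(1);
        Some((t * self.frequency * std::f32::consts::TAU).sin())
    }
}

// This is where the metadata mentioned above lives.
impl Source for SineDecoder {
    fn current_frame_len(&self) -> Option<usize> {
        None // no fixed frame length
    }
    fn channels(&self) -> u16 {
        1 // mono
    }
    fn sample_rate(&self) -> u32 {
        self.sample_rate
    }
    fn total_duration(&self) -> Option<Duration> {
        None // the stream never ends
    }
}

/// The type you would hand to bevy_audio; every play creates a fresh decoder.
struct Sine {
    frequency: f32,
}

impl Decodable for Sine {
    type DecoderItem = f32;
    type Decoder = SineDecoder;

    fn decoder(&self) -> Self::Decoder {
        SineDecoder {
            sample_rate: 44_100,
            frequency: self.frequency,
            tick: 0,
        }
    }
}
```

Actually playing such a type through Res<Audio> also requires it to be an Asset, which is part of what the next comment pushes back on.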

harudagondi commented 2 years ago

I have several problems with Decodable:

1) Decodable is a leaky abstraction. I have to import rodio myself to get rodio::source::Source, and if I have a graph that requires more than one channel, I'd have to import cpal to implement cpal::Sample. I think people shouldn't have to fiddle with bevy_audio's inner workings just to integrate a synthesizer.
2) Decodable::Decoder requires Sync. In my use case, DSP graphs are Send, not Sync. I don't think a StreamingAudioSource should be shared across threads, as its main purpose is iteration (which requires either T or &mut T).
3) To play using Res<Audio>, Asset must also be implemented (which also requires Sync). Also, I personally don't think of DSP graphs as data, but rather as an algorithm. I interpret assets as immutable data that is loaded asynchronously and can be changed on the fly by detecting changes in the filesystem (please correct me if I'm wrong). However, because I'm proposing an iterator of samples, it is inherently mutable and thus cannot be Sync.

However, I have been trying to implement Decodable for my types right now (P.S. it is very painful), so I'll get back to you once I've successfully implemented it.

x-52 commented 2 years ago

I interpret assets as immutable data that is loaded asynchronously and can be changed on the fly by detecting changes in the filesystem (please correct me if I'm wrong).

Assets can be loaded without interacting with the filesystem (just call Assets::add()), and you can get mutable access to an Asset through Assets::get_mut().
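
For example, something like this hypothetical system (the exact Assets method signatures vary a bit across Bevy versions):

```rust
use bevy::audio::AudioSource;
use bevy::prelude::*;

/// Hypothetical system: create an asset at runtime with no filesystem
/// involved, then mutate it through its handle.
fn tweak_audio(mut sources: ResMut<Assets<AudioSource>>) {
    // Added directly, never loaded from disk.
    let handle: Handle<AudioSource> = sources.add(AudioSource {
        bytes: Vec::new().into(),
    });

    // Mutable access through the handle.
    if let Some(source) = sources.get_mut(&handle) {
        // `source` is `&mut AudioSource` here.
        let _ = source;
    }
}
```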

Decodable::Decoder requires Sync

Try wrapping your Source in an Arc<RwLock>. Implement Deref and DerefMut, and you'll have a Decoder that supports Sync.

I have to import rodio myself

I'd have to import cpal myself

That is pretty inconvenient. However, there's no point in reinventing the wheel when rodio and cpal already exist. One possible solution to the problems you've described would be re-exporting the relevant parts of both crates (like Source and Sample) and making it easier for people to implement Decodable and Asset.

I'll create a PR soon to add those things, but until then, keep working on implementing Decodable. It'll take a while until it gets merged.

harudagondi commented 2 years ago

Try wrapping your Source in an Arc<RwLock>. Implement Deref and DerefMut, and you'll have a Decoder that supports Sync.

Wrapping the Source in a RwLock or a Mutex is not feasible. Locking the mutex or the RwLock blocks the thread, which in audio programming is a no-no.

Here are the limitations in audio programming, according to this article:

1) Audio processing must be done in a separate thread.
2) The audio thread must NOT be blocked, or else you'll get underruns. (Mutex and RwLock block the thread when locking.)
3) Memory must not be allocated or deallocated on the audio thread. (I'm not a systems programming expert, but in my use case I'm using trait objects, so they are definitely stored on the heap.)

AudioSource is fine because for (2), it stores Arc<[u8]>, no mutexes or whatever, and for (3), AudioSource is essentially static, so it does not change in memory.

x-52 commented 2 years ago

Sorry for the delay! I haven't been able to use my main computer (which is like 20 times faster than the one I'm using right now) for a while, so I couldn't work on implementing that PR. I should have access to it tomorrow, so I can finally work on it then.

By the way, here's an interesting idea: What about using std::sync::mpsc to implement Sync? It would be complicated, but it's literally meant for sending data between threads without blocking - exactly what you wanted!

harudagondi commented 2 years ago

Sorry for the delay!

I don't mind. I've been implementing bevy_oddio in the meantime, which should cover my use case (hopefully).

std::sync::mpsc

mpsc::Sender is !Sync.

mpsc::SyncSender is Sync, but it blocks.

The correct thing to use is a single-producer, single-consumer lock-free ring buffer (a rough sketch is at the end of this comment). However, I am currently adding support for bevy_oddio, and one of its traits needs to take &self. I checked the source code, and a lot of its signals use RefCell since it doesn't block. This does mean that signals are Send but not Sync.

This leads to the problem that Asset cannot be implemented for streaming audio sources: oddio uses Arc internally, so it cannot hand out &mut self by default. Frankly, I don't know exactly why it had to be that way, as kira uses &mut self in Sound. (Note that bevy_kira_audio has not yet resolved NiklasEi/bevy_kira_audio#63.)

I am curious how bevy_kira_audio would resolve this, as I am hammering out ideas on how to integrate bevy_oddio into bevy_fundsp.
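
For reference, here is a rough sketch of the lock-free SPSC ring buffer pattern mentioned above. It assumes the ringbuf crate (roughly its 0.3-series API; type and method names differ between versions):

```rust
use ringbuf::HeapRb;

fn main() {
    // Single-producer, single-consumer ring buffer: both ends are lock-free,
    // so neither side can ever block the other.
    let (mut producer, mut consumer) = HeapRb::<f32>::new(4096).split();

    // Game/DSP side: push whatever samples are ready. If the buffer is full,
    // push returns the sample instead of blocking.
    std::thread::spawn(move || loop {
        let _ = producer.push(0.0);
    });

    // Audio side: pop without blocking, substituting silence on underrun
    // instead of waiting.
    loop {
        let _sample = consumer.pop().unwrap_or(0.0);
    }
}
```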

x-52 commented 2 years ago

Have you looked at the external_source_external_thread example? In that example, you do the processing in a separate thread and use crossbeam_channel to send the data to Bevy. As an added bonus, it's Send, Sync, and non-blocking (on both sides)!
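
Not that example's actual code, but the shape of the pattern is roughly the following. The type and function names here are made up; only the crossbeam_channel and rodio calls are real:

```rust
use std::time::Duration;

use crossbeam_channel::{bounded, Receiver};
use rodio::Source;

/// A Source that pulls samples pushed by another thread.
struct ChannelSource {
    receiver: Receiver<f32>,
    sample_rate: u32,
}

impl Iterator for ChannelSource {
    type Item = f32;

    fn next(&mut self) -> Option<f32> {
        // Never block on the audio side: play silence if the producer is behind.
        Some(self.receiver.try_recv().unwrap_or(0.0))
    }
}

impl Source for ChannelSource {
    fn current_frame_len(&self) -> Option<usize> {
        None
    }
    fn channels(&self) -> u16 {
        1
    }
    fn sample_rate(&self) -> u32 {
        self.sample_rate
    }
    fn total_duration(&self) -> Option<Duration> {
        None
    }
}

/// Run the DSP work on its own thread and hand back the receiving end.
fn spawn_producer(sample_rate: u32) -> ChannelSource {
    // A bounded channel keeps memory use fixed; try_send drops the sample
    // instead of blocking when the consumer falls behind.
    let (sender, receiver) = bounded::<f32>(sample_rate as usize);

    std::thread::spawn(move || {
        let mut phase = 0.0_f32;
        loop {
            phase = (phase + 440.0 / sample_rate as f32).fract();
            let sample = (phase * std::f32::consts::TAU).sin();
            if sender.try_send(sample).is_err() {
                // Buffer full: back off briefly on the non-audio side.
                std::thread::sleep(Duration::from_millis(1));
            }
        }
    });

    ChannelSource {
        receiver,
        sample_rate,
    }
}
```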

yjpark commented 8 months ago

This might be useful for someone: I've been updating my Bevy app to use audio streaming based on the Decodable example (previously I used an old fork of bevy_kira_audio, since the streaming feature was dropped in later versions). I am using a ring buffer for communication, which is actually tricky to use with Decodable, so I ended up using some unsafe code to bypass the restriction.

The use case is using fluidlite to produce MIDI audio. The logic is a bit hacky, especially around the sample rate (it is just hard-coded to 44100, with the buffer size used to control timing), but it works fine for now.