Open ivanstepanovftw opened 1 month ago
You almost certainly want to preload small sound effects into the asset storage, rather than loading them on demand as you did here. That's likely to be the major source of latency here (and something our examples should more clearly teach).
That said, once that's done, can you try this with bevy_kira_audio and let us know how it compares? We're considering swapping backends, and improvements here would be
Thank you for the rapid response :)
I tried preloading the audio into a resource:
use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, signal)
        .run();
}

#[derive(Resource)]
struct SFX {
    collision_sound: Handle<AudioSource>,
}

fn setup(
    mut commands: Commands,
    asset_server: Res<AssetServer>,
) {
    commands.insert_resource(SFX {
        collision_sound: asset_server.load("sounds/breakout_collision.ogg"),
    });
}

fn signal(
    mut commands: Commands,
    keyboard_input: Res<ButtonInput<KeyCode>>,
    mouse_button_input: Res<ButtonInput<MouseButton>>,
    sfx: Res<SFX>,
) {
    if keyboard_input.just_pressed(KeyCode::Space)
        || mouse_button_input.just_released(MouseButton::Left)
    {
        commands.spawn((
            AudioBundle {
                source: sfx.collision_sound.clone(),
                settings: PlaybackSettings::DESPAWN
            },
        ));
    }
}
The cycle shown here is mouse press, then release, repeated for 2 clicks in VLC and 2 clicks in Bevy, so the left part is the 2 VLC clicks and the right part is the 2 Bevy clicks. The first Bevy click is selected, with a latency of $597-484=113$ ms:
So preloading the sound gave no improvement. Though I am not sure I've done it correctly (can I reference the sound instead?).
Will check Kira in 10 minutes.
Okay thanks, that's a much more realistic setup. The code you've supplied looks correct to me.
Very weird to see that this didn't help and to see it be delayed substantially more than a frame. I'll ask around (hi @inodentry?) to get opinions from people with more expertise.
Kira example:
bevy = { version = "0.13.2", features = ["dynamic_linking", "mp3", "wav"] }
bevy_kira_audio = "0.19.0"
use bevy_kira_audio::prelude::*;
use bevy::prelude::*;
use bevy_kira_audio::AudioSource;

fn main() {
    App::new()
        .add_plugins((DefaultPlugins, AudioPlugin))
        .add_systems(Startup, setup)
        .add_systems(Update, signal)
        .run();
}

#[derive(Resource)]
struct SFX {
    collision_sound: Handle<AudioSource>,
}

fn setup(
    mut commands: Commands,
    asset_server: Res<AssetServer>,
) {
    commands.insert_resource(SFX {
        collision_sound: asset_server.load("sounds/breakout_collision.ogg"),
    });
}

fn signal(
    keyboard_input: Res<ButtonInput<KeyCode>>,
    mouse_button_input: Res<ButtonInput<MouseButton>>,
    audio: Res<Audio>,
    sfx: Res<SFX>,
) {
    if keyboard_input.just_pressed(KeyCode::Space)
        || mouse_button_input.just_released(MouseButton::Left)
    {
        audio.play(sfx.collision_sound.clone());
    }
}
Had to roll back Bevy to 0.13.2. Latency: $117$ and $105$ ms.
@ivanstepanovftw thank you very much for investigating this. I'm personally out of immediately actionable ideas, but for the sake of posterity what OS are you on?
It is Fedora Linux 40 (Workstation Edition)
$ uname -a
Linux fedora 6.8.11-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Mon May 27 14:53:33 UTC 2024 x86_64 GNU/Linux
System latency is defined by 3 things: the input latency, the audio processing latency, and the output latency.
The input latency covers every part of the transport process, from the moment you physically close a switch on the mouse's integrated circuit to the moment the main thread forwards the play start event. Because you're testing both on the same system and with the same mouse (connected the same way) for fairness, the important factor for input latency is the time between the event being dispatched into Bevy and the right system reading it. VLC does not have a "game loop" and can dispatch events directly without having to wait. VLC is also highly optimized software by the sheer nature of its age and usage (and, I assume, number of contributions), so it would not surprise me if it had a custom routine to react faster by doing some extra computer wizardry, which could in theory explain the 4x increase in latency.
The audio processing latency is mainly defined by how much data the audio device asks the computer to process. Audio data is requested in chunks, each processed at once in a single callback, so the length of the buffer determines how frequently the audio processing callback gets called and, with that, the inherent latency of the system (worst-case scenario: you tell the audio engine to play right as it finishes processing a chunk of audio, and you have to wait the entire time until the next callback before you can hear the result of your change). This latency is directly determined by the buffer size and the sample rate. VLC could be using a smaller buffer size and a higher sample rate than Bevy, which could by itself explain the 4x difference.
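To make the buffer-size argument concrete, here is a minimal sketch of the arithmetic, assuming the worst-case wait is one full buffer period (buffer length in frames divided by the sample rate). The function name `buffer_period_ms` is made up for illustration and is not part of Bevy, rodio, or cpal.

```rust
/// Duration of one audio buffer in milliseconds.
///
/// `frames` is the buffer length in sample frames and `sample_rate_hz`
/// is the stream sample rate. Under the assumption above, this is also
/// the worst-case wait before a newly started sound reaches the device.
fn buffer_period_ms(frames: u32, sample_rate_hz: u32) -> f64 {
    f64::from(frames) * 1000.0 / f64::from(sample_rate_hz)
}
```

For example, at an assumed 44100 Hz, a 512-frame buffer lasts about 11.6 ms and a 4096-frame buffer about 92.9 ms, so an 8x larger buffer costs 8x the worst-case wait.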
The output latency is determined by the inherent latency of your audio interface (which is constant across both runs, so it shouldn't matter) and the latency of the native OS API you're hooking into. On Windows, for example, you have a choice of 3 (4 if you count ASIO) native APIs, all with various drawbacks and states of decay and, of course, all with different latencies; but not to worry, Linux is also kind of a mess, with 3 (4 if you want to distinguish JACK and PipeWire) active APIs that are all kept available so as not to break existing programs. This means that VLC could be choosing the right combination of API and OS settings, while Bevy takes whatever defaults it is given, which aren't necessarily the best; that could very well explain the 4x increase in latency.
And my guess as to what's happening here? It's all of the above. Bevy doesn't have any heuristics for choosing an audio stream configuration; it takes the default it is given and uses it as-is. This could be changed, either by exposing some way of letting users choose their own configuration, or by integrating heuristics that reduce output latency (it needs both, IMHO). It's also the case that Bevy runs its systems only once per graphical frame, so at 60 fps you'll have about 16 ms of worst-case latency just between receiving the event in the main thread and your system telling the audio engine to start playing. This, too, can be solved by implementing callback- or observer-based APIs to react to events as fast as possible instead.
All of this is assuming all audio data was ready to be used, and hardware setup was the same for both runs.
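As a rough illustration of how these software-side delays add up, the sketch below sums one frame interval (the longest an event can wait for a once-per-frame system) and one buffer period. All names are hypothetical, and the model deliberately ignores the OS audio API and hardware latency, which are not quantified in this thread.

```rust
/// Longest time an input event can wait before a once-per-frame
/// system reads it, in milliseconds.
fn frame_interval_ms(fps: f64) -> f64 {
    1000.0 / fps
}

/// Duration of one audio buffer in milliseconds.
fn buffer_period_ms(frames: u32, sample_rate_hz: u32) -> f64 {
    f64::from(frames) * 1000.0 / f64::from(sample_rate_hz)
}

/// Rough worst-case software-side delay: one frame plus one buffer period.
/// Assumption: OS API and device latency are excluded from this estimate.
fn worst_case_software_ms(fps: f64, frames: u32, sample_rate_hz: u32) -> f64 {
    frame_interval_ms(fps) + buffer_period_ms(frames, sample_rate_hz)
}
```

With the illustrative (not measured) values of 60 fps and a 1024-frame buffer at 48000 Hz, this gives roughly 16.7 ms + 21.3 ms = 38 ms, which shows that the buffer term can easily outweigh the frame term.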
Tested more game engines for audio latency:
PyGame: 21, 15, 20, 16 ms LatencyPygame.zip (CPU%: 100)
SDL2 (chunk size 4096): 79, 78, 74, 99, 86, 100 ms LatencySDL2.zip (CPU%: 100)
SDL2 (chunk size 512 (as in PyGame)): 25, 21, 17 ms LatencySDL2 (512).zip (CPU%: 100)
Unreal Engine 5 (Editor): 112, 115, 114 ms [3.8 GiB project...]
Unity: 50, 95, 99, 116, 101, 41, 113 ms LatencyUnity.zip
Unity (set Project Setting | Audio | DSP Buffer Size to Best Latency): 45, 45, 46 ms
Godot 4 (Audio | Device | Output Latency is 15, Editor): 59, 55 ms LatencyGodot.zip
Godot 4 (Audio | Device | Output Latency is 15, Linux\x11, no debug): 52, 45 ms See above
Godot 4 (Audio | Device | Output Latency is 1, Editor): 21, 35, 23, 34, 23, 22 ms See above
macroquad: 150, 149 ms https://github.com/not-fl3/macroquad/blob/858f1108002bd5b858d43d6a3b5111236203c1b6/examples/audio.rs
notan: 90, 100, 107, 84 ms https://github.com/Nazariglez/notan/blob/a6ca3afdd5877658fd3f4daa50afaf4ba4933f31/examples/audio_basic.rs
raylib: 48, 55, 54 ms https://github.com/raysan5/raylib/blob/dcf2f6a8e97911c90efce5722bd7f0c7cdc8601e/examples/audio/audio_sound_multi.c
And in games: osu!lazer: 55, 54 ms https://github.com/ppy/osu
Apps: VLC: 35 ms
Chrome: 75, 79, 81 ms https://music.youtube.com/
Bevy: 113 ms
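Since each entry above is only a handful of manual readings, a small helper for summarizing them can make comparisons easier. This is a minimal sketch with a hypothetical function name; the readings themselves still have to be measured by hand in Audacity.

```rust
/// Returns (min, mean, max) of a set of latency readings in milliseconds,
/// or `None` if no readings were given.
fn summarize(samples_ms: &[f64]) -> Option<(f64, f64, f64)> {
    if samples_ms.is_empty() {
        return None;
    }
    let min = samples_ms.iter().copied().fold(f64::INFINITY, f64::min);
    let max = samples_ms.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let mean = samples_ms.iter().sum::<f64>() / samples_ms.len() as f64;
    Some((min, mean, max))
}
```

Applied to the readings listed above, PyGame averages 18 ms, SDL2 with chunk size 512 averages 21 ms, and SDL2 with chunk size 4096 averages 86 ms.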
I'd be really curious to see those numbers for Unity/Unreal/Godot as well. This is an extremely informative investigation: I'd love to have a way to measure this in an automatable way.
I have added many more benchmarks to the previous message. I have discovered that PyGame has the lowest audio latency, even lower than the VLC baseline. Interesting!
PyGame uses SDL2 with a chunk size of 512 in Mix_OpenAudio.
I have got 14, 16 ms with chunk size 1.
Bevy + SDL2 Mixer = 38 ms, with chunk size 256. With chunk size 1 it is 19 ms.
use bevy::prelude::*;
use sdl2::mixer::{InitFlag, AUDIO_S16LSB, DEFAULT_CHANNELS};

pub const MIXER_CHUNKSIZE: i32 = 256;

fn main() {
    // Initialize SDL2 and SDL2_mixer
    let sdl_context = sdl2::init().unwrap();
    let _audio_subsystem = sdl_context.audio().unwrap();
    sdl2::mixer::open_audio(44100, AUDIO_S16LSB, DEFAULT_CHANNELS, MIXER_CHUNKSIZE).unwrap();
    sdl2::mixer::init(InitFlag::OGG).unwrap();
    sdl2::mixer::allocate_channels(2);

    // Load sound effect
    let sound = sdl2::mixer::Chunk::from_file("assets/sounds/breakout_collision.ogg").unwrap();

    App::new()
        .add_plugins(DefaultPlugins)
        .insert_non_send_resource(SdlAudio {
            sound
        })
        .add_systems(Update, signal)
        .run();
}

struct SdlAudio {
    sound: sdl2::mixer::Chunk,
}

fn signal(
    keyboard_input: Res<ButtonInput<KeyCode>>,
    mouse_button_input: Res<ButtonInput<MouseButton>>,
    sdl_audio: NonSend<SdlAudio>,
) {
    if keyboard_input.just_pressed(KeyCode::Space)
        || mouse_button_input.just_released(MouseButton::Left)
    {
        sdl2::mixer::Channel::all().play(&sdl_audio.sound, 0).unwrap();
    }
}
Okay, so this implies that the majority of our latency is coming from the Rust audio stack, not from any of Bevy's architecture choices, correct?
Not what I would have expected: thank you for measuring this.
I have tried to specify the buffer size manually, but unfortunately got 150 ms latency:
./crates/bevy_audio/src/audio_output.rs:
impl Default for AudioOutput {
    fn default() -> Self {
        let Some(default_device) = cpal::default_host().default_output_device() else {
            warn!("No default audio device found.");
            return Self {
                stream_handle: None,
            };
        };
        let default_config = default_device.default_output_config().unwrap();
        let default_config = SupportedStreamConfig::new(
            default_config.channels(),
            default_config.sample_rate(),
            cpal::SupportedBufferSize::Range {
                min: 1,
                max: 1,
            },
            default_config.sample_format()
        );
        let default_stream = OutputStream::try_from_device_config(&default_device, default_config);
        if let Ok((stream, stream_handle)) = default_stream {
            // We leak `OutputStream` to prevent the audio from stopping.
            std::mem::forget(stream);
            Self {
                stream_handle: Some(stream_handle),
            }
        } else {
            warn!("No audio device found.");
            return Self {
                stream_handle: None,
            };
        }
    }
}
Just tested Godot again, this time with Project Settings | Audio | Device | Output Latency set to 1, and got 21, 35, 23, 34, 23, 22 ms.
Bevy version
main aaccbe88aa0d591c9c741f690ab472785c7bac09
[Optional] Relevant system information
What you did
Basic audio playing on Space press or Mouse1 release:
Then looped desktop audio back to the microphone.
Pressed the record button in Audacity, then measured the latency in VLC from click to sound by selecting the range in Audacity:![image](https://github.com/bevyengine/bevy/assets/8203898/4adce3d0-2a41-4c0e-98f2-a007fe032322)
The latency can then be calculated from the selection view at the bottom of Audacity: $.498-.463=.035$ s.
Repeated for the Bevy example:![image](https://github.com/bevyengine/bevy/assets/8203898/a80e1322-8eb9-4fdf-9b6b-953d3a40ddd5)
$.639-.495=.144$ s.
What went wrong
Additional information
Tried the --release flag: no improvement.