godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Producing events (such as playing audio) at constant intervals is jittery. #32382

Closed aleksfadini closed 1 year ago

aleksfadini commented 4 years ago

Godot version: 3.1.1

OS/device including version: Arch Linux, Surface Pro 6

Issue description: Producing events (such as playing audio) at constant/equal intervals is jittery. For instance, it is impossible to make a "decent" metronome in Godot with just a few lines of code as one would expect.

Steps to reproduce: Just tell the engine to play a short audio file at a high rate (e.g., every 5 physics frames). Instead of hearing a constant pulse every 16*5 ms, you will hear a jittery one.

Minimal reproduction project: https://github.com/aleksfadini/godot-flawed-metronome

The code is just these 5 lines:

var counter = 0
func _physics_process(delta):
    counter += delta
    while counter >= 0.1:
        counter = 0
        $AudioStreamPlayer.play()

lawnjelly commented 4 years ago

To start with, your code introduces jitter itself. A slightly better version would be this:

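var counter = 0
func _physics_process(delta):
    counter += delta
    # Subtract the interval instead of resetting to 0, so the overshoot carries into the next interval.
    while counter >= 0.1:
        counter -= 0.1
        $AudioStreamPlayer.play()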

With your existing code, every time your counter goes past 0.1, it loses synchronization by ignoring the remaining time.
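To make the lost remainder concrete, here is a small engine-independent simulation in Python (not GDScript). The function `click_times` and the 140 bpm tempo are illustrative assumptions, not taken from the project. The tempo is chosen so that the interval is not a whole number of 60 Hz ticks, since otherwise the two variants coincide in exact arithmetic.

```python
from fractions import Fraction

def click_times(tick, interval, duration, carry_remainder):
    """Return the times at which a click fires under a fixed-rate tick."""
    counter = Fraction(0)
    now = Fraction(0)
    clicks = []
    while now < duration:
        now += tick
        counter += tick
        while counter >= interval:
            # Keep the overshoot (counter -= interval) or drop it (counter = 0).
            counter = counter - interval if carry_remainder else Fraction(0)
            clicks.append(now)
    return clicks

tick = Fraction(1, 60)        # one physics tick at 60 Hz
interval = Fraction(60, 140)  # hypothetical 140 bpm metronome
reset = click_times(tick, interval, 60, carry_remainder=False)
carry = click_times(tick, interval, 60, carry_remainder=True)
print(len(reset), len(carry))
```

Under these assumptions the resetting version produces 138 clicks in a minute instead of 140, while the subtracting version never strays more than one tick from the ideal beat.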

In addition, timing doesn't work quite the way you may think. At, e.g., 60 physics ticks per second, Godot will attempt to call physics on average 60 times per second. However, those ticks aren't guaranteed to occur at 1/60 second intervals.

Like nearly everything in a game engine, physics ticks aren't 'realtime'. This is a common misconception, thinking in human terms. A better word would be 'soft realtime'.

Game timing is really conceptually more like pre-rendering one of those Pixar movies, where a render farm might take 10 minutes to render each frame. Rather than doing the calculations in realtime, games roughly know when a frame is due to be shown, and try to do the calculations so that things appear in the right positions for the predicted time that the frame will be shown to the user. So think of them as a pre-rendered movie that tries to keep up with the frame rate.

Physics ticks in particular can be spaced widely apart, or bunched up together, as long as enough ticks have run that are 'due' for a particular frame. If you think about it there's no particular reason physics ticks need to run in realtime.

If a particular frame has a long duration for instance (due to a hard disk whirr or whatever), there will be no ticks during this gap, then when the next frame occurs, the physics will perform a lot of ticks to catch up, one immediately after the other.

girng commented 4 years ago

Why are you setting Engine.iterations_per_second=600?

girng commented 4 years ago

I'm looking at your example. I changed your code to:

extends Node2D
var counter = 0

func _physics_process(delta):
    counter += delta

    if counter > 0.1:
        $AudioStreamPlayer.play(0)
        counter = 0

I also can't reproduce the jitter from your previous code. I'm on Windows 10, 64-bit. Not arch linux though...

@lawnjelly I don't think those are issues in this case, cause the counter is increased by a delta. It's happening every 0.1s, regardless of frame-time

lawnjelly commented 4 years ago

In this case another problem is the 600 ticks per second as girng says. (I didn't download the project initially, just looked at your posted code).

The max number of physics ticks / frame is capped at 8, to prevent runaway physics. So you are probably getting random fluctuations due to this as well (you are losing time). If you change the tick rate to 60, you'll notice you'll get a high frequency of sounds.
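A back-of-the-envelope model of the cap, in Python rather than GDScript. It assumes a steady 60 fps and that ticks beyond the cap are simply dropped. This is a simplification for illustration, not Godot's actual main loop, and `simulated_seconds` is a hypothetical helper.

```python
def simulated_seconds(real_seconds, fps, ticks_per_second, max_ticks_per_frame):
    """Physics time that elapses when each frame may run only a capped number of ticks."""
    frames = real_seconds * fps
    ticks_wanted = ticks_per_second // fps            # ticks due per frame
    ticks_run = min(ticks_wanted, max_ticks_per_frame)
    return frames * ticks_run / ticks_per_second

print(simulated_seconds(10, 60, 600, 8))  # 600 ticks/s: 10 wanted per frame, 8 allowed
print(simulated_seconds(10, 60, 60, 8))   # 60 ticks/s: 1 wanted per frame, cap never hit
```

In this model the 600 ticks-per-second setting advances only 8 seconds of physics time per 10 real seconds, so a counter driven by delta would fire late; at 60 ticks per second no time is lost to the cap.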

girng commented 4 years ago

Eh, I take my previous statement back. You can actually tell the difference between each sound and the delay. It's so subtle and small though. I am wrong, this is reproducible.

edit: Going to try this with timers, _process, and another way. One second. @lawnjelly is right, I think this delay is from the audio buffers on play. I mean, you have to really listen for it, but your brain will catch the inconsistencies between sounds. It's so subtle, but it's there.

lawnjelly commented 4 years ago

You will also get jitter due to the audio buffers. In general this type of approach with high frequency sounds is likely to give less than perfect results.

girng commented 4 years ago

Also could be something with .wav? Going to try .ogg

edit: Here is the .ogg. The delay between sounds is much less and seems more consistent and less "jittery". But I mean, I don't think it's possible to have perfect timing when playing sound at such a high interval. However, objectively, the OP's issue is technically reproducible lol

lawnjelly commented 4 years ago

Although it is reproducible, I'm not sure any of this constitutes a 'bug', it's more a feature (that you will find in most game engines). Playing sounds in such a manner is compounding a cascade of different sources of jitter on top of each other. It would be more remarkable if the sounds didn't have significant jitter (something would have been really wrong)! :smile:

Jitter is something that needs to be actively dealt with right down to the innards of the rendering, you have to expect it will happen, and build strategies to deal with it.

In general I think if any of these is to be investigated further they need to be isolated to provide an example of something that could be considered a bug. In this issue there are several things to consider including:

1) Bug in original gdscript
2) Misunderstanding of non-realtime nature of physics ticks
3) Too high tick rate giving tick capping behaviour
4) Audio buffer delays (probably non constant)
5) Possible delays in playing compressed sounds

If the objective is to have regular game events, then it is better to investigate with non-audio methods. If you want to investigate audio, then I would suggest isolating that.

If the objective is to have closely timed audio (say a music app), then other approaches are far more appropriate, such as editing the raw audio.

Zylann commented 4 years ago

I've been thinking about that a few times, but would there be a way to "schedule" a sound to play at a specific time in the future (handled in the audio server since that's where samples get fed to the driver at 44100Hz, which is much more precise than physics or render frames) so that it actually starts playing exactly when it should? This way jitter in the scheduling code would almost never matter.

girng commented 4 years ago

> I've been thinking about that a few times, but would there be a way to "schedule" a sound to play at a specific time in the future (handled in the audio server since that's where samples get fed to the driver at 44100Hz, which is much more precise than physics or render frames) so that it actually starts playing exactly when it should? This way jitter in the scheduling code would almost never matter.

Probably. But then again, the use-case for a working metronome to play smoothly at 0.1s.. is something reduz will most likely have a hard time accepting IMO. This is also the first time I've seen an issue regarding interval playback for audio. The delay is unnoticeable to me past a 0.3s interval

lawnjelly commented 4 years ago

> I've been thinking about that a few times, but would there be a way to "schedule" a sound to play at a specific time in the future (handled in the audio server since that's where samples get fed to the driver at 44100Hz, which is much more precise than physics or render frames) so that it actually starts playing exactly when it should? This way jitter in the scheduling code would almost never matter.

Going a little off topic but I believe so, but it would depend on support for this in the Godot audio code. You could take account of latency from the audio buffers (which might be constant or variable with a tiled approach).

Realtime music apps have to deal with similar kinds of problems, trading off the latency against the danger of starving the audio player with realtime input from e.g. a midi keyboard, or scheduling notes in advance from a sequencer. But typically in games for sound effects you can put up with something being 'good enough', whereas in music, or when synchronizing a movie with audio, more care is usually taken.

With such a system you could schedule sound effects like a sword clang to happen at the exact moment within an animation. But whether most users would notice I don't know.

It reminds me of an old trick back in the 80s, music samplers (think drums) were unable to play lots of notes at once without a delay between each (so they came out like fast machine gun fire). One of the solutions was to add varying amounts of silence to the start of each sample, and have them play in advance, so they all hit the start of the actual sound simultaneously. :smiley:

bojidar-bg commented 4 years ago

Note that AudioStreamPlayer::play has an optional parameter, from_position. While you cannot give that parameter a negative value, maybe you could try giving it the leftover time, so that it plays from the right position? Also, moving the code to _process should help with getting the code to execute closer to "realtime". Finally, using AudioStreamGenerator to drive the audio should be better, though I am not sure if it would be possible to read the stream which has to be repeated.

lawnjelly commented 4 years ago

> Note that AudioStreamPlayer::play has an optional parameter, from_position. While you cannot give that parameter a negative value, maybe you could try giving it the leftover time, so that it plays from the right position?

There, you go, with that and an exact timer you could have a go at scheduling something reasonably accurately into the future with some silence tacked on the front. You'd still have to deal with audio buffer issues, but you could get around the vagaries of when the process or physics_process got called.

Zylann commented 4 years ago

@bojidar-bg given the name of that parameter, I thought that it was to start the sound from a different offset rather than just delaying it? (i.e., playing a clip from its 42nd sample rather than starting from the 0th).

bojidar-bg commented 4 years ago

@Zylann Reread. I said the same?

Zylann commented 4 years ago

@bojidar-bg sorry it's quite ambiguous... so let's say I have a 3 second sound with 3 clicks in it. I thought that if I set from_position to 1 second, my sound would only play 2 clicks. Is that correct? Because it could as well mean that the sound will be delayed by 1 second instead and play 3 clicks. The doc doesn't explain which timeline this from_position applies to.

aleksfadini commented 4 years ago

> Also could be something with .wav? Going to try .ogg
>
> edit: Here is the .ogg. The delay between sounds is much less and seems more consistent and less "jittery". But I mean, I don't think it's possible to have perfect timing when playing sound at such a high interval. However, objectively, the OP's issue is technically reproducible lol

It also gets progressively worse if the cpu is busy or frames go even slightly below 60.

aleksfadini commented 4 years ago

> > Note that AudioStreamPlayer::play has an optional parameter, from_position. While you cannot give that parameter a negative value, maybe you could try giving it the leftover time, so that it plays from the right position?
>
> There, you go, with that and an exact timer you could have a go at scheduling something reasonably accurately into the future with some silence tacked on the front. You'd still have to deal with audio buffer issues, but you could get around the vagaries of when the process or physics_process got called.

This should be marked as a workaround: add a buffer of silence to the start of each audio file (let's say a constant one second), then, before playing it, check the random offset of the processing frame, call it delay (in seconds), and use $Sound.play(1 - delay).

This is exactly what you suggested above.
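The arithmetic behind the silence-padding trick can be sketched in Python (engine-independent, not GDScript). The helper `start_offset` is hypothetical, and it reads `delay` as the time remaining until the beat, which is one interpretation of the description above. The returned value is what would be passed as `from_position`.

```python
def start_offset(pad, beat_time, now):
    """Seek position (s) into a clip that begins with `pad` seconds of silence."""
    lead = beat_time - now  # time left until the click should be heard
    if lead > pad:
        raise ValueError("too early: wait until the beat is within the padding")
    # Skip just enough silence that `lead` seconds of it remain before the click.
    # If the beat has already passed, clamp to `pad` and play the click at once.
    return min(max(pad - lead, 0.0), pad)

print(start_offset(1.0, 10.0, 9.25))  # 0.75 s before the beat: skip 0.25 s of silence
```

The frame callback can then fire anywhere inside the one-second window before the beat, and the skipped silence absorbs the variation. Audio buffer latency is not modelled here.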

However, it is a bit clunky and can get complicated when different pulses are supposed to overlap and meet at a certain time, because in that case even a small offset will be noticeable and perceived as a “non-simultaneous” event.

So I guess it would be nice if the engine took care of it, as it is a potential problem for games that include rhythm, depending on how the game mechanics work. Also, it is not the expected behavior of the engine. Another way to put it: if you set up a timer for 0.25 secs that plays a sound on each loop, it sounds very uneven whenever the FPS wobbles even a bit.

So a component of this is definitely more of a bug than a suggested feature in my opinion.

Edit: I wanted to thank everyone for jumping in and helping in this. I felt a bit alone and it is something that in a non rhythm game is not noticeable. Godot is great!!

aleksfadini commented 4 years ago

> Why are you setting Engine.iterations_per_second=600?

Yeah, sorry about that, it should not be in the minimum viable project to show the bug. I tried different things to see if it improved the jitter, and forgot to delete that line. As you can see, increasing the cycles doesn't help.

aleksfadini commented 4 years ago

> > Note that AudioStreamPlayer::play has an optional parameter, from_position. While you cannot give that parameter a negative value, maybe you could try giving it the leftover time, so that it plays from the right position?
>
> There, you go, with that and an exact timer you could have a go at scheduling something reasonably accurately into the future with some silence tacked on the front. You'd still have to deal with audio buffer issues, but you could get around the vagaries of when the process or physics_process got called.

By the way, here is a version with "buffers" of silence detected at each frame, yet it does not sound constant if one taps a moderately high bpm (like 200). Especially after a while, things drift and become irregular. Does anyone know why? https://github.com/aleksfadini/metrollnome

lawnjelly commented 4 years ago

The hack with the silence at the start seems to be working reasonably well, it sounds pretty in sync at least at the starting tempo.

I haven't really examined the logic in your source (and there seemed to be a double note sounding at low bpms on mine) but one possibility for the weirdness at higher BPMs is actually bugs in the godot audio player, I reported one here #22016. You could try the workaround I found in that issue and see if it helps, but clearly it isn't bug free. Also you might want to try the latest 3.2 alpha version, it is possible some of these bugs could have been fixed.

Generally though, for this kind of thing, ideally I would personally be aiming to write directly to an audio stream (primary buffer) in C++, that way you can get timing bang on. But it looks as though you are getting reasonably close with these methods playing through the Godot API, this might be good enough for your purposes.

What might be nice for apps of this type is a native ability to schedule a sound to be played in the future, i.e. allow the from_position parameter to be a negative value. It is possible to use the silence trick with drums, but if you were playing anything at variable sample rates (e.g. instrument notes at different pitches) it becomes more complex to calculate the delay. You could try adding this to godot proposals, as it may be quite simple to implement, however you would need broad support and it is quite a niche use (unless you could argue using it for game sound effects too).

aleksfadini commented 4 years ago

> I haven't really examined the logic in your source (and there seemed to be a double note sounding at low bpms on mine) but one possibility for the weirdness at higher BPMs is actually bugs in the godot audio player, I reported one here #22016. You could try the workaround I found in that issue and see if it helps, but clearly it isn't bug free. Also you might want to try the latest 3.2 alpha version, it is possible some of these bugs could have been fixed.

Things fall apart when I tap even only at 200 bpm. For some reason they sound fine at 120 bpm. I tried to use your workaround, but I get this error in the editor: Method StreamFromSType is not declared in the current class. I was trying to add your code in my main scene script (Node2d.gd), but I get the same error even when using it as an AudioStream script. Any ideas? I updated the github repository if you want to see where things are. My hope was to incorporate your code in my function "play_with_delay". It is also unclear to me if I can use "from position" in your work around.

lawnjelly commented 4 years ago

Ah yes it's a while since I wrote that, it contains some project specific stuff (the stype was the sound type), but the general gist is that instead of reusing an AudioStreamPlayer you create a new one each time you want to play a sound, attach it as a node to the scene graph, and make it so that when the sound has finished playing it destroys itself. I actually got the idea from a suggestion by Calinou in another forum. This is gross, but it worked around the bug that was causing glitches, and that same bug could conceivably cause timing abnormalities too. You may be able to help diagnose by using shorter sounds. Perhaps at the magic 200 bpm, you are getting more sounds overlapping.

That is just a wild hunch though, there could well be something else causing your issue at 200 bpm.

It may also simply be that there is a random variation in how long it takes the sound to start playing, due to audio tile buffering, and it is just more noticeable at higher BPMs. If you capture the waveform in Audacity at different BPMs you can measure exactly the kind of variation you are getting and see whether it changes. Depending on the design of the audio system and the layers it depends on across different platforms (I haven't looked), it may not be possible to get this kind of accuracy.

Calinou commented 4 years ago

You can implement @lawnjelly's suggestion by using this autoload: https://github.com/Calinou/escape-space/blob/master/autoload/sound.gd

aleksfadini commented 4 years ago

Thank you everyone. I did implement it in that way, instancing an audiostream and queuing it free. It did improve things ... a little. If you launch the project now, at 350BPM, you can clearly hear something "new": the jitteriness is clear but is periodic. You can see how it correlates to the cyclical "discrepancies" in the latency buffer I have created, in relation to how the frames fall across the timeline. I wonder if I could expand my code to include the latency of the audioserver through AudioServer.get_time_to_next_mix() and AudioServer.get_output_latency() in order to finally achieve a decent result.

For your convenience, this is my repo again:

https://github.com/aleksfadini/metrollnome

Thank you for chiming in, it would be nice to have a solid work around and perhaps we could upload a metronome example to the Godot's assets store if we manage to get it working as it should.

lawnjelly commented 4 years ago

> AudioServer.get_time_to_next_mix() and AudioServer.get_output_latency()

Ah, that sounds perfect for your use, new in 3.2 it looks like. :smile: get_time_to_next_mix() should give you the random fluctuation due to audio tiles I was mentioning, so if you can compensate for that, you should be able to get much better timing. https://docs.godotengine.org/en/latest/tutorials/audio/sync_with_audio.html

kerskuchen commented 4 years ago

A non-work-around solution requires the engine to support scheduled playback of audio samples like @Zylann suggested. Here is a link to the corresponding proposal including a non-trivial real use-case for this feature.

Edit: It seems that the company that made the Unity plugin set its showcase video on private. What it showed was a top down shooter game where the enemy animations and players shots (visual effects and audiosamples) were in sync to the beat of the music.

F-3r commented 4 years ago

I've faced this same issue and, investigating a little bit, I came to more or less the same conclusions (need a precise clock and ahead-of-time scheduling). It would be nice if this were implemented on the engine side. While investigating I found this paper written by Sam Aaron about the implementation of the Sonic Pi clock (a music livecoding environment). Perhaps it has some interesting ideas: https://www.cs.kent.ac.uk/people/staff/dao7/publ/farm14-aaron.pdf There's also "A Tale of Two Clocks", https://www.html5rocks.com/en/tutorials/audio/scheduling/, which doesn't talk about implementation, but usage. Hope it helps and we can have precise audio scheduling in Godot soon!

bluenote10 commented 4 years ago

Isn't it possible to avoid all these issues by writing a small wrapper around AudioStreamGenerator, which internally holds e.g. AudioStreamSample data with additional "time scheduling" information, and renders the raw samples exactly at the desired position into the audio output buffer? That is basically how real sequencers/synths work.
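The core of such a wrapper is block-wise mixing with sample-accurate start positions. Below is a minimal, engine-independent sketch in Python (not GDScript, and not using the AudioStreamGenerator API). The name `render_block` and its data layout are assumptions made for illustration.

```python
def render_block(block_start, block_size, events, click):
    """Mix every scheduled click that overlaps one output block.

    block_start: index of the block's first sample on the global timeline
    events:      sample indices at which a click should begin
    click:       list of float samples making up one click
    """
    out = [0.0] * block_size
    for start in events:
        # Overlap of [start, start + len(click)) with this block.
        lo = max(start, block_start)
        hi = min(start + len(click), block_start + block_size)
        for i in range(lo, hi):
            out[i - block_start] += click[i - start]
    return out

click = [1.0, 0.5]
events = [3, 1023]  # the second click straddles the block boundary
first = render_block(0, 1024, events, click)
second = render_block(1024, 1024, events, click)
```

Because each event is placed by sample index rather than by the time the callback happened to run, the start of a click is independent of frame timing, and a click that straddles two blocks is continued correctly in the next one.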

kerskuchen commented 4 years ago

@bluenote10 If I understand it correctly from the documentation you linked, this would indeed be a viable solution for scheduled playback of audio (and possibly solve the OP's problem?). But doesn't it also mean that one needs to re-implement things like AudioStreamPlayer2D if, for example, spatial audio is required?

Zylann commented 4 years ago

@kerskuchen if implemented with a stream, it is decoupled from spatialized playback, so I don't think you'd have to recode that.

kerskuchen commented 4 years ago

Ah @Zylann thanks for the clarification! So the AudioStreamPlayer2D could take an AudioStreamGenerator as its "input" stream because AudioStreamGenerator is itself an AudioStream. That's nice and convenient :)

benjarmstrong commented 3 years ago

I believe issue 38215 is part of the problem. TLDR: The mixer can never mix at intervals below 1024 frames, so assuming your sample rate is 44100 then 1024 / 44100 = ~0.02321. This means sounds can only be timed at multiples of approximately 0.02321 seconds, which is not enough for precise musical timing.
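To see what this granularity does to a steady pulse, here is a small Python sketch (engine-independent). It assumes, following the comment above, that a sound can only begin on the next 1024-frame mix boundary. The helper `onset_blocks` and the 200 bpm tempo are illustrative choices.

```python
import math

MIX_FRAMES = 1024    # mix block size quoted above
SAMPLE_RATE = 44100  # Hz, as assumed above

def onset_blocks(interval, count):
    """Mix-block index at which each of `count` evenly spaced sounds would start."""
    return [math.ceil(k * interval * SAMPLE_RATE / MIX_FRAMES)
            for k in range(1, count + 1)]

blocks = onset_blocks(0.3, 16)  # 200 bpm, i.e. one sound every 0.3 s
gaps = [b - a for a, b in zip(blocks, blocks[1:])]
print(gaps)
print(MIX_FRAMES / SAMPLE_RATE)  # length of one block in seconds
```

Under this model most gaps are 13 blocks (about 0.302 s) with an occasional 12-block gap (about 0.279 s), which is the kind of periodic unevenness described earlier in the thread.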

aleksfadini commented 3 years ago

> I believe issue 38215 is part of the problem. TLDR: The mixer can never mix at intervals below 1024 frames, so assuming your sample rate is 44100 then 1024 / 44100 = ~0.02321. This means sounds can only be timed at multiples of approximately 0.02321 seconds, which is not enough for precise musical timing.

Quite interesting. Not sure how this could be fixed.

benjarmstrong commented 3 years ago

> > I believe issue 38215 is part of the problem. TLDR: The mixer can never mix at intervals below 1024 frames, so assuming your sample rate is 44100 then 1024 / 44100 = ~0.02321. This means sounds can only be timed at multiples of approximately 0.02321 seconds, which is not enough for precise musical timing.
>
> Quite interesting. Not sure how this could be fixed.

Fixed in PR 38280. I have a version of 3.2 with this fix backported for a music game (among other audio improvements)

Torguen commented 3 years ago

> Ah @Zylann thanks for the clarification! So the AudioStreamPlayer2D could take an AudioStreamGenerator as its "input" stream because AudioStreamGenerator is itself an AudioStream. That's nice and convenient :)

Hi, so that's the solution to play a perfect loop of multiple audio files?

aleksfadini commented 3 years ago

Depending on what you mean, there might be no solution, if by perfect loop you mean that they keep themselves in sync over time. Obviously, you can easily loop each file perfectly.

At this point, it depends on how precise you have to be. I tried for months all sorts of work-arounds and eventually had to use Unity for my rhythm application. We would need absolute time in Godot, as explained in https://github.com/godotengine/godot-proposals/issues/1151

Calinou commented 1 year ago

Closing in favor of https://github.com/godotengine/godot-proposals/issues/1151, as this requires new engine features to be implemented.

See also https://github.com/godotengine/godot/pull/63265, which may be useful in some situations.