floooh / sokol

minimal cross-platform standalone C headers
https://floooh.github.io/sokol-html5
zlib License

How to sync graphics with audio? #917

Closed JakeCoxon closed 5 months ago

JakeCoxon commented 9 months ago

I'm generating audio data from within an audio stream callback. Is there a way to get the current audio frame being played?

For example, if I run an envelope generator in the audio stream, can I sample this envelope from the graphics thread at 60 Hz?

The audio is generated ahead of time, so I believe I need two things: from within the audio stream callback, the frame position at which this buffer will be played; and from within the graphics thread, the frame position that is currently being played.

I may be wrong about this, so feel free to point me in the right direction.

floooh commented 9 months ago

You'll need to take care of this yourself in the audio callback, I'm afraid, by copying the generated samples not only into the audio API buffer, but also into a memory location where the rendering code can read them.

Just be aware though that the audio callback usually runs in a separate thread, so you'll also need to make sure that the read and write pointers don't step on each other.
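A minimal sketch of one way to do that copy without a lock, assuming C11 atomics are available and that exactly one thread writes (the audio callback) and exactly one thread reads (the renderer). The names `vis_ring_t`, `vis_ring_write`, `vis_ring_read` and the capacity are hypothetical and not part of sokol:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical capacity; must be a power of two so indices can be masked. */
#define VIS_CAPACITY 4096
static_assert((VIS_CAPACITY & (VIS_CAPACITY - 1)) == 0, "capacity must be a power of two");

typedef struct {
    float samples[VIS_CAPACITY];
    atomic_size_t write_pos; /* total samples written; stored only by the audio thread */
    atomic_size_t read_pos;  /* total samples read; stored only by the render thread */
} vis_ring_t;

/* Audio thread: copy up to `count` samples in, return how many fit. */
static size_t vis_ring_write(vis_ring_t* r, const float* src, size_t count) {
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_relaxed);
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_acquire);
    size_t free_slots = VIS_CAPACITY - (w - rd);
    if (count > free_slots) {
        count = free_slots; /* drop what does not fit rather than block the audio thread */
    }
    for (size_t i = 0; i < count; i++) {
        r->samples[(w + i) & (VIS_CAPACITY - 1)] = src[i];
    }
    /* Publish the samples only after they have been written. */
    atomic_store_explicit(&r->write_pos, w + count, memory_order_release);
    return count;
}

/* Render thread: copy up to `max_count` samples out, return how many were read. */
static size_t vis_ring_read(vis_ring_t* r, float* dst, size_t max_count) {
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_relaxed);
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_acquire);
    size_t avail = w - rd;
    if (max_count > avail) {
        max_count = avail;
    }
    for (size_t i = 0; i < max_count; i++) {
        dst[i] = r->samples[(rd + i) & (VIS_CAPACITY - 1)];
    }
    /* Release the slots back to the writer only after they have been copied. */
    atomic_store_explicit(&r->read_pos, rd + max_count, memory_order_release);
    return max_count;
}
```

The audio callback would call `vis_ring_write` on the buffer it has just filled, and the render loop would drain the ring once per frame with `vis_ring_read`. A mutex around a plain array works as well; the lock-free variant just avoids ever stalling the audio thread.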

The audio callback is definitely the method to use (versus the 'push API') for getting the lowest latency out of sokol_audio.h.

I think the only way to further reduce latency between audio output and rendering is to experiment with smaller audio buffer sizes. There's no way to get the actual 'audio play cursor' from sokol_audio.h, though (I'm not even sure all backend APIs would provide such a feature).

PS: if it needs to work on the web, then reducing the audio buffer size to reduce latency might not be an option unfortunately, because the current WebAudio backend runs the streaming callback on the main thread (no Audio Worklet support yet).

JakeCoxon commented 9 months ago

Hi, thanks for the quick response. That's okay about the copying - that's how I imagined it. I can wrap it in a mutex or something. I would like to run on web, but it's not a huge priority right now.

I'm not looking for an exact solution. Right now I do the naive thing and display the current value after the callback has ended, but it shows the same value for 6-7 frames before the audio updates again, which looks janky.

That's because the audio buffer can span many video frames, e.g. the audio buffer is 46ms and the video frametime is 7ms. Perhaps I can record the value at enough points within the audio callback and interpolate through those values in the display. Would that be correct enough?
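The interpolation idea could look roughly like this. It is a sketch under assumptions: the audio callback stores `count` envelope values spread evenly over one buffer, and the render code knows which fraction `t` of that buffer's duration has elapsed. The name `envelope_at` is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Linearly interpolate through `count` envelope points that are spread
 * evenly over one audio buffer. `t` is the fraction of the buffer's
 * duration that has elapsed (0 = start of buffer, 1 = end of buffer). */
static float envelope_at(const float* points, size_t count, float t) {
    assert(count >= 2);
    if (t <= 0.0f) {
        return points[0];
    }
    if (t >= 1.0f) {
        return points[count - 1];
    }
    float pos = t * (float)(count - 1);
    size_t i = (size_t)pos;
    if (i > count - 2) {
        i = count - 2; /* guard against rounding at the upper edge */
    }
    float frac = pos - (float)i;
    return points[i] + (points[i + 1] - points[i]) * frac;
}
```

For example, with the three points `{0, 1, 0}`, a quarter of the way through the buffer (`t = 0.25`) the function returns `0.5`. How many points are "enough" depends on how fast the envelope moves; this only smooths between whatever was recorded.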

However, my assumption was that even though the audio thread is calling my callback to append to a buffer, those samples aren't necessarily the next ones to be played - the buffer may still contain samples from the previous callback. If that's the case, I can't really know when this callback's samples will be played. Maybe I can just assume they will be played immediately and it will look good enough.
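A rough sketch of the "assume they are played immediately" approach: the callback stamps each buffer with a running frame count and a timestamp, and the render code extrapolates from the elapsed wall-clock time. Everything here is an assumption for illustration, including the names `block_stamp_t` and `estimate_play_frame` and the `latency_sec` guess. The result is an estimate, not a real play cursor, and the stamp still has to be shared between threads safely:

```c
#include <assert.h>

/* Hypothetical record written by the audio callback each time it runs. */
typedef struct {
    long long start_frame; /* running count of frames generated before this buffer */
    int num_frames;        /* frames generated in this callback */
    double callback_time;  /* seconds on any monotonic clock, taken in the callback */
} block_stamp_t;

/* Estimate which frame is audible at time `now`, assuming the buffer starts
 * playing `latency_sec` after the callback ran (0 = "played immediately").
 * The estimate is clamped to the stamped buffer. */
static double estimate_play_frame(const block_stamp_t* s, double now,
                                  double sample_rate, double latency_sec) {
    assert(sample_rate > 0.0);
    double elapsed = now - s->callback_time - latency_sec;
    double frame = (double)s->start_frame + elapsed * sample_rate;
    double first = (double)s->start_frame;
    double last = first + (double)s->num_frames;
    if (frame < first) {
        frame = first;
    }
    if (frame > last) {
        frame = last;
    }
    return frame;
}
```

The fractional position within the buffer, `(frame - start_frame) / num_frames`, is the value an interpolation step would need. If the visuals consistently lead or lag the sound, `latency_sec` is the knob to tune by eye.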

Your other suggestion was to lower the buffer size so it matches the video framerate more closely. That does sound like a simpler solution; I'll have to look into that and see what the consequences are.
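To get a feel for the numbers, here is a small sketch of the arithmetic relating buffer size to video frames. The helper names are hypothetical. The 46ms figure would be consistent with a buffer of 2048 frames at 44100 Hz, which I believe are the sokol_audio.h defaults for `buffer_frames` and `sample_rate` in `saudio_desc`, but treat that as an assumption and check the header:

```c
#include <assert.h>

/* Duration of one audio buffer in milliseconds. */
static double buffer_duration_ms(int buffer_frames, int sample_rate) {
    assert(sample_rate > 0);
    return 1000.0 * (double)buffer_frames / (double)sample_rate;
}

/* How many video frames elapse while one audio buffer plays. */
static double video_frames_per_buffer(double buffer_ms, double frame_ms) {
    assert(frame_ms > 0.0);
    return buffer_ms / frame_ms;
}
```

With 2048 frames at 44100 Hz the buffer lasts about 46.4ms, and at a 7ms frametime that is roughly 6.6 video frames per buffer, which matches the 6-7 repeated frames described above. Getting down to about one video frame per buffer at 7ms would need around 300 audio frames, and buffers that small raise the risk of audible dropouts.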