andrewrk / libsoundio

C library for cross-platform real-time audio input and output
http://libsound.io/
MIT License

Time-code needed in read/write callbacks? #121

Open iskunk opened 8 years ago

iskunk commented 8 years ago

Currently, when you request a buffer-clear in the sio_sine example, you hear a brief pop---at least with the ALSA backend. This may or may not be due to a limitation in the backend, but it occurs to me that there may be a deeper problem:

If your buffer is filled with a sine wave, and you clear the buffer, then when you start generating the sine again... how do you know the correct offset to use so that the newly-generated sine wave is not phase-shifted from what was playing previously?

That pop you hear could very well be a waveform discontinuity due to the write callback not taking measures (or not being able to take measures) to ensure that the phase alignment is maintained.

The sine wave stands in for more complex audio, of course. What if you're generating an interactive musical score in a game? Most of the time you may be able to buffer a couple seconds ahead, but then a sudden change is needed, so you request a buffer-clear. Then, in the write callback, you generate the new score segment---but where exactly are you picking up from? What was the last frame to play? You can't generate a seamless musical score unless you know the exact boundary of that buffer-clear.

This is why I think libsoundio may need to present some kind of time-code with respect to the audio stream. Because resolution down to a single frame---but no smaller---would be needed, I think this time-code would better be expressed in terms of an integer number of frames rather than floating-point fractions of a second.

For example, soundio_outstream_begin_write() could indicate that the first frame about to be written will be frame number 1504176. If you then clear the buffer, the next call might report that you're about to write frame 1408176.

(A possible related feature that would be nice is to allow the write callback to read the buffer that was nominally "cleared," in case it wishes only to modify the already-buffered data rather than generate it again from scratch.)

ligfx commented 8 years ago

Can you explain your use-case a bit more? My experience with generating audio on the fly is that buffers are kept as small as possible, so that changes are "seamless" in the sense that you only have to wait a few fractions of a second for your buffers to naturally clear out. I've never seen an actual buffer clear being used outside of stopping something entirely.

SDL offers a SDL_GetQueuedAudioSize function that does something similar to what you describe, but it's meant for larger buffers of data (fire-and-forget sort of thing). Could this functionality be implemented by the end-user with their own ring buffer? i.e. buffer a few seconds ahead, pull small chunks of data in the write callback, and keep track of where you are in your larger buffer.

iskunk commented 8 years ago

This is not my use case. Rather, it's an issue I see in the current API design (the pop in sio_sine could arguably be called a bug) and its resolution will likely affect some other work I'm doing in the tree. Besides, if significant changes are needed to address it, then now (new major release) would be the time for 'em.

Whether the buffer is large or small actually wouldn't make a difference; as long as there's any buffer at all, clearing it implies that some number of frames are going to get dropped, and then you need to know that number if you want to avoid a discontinuity in the audio.

Libsoundio does have soundio_outstream_get_latency(), and that SDL function appears similar (except that it makes no attempt to account for hardware-level latency, which is important). But you can only call this from within the write callback---and you would need to know the value immediately before the buffer-clear, not after.

I don't think this functionality could be implemented by the end user, unless you're suggesting a "clear buffer" operation that doesn't occur immediately when requested. (The cursor in your ring buffer always represents a future frame, not the one that is playing right now.)
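
To make the arithmetic concrete, here is a small sketch of how the playing frame relates to the write cursor. It is only an illustration: `estimate_playing_frame` and `frames_dropped_by_clear` are hypothetical helpers, and the latency argument is assumed to be a value such as the one soundio_outstream_get_latency() reports, sampled just before the clear (which, as noted above, the current API does not make possible).

```c
#include <assert.h>
#include <stdint.h>

/* Estimate which frame is audible right now.
 * frames_written:  total frames handed to the backend so far (the write cursor)
 * latency_seconds: assumed to come from a latency query made just before the clear */
static int64_t estimate_playing_frame(int64_t frames_written,
                                      double latency_seconds,
                                      int sample_rate)
{
    assert(sample_rate > 0 && latency_seconds >= 0.0);
    /* Round the queued duration to the nearest whole frame. */
    int64_t queued = (int64_t)(latency_seconds * sample_rate + 0.5);
    if (queued > frames_written)
        queued = frames_written; /* cannot have more queued than was written */
    return frames_written - queued;
}

/* Number of already-written frames that an immediate clear would discard. */
static int64_t frames_dropped_by_clear(int64_t frames_written,
                                       int64_t playing_frame)
{
    return frames_written - playing_frame;
}
```

For instance, assuming a 48000 Hz stream with two seconds queued, a write cursor at frame 1504176 corresponds to a playing frame of 1408176, and a clear at that moment drops 96000 frames.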

ligfx commented 8 years ago

> This is not my use case. Rather, it's an issue I see in the current API design (the pop in sio_sine could arguably be called a bug) and its resolution will likely affect some other work I'm doing in the tree.

What's your use case? Like I said, I've never seen a buffer clear used in the wild outside of shutdown sequences. If this is something people do, I'd love to see what they're doing with it. (Also of note: the CoreAudio backend doesn't support clearing buffers).

> I don't think this functionality could be implemented by the end user, unless you're suggesting a "clear buffer" operation that doesn't occur immediately when requested.

Yes, I'm suggesting a clear buffer operation that occurs a few fractions of a second after requested, depending on current latency.

iskunk commented 8 years ago

> What's your use case?

My use case is actually much closer to the soundio_play() proposal :-] (LibAO expat here)

But as concerns this, see this bit in soundio.h:

    /// On systems that support clearing the buffer, this defaults to a large
    /// latency, potentially upwards of 2 seconds, with the understanding that
    /// you will call ::soundio_outstream_clear_buffer when you want to reduce
    /// the latency to 0. On systems that do not support clearing the buffer,
    /// this defaults to a reasonable lower latency value.

Seems like clearing the buffer is a thing you do when you're normally buffering a reasonable amount of audio, but then from time to time you need to change the stream right-now-immediately. (Which is a better way of working than keeping the buffer fill low, IMO, because otherwise you're forever dancing on the edge of an underrun.)

> (Also of note: the CoreAudio backend doesn't support clearing buffers).

That's (unfortunate, but) covered:

    /// Some backends do not support clearing the buffer. On these backends this
    /// function will return SoundIoErrorIncompatibleBackend.
    /// Some devices do not support clearing the buffer. On these devices this
    /// function might return SoundIoErrorIncompatibleDevice.
    /// Possible errors:
    ///
    /// * #SoundIoErrorStreaming
    /// * #SoundIoErrorIncompatibleBackend
    /// * #SoundIoErrorIncompatibleDevice

> Yes, I'm suggesting a clear buffer operation that occurs a few fractions of a second after requested, depending on current latency.

With as much emphasis as is placed on low-latency everything around here, I don't think that would pass muster...

ligfx commented 8 years ago

> Seems like clearing the buffer is a thing you do when you're normally buffering a reasonable amount of audio, but then from time to time you need to change the stream right-now-immediately. (Which is a better way of working than keeping the buffer fill low, IMO, because otherwise you're forever dancing on the edge of an underrun.)

Okay, so in this case, what are you trying to seamlessly transition to? This sounds like a music player, where you're going to play a completely different audio track.

iskunk commented 8 years ago

> Okay, so in this case, what are you trying to seamlessly transition to?

Again, I'm not transitioning to anything ^_^ I'm postulating the use case of an interactive musical score that is normally buffered ahead by (say) two seconds, but then suddenly needs to take a different turn. You see that kind of thing in games, and there may be composition software that does the same.

So you're generating what seems like a continuous piece of music all along, only that at a certain point it reacts to a sudden, completely unexpected input. No pops, no glitching anywhere along the way.

Granted, I don't know if that's even possible (assuming that clearing the buffer is itself doable). But that's the kind of thing that some professional user of libsoundio might be keen on, seems like.

> This sounds like a music player, where you're going to play a completely different audio track.

That's another possibility. But it needs more qualification: You want the current track to stop playing not with a hard stop, but with a quick fade-out (or even a cross-fade into the new track) that begins the moment you press the button.

Yepoleb commented 8 years ago

I don't see the advantage of clearing the buffer instead of keeping it at a reasonably small size. For games with a constant stream of input events you'll be generating a large number of useless samples, and for applications like music players low latency isn't really a priority. Your example is probably somewhere in the middle and should work fine using the traditional approach.

iskunk commented 8 years ago

Hi @Yepoleb!

On further thought, @ligfx's example of a music player is in fact perfect for this discussion. Most of the time, low latency isn't a priority---quite the contrary, you want to load up the buffer as much as possible, so that the music keeps playing regardless of what blows up on the user's desktop. But then, if the user skips ahead to the next song, you want the player to react immediately, not in a half-second, not in two seconds, certainly not in however long the buffer normally is.

Ordinarily, you'd do a hard stop of the current song, and start the next one. But with libsoundio and the buffer-clear functionality, you can do something more polished, like a fast fade-out or cross-fade---that starts instantly when the >| button is pressed.
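
The fade itself is simple once you know which frames you are overwriting. Below is a minimal sketch of a linear cross-fade in C; `crossfade` is a hypothetical helper, not part of libsoundio, and it assumes mono float samples and that both tracks are available for the frames being regenerated after the clear.

```c
#include <assert.h>
#include <stddef.h>

/* Linear cross-fade over n frames: out[i] moves from old_track to new_track.
 * Passing NULL for new_track gives a plain fade-out to silence. */
static void crossfade(float *out, const float *old_track,
                      const float *new_track, size_t n)
{
    assert(out != NULL && old_track != NULL);
    for (size_t i = 0; i < n; i++) {
        /* t runs from 0.0 on the first frame to 1.0 on the last one. */
        float t = (n > 1) ? (float)i / (float)(n - 1) : 1.0f;
        float incoming = new_track ? new_track[i] : 0.0f;
        out[i] = (1.0f - t) * old_track[i] + t * incoming;
    }
}
```

A linear ramp is the simplest choice; an equal-power curve is a common alternative when the two tracks are uncorrelated.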

Keeping the buffer small during playback in anticipation of that not only doesn't get you instant feedback (you still have the small bit of latency of whatever buffer size you're using), it then makes the player much more prone to underruns altogether.

Yepoleb commented 8 years ago

I don't care if my music player takes 100ms to switch the current song and if the cross fade is instant. I pick the song and it plays for 3 to 7 minutes without interruption. Changing the song in a YouTube playlist can take up to a second and nobody complains about that.

Are underruns actually a problem with music players? For me they're usually the last thing that stops working when my PC freezes.

iskunk commented 8 years ago

> I don't care if my music player takes 100ms to switch the current song and if the cross fade is instant.

You probably will care, however, if the music stutters when that 100ms buffer goes poof every time the system has a load spike.

As for taking 100ms to switch the current song---you may not care, but someone else will. The clear-buffer operation is meant to get the latency down to near-zero, not 100ms.

> Are underruns actually a problem with music players?

Generally no, thanks to (generous) buffering.

Yepoleb commented 8 years ago

I think you're trying to solve an issue that doesn't exist. Music players work fine and nobody is complaining about high latency or stutter with them. Why do you want to change that?

andrewrk commented 7 years ago

I think there is definitely a problem here. I'm going to paste something I typed over in #117:


Let's talk about how various things should work, ideally. For these use cases, let's pretend that we're dealing with a large hardware buffer, 1 second long.

Use case 1: implementing seek to a different song in a music player.

In this use case let's assume that we're in the middle of playing one song and we want to immediately seek to a different one, with no gap in the audio. So we have 1 second of song A in the hardware buffer. What we actually want to do, I think, is:

In this use case, if pausing the hardware also clears the buffer, that still fits perfectly.

Use case 2: implementing pause/resume in a music player.

Here let's say we're in the middle of playing a song, and the user presses pause, waits 10 seconds, and then presses resume. When the user presses pause, what we want to happen is pause the hardware, preserving the 1 second of buffered song. When the user resumes, what we want to happen is to resume the hardware. It will pick up right where it left off.

In this case, if pausing the hardware also clears it, our use case is broken.

For libsoundio, it is already true that not all backends support pausing, for example JACK.

In my experience, pause and resume in general have not been reliable to abstract over. If I were doing it again I might omit pause/resume from the API altogether and suggest that users disconnect/reconnect from the backend to accomplish this.

The downside to this is that these two use cases I outlined above become difficult to achieve correctly. How would you do these use cases in JACK, where there is no pause/resume? You would have to treat it as if any samples written to JACK were lost and would be played no matter what. So for the pause/resume music player use case, whatever your out stream latency is, that's how long it takes to pause. For the seek use case, same thing, your out stream latency determines how long it takes to seek.


For backends that support pause, resume, and clear buffer, libsoundio API should enable the use cases to operate ideally. I think the existing clear buffer implementation was not designed with this clarity of thought in mind, and I would like to revisit it.

Can anyone come up with another use case for clearing the hardware buffer that is not covered by these 2 examples?