Clarify snd_pcm_hw_params_set_drain_silence documentation

alsa-project / alsa-lib

The Advanced Linux Sound Architecture (ALSA) - library

GNU Lesser General Public License v2.1

344 stars 173 forks source link

Clarify snd_pcm_hw_params_set_drain_silence documentation #378

Open z-s-e opened 5 months ago

z-s-e commented 5 months ago

The API documentation states that drain silence is enabled by default. But since this is a hardware parameter, is it possible that some hardware does not support this? Or is this always available (and will cause alsa-lib internal software workarounds for hardware that doesn't have perfect drain)?

Also, the documentation mentions the sw_params silencing mechanism as usable for a manual emulation of this, but the documentation of the sw_params silence_size/silence_threshold imply they only affect the silencing behavior in case of underrun, not in case of draining.

I can create a patch for the documentation if you'd like, as soon as I get the needed info.

perexg commented 5 months ago

For snd_pcm_hw_params_set_drain_silence(0), the silencing the rest of period (and piece of next one for most hw) is not active. The application is responsible to fill this (using standard I/O and rewind - may be problematic for hw which does not support rewind) or using the sw_params silence parameters (the silencing of the buffer is done imediatelly in the driver when new sw_params are set).

Also note that caller may check if hw support "perfect" drain snd_pcm_hw_params_is_perfect_drain. In this case, silencing is not necessary.

z-s-e commented 5 months ago

Yes, I understood so far. But the default setting is snd_pcm_hw_params_set_drain_silence(1) - does that mean the alsa lib will internally do basically these manual workaround steps you mentioned for you? I.e. as long as one does not change this setting, can I assume draining will work without any glitches at the end for all hardware, independently of the sw_params silence params? I think the documentation should be a bit more explicit about this.

And the sw_params silence params documentation only states that these influence the underrun behavior - unless you count draining as a kind of underrun, shouldn't that be orthogonal then?

PS: Side question, I came up with a potentially easier method of preventing glitches after drain manually: fill the whole hardware buffer with silence with a blocking write (which should only be able to return once the last bit of actual audio was played back) and then just do snd_pcm_drop. That then would also work for hardware without rewinding. What do you think?

perexg commented 5 months ago

can I assume draining will work without any glitches at the end for all hardware, independently of the sw_params silence params? I think the documentation should be a bit more explicit about this.

That's the goal for the default settings. Most applications don't care and calls snd_pcm_drain() blindly.

And the sw_params silence params documentation only states that these influence the underrun behavior - unless you count draining as a kind of underrun, shouldn't that be orthogonal then?

It's basically same and not. It depends on view. Silence just add requested number of samples. The drain is just a special case and it depends on the application settings, if this silence length is enough (period + something) for all hardware variants.

With ssd_pcm_hw_params_set_drain_silence(1) the API tries to handle the drain safely (and it's the default).

PS: Side question, I came up with a potentially easier method of preventing glitches after drain manually: fill the whole hardware buffer with silence with a blocking write (which should only be able to return once the last bit of actual audio was played back) and then just do snd_pcm_drop. That then would also work for hardware without rewinding. What do you think?

You will get extra buffer length latency at the end. Usually, the drain latency is up to one period.

z-s-e commented 5 months ago

can I assume draining will work without any glitches at the end for all hardware, independently of the sw_params silence params? I think the documentation should be a bit more explicit about this.

That's the goal for the default settings. Most applications don't care and calls snd_pcm_drain() blindly.

Ok, thanks. I suppose that means if I get glitches with that option enabled, I can report it as a bug. I suspect there may be one, where rewound samples are not properly zero'ed out, but I'll have to investigate more before I can really say.

And the sw_params silence params documentation only states that these influence the underrun behavior - unless you count draining as a kind of underrun, shouldn't that be orthogonal then?

It's basically same and not. It depends on view. Silence just add requested number of samples. The drain is just a special case and it depends on the application settings, if this silence length is enough (period + something) for all hardware variants.

Afaiu from the documentation, silence is only added when an underrun is closer than silence_threshold frames. If draining is not considered an underrun, it shouldn't usually add any silence for draining, no?

Or are you talking about the "special case" setting of silence_threshold == 0 and silence_size >= boundary_size, where samples are zero'ed once they are played back. It makes sense that this particular setting would also influence the drain behavior, but this setting is not the default, right?

With ssd_pcm_hw_params_set_drain_silence(1) the API tries to handle the drain safely (and it's the default).

PS: Side question, I came up with a potentially easier method of preventing glitches after drain manually: fill the whole hardware buffer with silence with a blocking write (which should only be able to return once the last bit of actual audio was played back) and then just do snd_pcm_drop. That then would also work for hardware without rewinding. What do you think?

You will get extra buffer length latency at the end. Usually, the drain latency is up to one period.

What you say makes sense if I'd drain the whole hw buffer of silence, but I did rather mean dropping once one knows the hardware has played back all "real" samples.

borine commented 5 months ago

When 0 < silence_threshold < buffer_size, then the driver ensures that there is always a number of silence samples ahead of the appl_ptr in the buffer, so that if the DAC reads past the appl_ptr (for example with imperfect drain, or when stop_threshold > buffer_size) then it reads silence, rather than replaying samples left over from the previous iteration of the buffer.

I agree that the documentation reference to inderrun for silence_threshold/size parameters is confusing. That is because the pcm will still stop when the xrun point is reached, so that the inserted silence samples never actually get played, unless the stop_threshold is also set appropriately (or when draining and the driver does not stop until the end of the period) . However I do not have any better ways of expressing the behavior of these parameters. Perhaps instead of "xrun" an expression such as "end of application samples" or similar might be clearer, or would that be even more confusing?

z-s-e commented 5 months ago

When 0 < silence_threshold < buffer_size, then the driver ensures that there is always a number of silence samples ahead of the appl_ptr in the buffer, so that if the DAC reads past the appl_ptr (for example with imperfect drain, or when stop_threshold > buffer_size) then it reads silence, rather than replaying samples left over from the previous iteration of the buffer.

Uhh, I'm even more confused now. What you described I kind of though was the silence_size parameter, not the silence_threshold. So coming from the documentation, I believed that when an underrun is less than silence_threshold frames away, the driver will write silence_size frames of silence after the appl_ptr. That is not correct?

And out of curiosity, is it also the driver's job to clear the samples when doing a rewind (since rewinding moves the appl_ptr back before previously written frames)?

Edit: Also, when the hardware does not have perfect drain, shouldn't one assume that an underrun will also not perfectly stop at the appl_ptr position but likely will overshoot that, making some silence frames at the end good for that?

borine commented 5 months ago

the driver will write silence_size frames of silence after the appl_ptr.

Actually it writes a maximum of silence_size frames. It may write less than that it reaches the hw_ptr first, because it does not want to overwrite unplayed frames.

s it also the driver's job to clear the samples when doing a rewind

No. It is the appl_ptr which is rewound, so the xrun point is made earlier (subject to the stop_threshold setting of course). The next write from the application will then overwrite frames from its last write. For use when draining, the idea is that the application pads with silence to ensure that the last period is complete, but the rewinds to the start of that padding. This means that drivers with perfect drain will stop at the (rewound) application pointer, never playing the silence padding; whereas those without perfect drain will run past the appl_ptr, playing the silence padding until the end of the period.

Also, when the hardware does not have perfect drain, shouldn't one assume that an underrun will also not perfectly stop at the appl_ptr position but likely will overshoot that, making some silence frames at the end good for that?

Hmm, I hadn't thought of that, but yes I think you are right, the silence threshold/silence size parameters would also be useful in that situation.

z-s-e commented 5 months ago

No. It is the appl_ptr which is rewound, so the xrun point is made earlier (subject to the stop_threshold setting of course). The next write from the application will then overwrite frames from its last write. For use when draining, the idea is that the application pads with silence to ensure that the last period is complete, but the rewinds to the start of that padding. This means that drivers with perfect drain will stop at the (rewound) application pointer, never playing the silence padding; whereas those without perfect drain will run past the appl_ptr, playing the silence padding until the end of the period.

Awesome, very useful info. Thanks! I believe that should be added to the documentation of rewind, will do that soon.

Also, when the hardware does not have perfect drain, shouldn't one assume that an underrun will also not perfectly stop at the appl_ptr position but likely will overshoot that, making some silence frames at the end good for that?

Hmm, I hadn't thought of that, but yes I think you are right, the silence threshold/silence size parameters would also be useful in that situation.

It is strange that you say that. The documentation of silence threshold/silence size only talks about the underrun situation, and does not mention draining at all. You primarily seem to consider them relevant for the draining case? But only when drain_silence is disabled? I'm still very confused about all this.

borine commented 5 months ago

The documentation of silence threshold/silence size only talks about the underrun situation

Yes, as I said, that is confusing because they are useful in other situations. When used in combination with the stop_threshold they can give an application more options on how to determine when a pcm should be stopped, allowing ii to go beyond the normal underrun point. Having said that, I don't know of any application that uses this - the only example I could find is the dmix plugin within alsa-lib itself.

z-s-e commented 5 months ago

I kind of question the usefulness of the drain state/operation in general, as one seemingly could always just as well simply intentionally wait for the underrun when one is done writing. With it we must now document how the different drain silence and sw silence parameter all interact with each other for the different sitautions (drain vs underrun)...

Anyway, just to re-confirm, do I understand correctly that when snd_pcm_hw_params_set_drain_silence is active, the silence_threshold/size parameters only potentially influence the underrun situation, as for when draining the alsa-lib will internally override that. But when snd_pcm_hw_params_set_drain_silence is deactivated, the silence_threshold/size parameters will also trigger in the draining state?

perexg commented 5 months ago

Anyway, just to re-confirm, do I understand correctly that when snd_pcm_hw_params_set_drain_silence is active, the silence_threshold/size parameters only potentially influence the underrun situation, as for when draining the alsa-lib will internally override that.

The internal implementation details should not be in documentation IMHO. The overriding is hidden (silence sw_params from the caller are returned back when drain is finished).

But when snd_pcm_hw_params_set_drain_silence is deactivated, the silence_threshold/size parameters will also trigger in the draining state?

Yes.

z-s-e commented 5 months ago

The internal implementation details should not be in documentation IMHO. The overriding is hidden (silence sw_params from the caller are returned back when drain is finished).

The documentation should be clear about what alsa will take care for you and what the app developer needs to do themselves, right?

I think what I want to do should be the most common use case: I simply want that glitchy old samples are never played back, neither when draining, nor when an underrun happens (of course I try to avoid underruns as much as possible, but since they still can happen, at least they shouldn't be glitchy). From the current documentation it is not clear how to archive this. Given this discussion so far:

When snd_pcm_hw_params_set_drain_silence is active, alsa will take care of the draining case, which is the default and the app developer does not have to do anything - unless the rewind functionality is used, in which case one needs to make sure to overwrite all rewound frames, potentially with silence padding, as alsa will not take care of silencing them automatically.
To prevent glitchy underruns, it seems the silence threshold/size parameters should still be set by the app developer even when snd_pcm_hw_params_set_drain_silence is active, right? What is the recommended setting there? Or would it make sense for alsa to also add a snd_pcm_hw_params_set_underrun_silence function so that is taken care of automatically as well and app developers don't have to mess with these lower level silence parameters?

perexg commented 5 months ago

The documentation should be clear about what alsa will take care for you and what the app developer needs to do themselves, right?

I meant that we should describe the expected API behavior not the internal implementation.

For rewind: I just checked the kernel and it seems that the silencing code is not called immediately when caller asks for it. It smells like a bug (omitted implementation) rather than an intended behavior. Possible kernel fix:

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 39a65d1415ab..f5abcd630f8d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2974,6 +2974,9 @@ static snd_pcm_sframes_t rewind_appl_ptr(struct snd_pcm_substream *substream,
         * any longer.  Returning zero means that no rewind is done, so
         * it's not absolutely wrong to answer like that.
         */
+       if (ret >= 0 && substream->stream == SNDRV_PCM_STREAM_PLAYBACK &&
+                       runtime->silence_size > 0)
+               snd_pcm_playback_silence(substream, ULONG_MAX);
        return ret < 0 ? 0 : frames;
 }

@tiwai: FYI

add a snd_pcm_hw_params_set_underrun_silence

It's implementable and it may give a hint to drivers to handle this extra silencing.

On the other side, the current silencing is not a perfect solution without a smooth sample value change to zero implementation. So it does not omit pops completely, if the last played sample is too far from the middle.

z-s-e commented 5 months ago

On the other side, the current silencing is not a perfect solution without a smooth sample value change to zero implementation. So it does not omit pops completely, if the last played sample is too far from the middle.

Indeed, but playing silence is still better than random old samples from the last iteration.

I am in fact currently developing and will soon release an audio player engine library and application that tries extra hard to avoid any audio glitches under any circumstance; For pausing I use rewind to apply a short fade-out before stopping, and I try to do that as well shortly before an underrun as a "soft underrun" mechanism to avoid the pops you mentioned.

z-s-e commented 5 months ago

With my patch about the silence param interaction merged now, I am wondering if it is worth it to amend the snd_pcm_hw_params_set_drain_silence documentation to recommend just leaving this option enabled for most common use cases, just to be a bit more clear. But otherwise I suppose what this ticket was originally about is done, so could be closed in principle.

I think it would be good to track the rewind thing in another task. If you decide to merge your kernel-side fix, then I guess the documentation doesn't need to be updated. Otherwise the snd_pcm_rewind documentation should mention the need to manually overwrite rewound samples.

And potentially adding a new snd_pcm_hw_params_set_underrun_silence or otherwise document the recommended settings for such silencing underrun behavior better is also another task. So if you'd like I could close this one and create these two new tasks to keep things more organized.