thesofproject / linux

Linux kernel source tree
Other
91 stars 133 forks source link

ASoC: SOF: sof-pcm/pm: Stop paused streams before the system suspend #5058

Open ujfalusi opened 5 months ago

ujfalusi commented 5 months ago

Paused streams will not receive a suspend trigger, they will be marked by ALSA core as suspended and it's state is saved. Since the pause stream is not in a fully stopped state, for example DMA might be still enabled (just the trigger source is removed/disabled) we need to make sure that the hardware is ready to handle the suspend.

This involves a bit more than just stopping a DMA since we also need to communicate with the firmware in a delicate sequence to follow IP programming flows. To make things a bit more challenging, these flows are different between IPC versions due to the fact that they use different messages to implement the same functionality.

To avoid adding yet another path, callbacks and sequencing for handling the corner case of suspending while a stream is paused, and do this for each IPC versions and platforms, we can move the stream back to running just to put it to stopped state.

Explanation of the change: Streams moved to SUSPENDED state from PAUSED without trigger. If a stream does not support RESUME then on system resume the RESUME trigger is not sent, the stream's state and suspended_state remains untouched. When the user space releases the pause then the core will reject this because the state of the stream is not PAUSED, it is still SUSPENDED.

From this point user space will do the normal (hw_params) prepare and START, PAUSE_RELEASE trigger will not be sent by the core after the system has resumed.

Link: https://github.com/thesofproject/linux/issues/5035

ujfalusi commented 5 months ago

Right, so this is a resurrection of https://github.com/thesofproject/linux/pull/5040 with extended comments and explanation.

Replaces: https://github.com/thesofproject/linux/pull/5054 With this patch all of the bugs can remain as they are not going to be hit (we are stopping the paused stream before suspend after all) and I did not wanted to water up this PR with 'unrelated' patches.

ujfalusi commented 5 months ago

To note: this patch can handles so far all cases that I have thrown at it as long as we don't support RESUME, With RESUME advertised the system locks up and parts of https://github.com/thesofproject/linux/pull/5054 is needed to fix the lock, but we really cannot handle this corner case in any other sane way.

ujfalusi commented 5 months ago

I think the important note is that this is fixing an extremely rare corner case. It is as rare as I think it is broken for sever years and we did not received a single report of it, but if it happens your machine will lock up hard.

plbossart commented 4 months ago

Coming back after a short break, I am not sure why this is needed?

we have code in hda_dsp_dais_suspend() that supposedly takes care of the suspend-while-paused configuration. I am 100% sure we tested this with @ranj063

So the questions are:

a) what is missing in the existing code? b) what is new in the suggested PR?

ujfalusi commented 4 months ago

Coming back after a short break, I am not sure why this is needed?

we have code in hda_dsp_dais_suspend() that supposedly takes care of the suspend-while-paused configuration. I am 100% sure we tested this with @ranj063

I'm also 100% sure that this was tested, but a long-long time ago and it got broken at some point (it was not teted in CI or by developers).

So the questions are:

a) what is missing in the existing code?

A completely new state machine addition to handle this corner case for IPC4 and for IPC3.

b) what is new in the suggested PR?

This PR cuts the Gordian-knot by placing the firmware, hardware and kernel states to what they should be using existing and well tested flows.

plbossart commented 4 months ago

Still not following if we need to keep the existing code in hda_dsp_dais_suspend() ??

I am now starting to wonder if the issue only happens with IPC4.

ujfalusi commented 4 months ago

Still not following if we need to keep the existing code in hda_dsp_dais_suspend() ??

In theory the hda_dsp_dais_suspend() will be a NOP as we stop all paused streams properly and not leaving anything to be handled later, but I would wait a bit for the removal.

I am now starting to wonder if the issue only happens with IPC4.

The issue happens with both IPC3 and IPC4.

ranj063 commented 4 months ago

@ujfalusi can I suggest a different approach? ALSA suspends stream only if the dpcm state is START. How about we add PAUSED to this also? I mean invoke the dai suspend trigger but leave the state as PAUSED. That way we wont have to do any special handling in the SOF driver at all?

ujfalusi commented 4 months ago

@ujfalusi can I suggest a different approach? ALSA suspends stream only if the dpcm state is START. How about we add PAUSED to this also? I mean invoke the dai suspend trigger but leave the state as PAUSED. That way we wont have to do any special handling in the SOF driver at all?

@ranj063, I did thought about that, but that will have effect on all drivers. I cannot predict how they will handle this fundamental change.

ranj063 commented 4 months ago

@ranj063, I did thought about that, but that will have effect on all drivers. I cannot predict how they will handle this fundamental change.

@ujfalusi why not make it opt-in? One suggestion I have is to make this a property of struct snd_soc_pcm_runtime that can be set by the driver opting-in in the be_hw_params_fixup callback

ujfalusi commented 4 months ago

@ranj063, I did thought about that, but that will have effect on all drivers. I cannot predict how they will handle this fundamental change.

@ujfalusi why not make it opt-in? One suggestion I have is to make this a property of struct snd_soc_pcm_runtime that can be set by the driver opting-in in the be_hw_params_fixup callback

it is ALSA core which stops the triggers.

ranj063 commented 4 months ago

@ranj063, I did thought about that, but that will have effect on all drivers. I cannot predict how they will handle this fundamental change.

@ujfalusi why not make it opt-in? One suggestion I have is to make this a property of struct snd_soc_pcm_runtime that can be set by the driver opting-in in the be_hw_params_fixup callback

it is ALSA core which stops the triggers.

@ujfalusi this is what I had in mind

diff --git a/include/sound/soc.h b/include/sound/soc.h
index 33671437ee89..62664631fbc0 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -1210,6 +1210,7 @@ struct snd_soc_pcm_runtime {

        int num_components;
        struct snd_soc_component *components[]; /* CPU/Codec/Platform */
+       bool suspend_trigger_on_pause;
 };

 /* see soc_new_pcm_runtime()  */
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 711b2f49ed88..86f3f7f05581 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -2266,8 +2266,14 @@ int dpcm_be_dai_trigger(struct snd_soc_pcm_runtime *fe, int stream,

                        break;
                case SNDRV_PCM_TRIGGER_SUSPEND:
-                       if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START)
+                       if (be->dpcm[stream].state != SND_SOC_DPCM_STATE_START) {
+                               /* send suspend trigger for paused stream if opted-in. Leave the state as paused though */
+                               if (be->dpcm[stream].state == SND_SOC_DPCM_STATE_PAUSED &&
+                                   be->suspend_trigger_on_pause)
+                                       ret = soc_pcm_trigger(be_substream, cmd);
+
                                goto next;
+                       }

                        be->dpcm[stream].be_start--;
                        if (be->dpcm[stream].be_start != 0)
diff --git a/sound/soc/sof/pcm.c b/sound/soc/sof/pcm.c
index baad4c1445aa..9eb45620eb34 100644
--- a/sound/soc/sof/pcm.c
+++ b/sound/soc/sof/pcm.c
@@ -617,6 +617,9 @@ int sof_pcm_dai_link_fixup(struct snd_soc_pcm_runtime *rtd, struct snd_pcm_hw_pa
                return 0;
        }

+       /* set the flag to receive the suspend trigger for paused streams */
+       rtd->suspend_trigger_on_pause = true;
+
        if (pcm_ops && pcm_ops->dai_link_fixup)
                return pcm_ops->dai_link_fixup(rtd, params);
ujfalusi commented 4 months ago

@ranj063, that is all fine, but the SUSPEND will not reach that point since in sound/core/pcm_native.c we have this:

static int snd_pcm_do_suspend(struct snd_pcm_substream *substream,
                  snd_pcm_state_t state)
{
    struct snd_pcm_runtime *runtime = substream->runtime;
    if (runtime->trigger_master != substream)
        return 0;
    if (! snd_pcm_running(substream))
        return 0;
    substream->ops->trigger(substream, SNDRV_PCM_TRIGGER_SUSPEND);
    runtime->stop_operating = true;
    return 0; /* suspend unconditionally */
}

The trigger is 'eaten' up in core level.

ranj063 commented 4 months ago
snd_pcm_running

Can we extend this to snd_pcm_runtime also to do the same or do you think its too invasive?

ujfalusi commented 4 months ago
snd_pcm_running

Can we extend this to snd_pcm_runtime also to do the same or do you think its too invasive?

Since this trigger sequence has never been tested (as it was not even possible to get SUSPEND while PAUSED) it is highly likely that it is broken. I have reservations on how simple is to handle this also.

ranj063 commented 4 months ago

Since this trigger sequence has never been tested (as it was not even possible to get SUSPEND while PAUSED) it is highly likely that it is broken. I have reservations on how simple is to handle this also.

But you're trying to do the same within our driver anyway no by calling the suspend trigger for paused streams? Wouldnt it make sense to simplify it?

ranj063 commented 4 months ago

But you're trying to do the same within our driver anyway no by calling the suspend trigger for paused streams? Wouldnt it make sense to simplify it?

@ujfalusi how about consulting with Mark or Takashi about this instead of assuming that we have to deal with this in the SOF driver?

ujfalusi commented 4 months ago

Since this trigger sequence has never been tested (as it was not even possible to get SUSPEND while PAUSED) it is highly likely that it is broken. I have reservations on how simple is to handle this also.

But you're trying to do the same within our driver anyway no by calling the suspend trigger for paused streams? Wouldnt it make sense to simplify it?

Not really, I'm moving the paused stream to RUNNING and then to STOPPED. If we move it to SUSPENDED then there were other issues popping up. After system resume a SUSPENDED stream is handled differently than a PAUSED stream, given the SNDRV_PCM_INFO_RESUME is not set.

For example: if the stream was PAUSED, it will not receive the RESUME trigger...

lgirdwood commented 1 month ago

@ujfalusi @ranj063 still needed or can we close ?

ranj063 commented 1 month ago

@ujfalusi @ranj063 still needed or can we close ?

@ujfalusi should we take the issue up with Mark/Takashi?

ujfalusi commented 1 month ago

@ranj063, yes, it is just I never had time to think about he proper not too long intro.. @lgirdwood, yes, this or something along the lines is a must.

lgirdwood commented 1 month ago

@ranj063, yes, it is just I never had time to think about he proper not too long intro.. @lgirdwood, yes, this or something along the lines is a must.

ok, pls confirm this only impacts paused streams at suspend and normal running streams are stopped properly ?

If so - we dont have any impact atm as none of our pipeline advertise pause today. We can revisit this though if ever needed.