thesofproject / linux

Linux kernel source tree
Other
91 stars 134 forks source link

ASoC: SOF: Check runtime for nullity in rx IPCs #5214

Closed cujomalainey closed 3 weeks ago

cujomalainey commented 1 month ago

There is a failure case where the DSP reports the status of a stream after the kernel has given up and closed out the pcm. When this happens the runtime parameter on the substream is nulled out by ASoC in the pcm_close call back. This results in a NULL pointer derefence if not checked.

cujomalainey commented 1 month ago

Note: I have been unable to get the SOF kernel to boot on a device so this is untested. @dbaluta I had a hard time checking if this is also needed for the compressed offload use case. Can you confirm?

cujomalainey commented 1 month ago

Confirmed my kernel issues are unrelated to this change.

cujomalainey commented 1 month ago

@cujomalainey, it is valid, given that we never clear the sps->substream, it is not going to be NULL after the first audio activity.

Please read the commit message fully, I understand SOF never clears runtime, but ASoC does NULL it out.

See https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-dapm.c#L4059 and https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-pcm.c#L1709

ujfalusi commented 1 month ago

@cujomalainey, it is valid, given that we never clear the sps->substream, it is not going to be NULL after the first audio activity.

Please read the commit message fully, I understand SOF never clears runtime, but ASoC does NULL it out.

See https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-dapm.c#L4059 and https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-pcm.c#L1709

@cujomalainey, I have said nothing about runtime (and yes, ASoC clears it) I was saying the sps->substream in SOF, which is pointing to the substream, which have the pointer to runtime, which is cleared by ASoC.

cujomalainey commented 1 month ago

@cujomalainey, it is valid, given that we never clear the sps->substream, it is not going to be NULL after the first audio activity.

Please read the commit message fully, I understand SOF never clears runtime, but ASoC does NULL it out. See https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-dapm.c#L4059 and https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/soc-pcm.c#L1709

@cujomalainey, I have said nothing about runtime (and yes, ASoC clears it) I was saying the sps->substream in SOF, which is pointing to the substream, which have the pointer to runtime, which is cleared by ASoC.

Apologies it sounds like your original message was a counter to patch in that the fix was not needed.

cujomalainey commented 3 weeks ago

I personally think this is the proper fix, also friendly ping for reviews since CI failures are in unrelated modules to the change.

ujfalusi commented 3 weeks ago

I personally think this is the proper fix, also friendly ping for reviews since CI failures are in unrelated modules to the change.

Have you by any chance tested my proposal?

diff --git a/sound/soc/sof/pcm.c b/sound/soc/sof/pcm.c
index 62ce707abaa6..7a55bc65bdf1 100644
--- a/sound/soc/sof/pcm.c
+++ b/sound/soc/sof/pcm.c
@@ -511,6 +511,8 @@ static int sof_pcm_close(struct snd_soc_component *component,
         */
    }

+   spcm->stream[substream->stream].substream = NULL;
+
    return 0;
 }

diff --git a/sound/soc/sof/stream-ipc.c b/sound/soc/sof/stream-ipc.c
index 794c7bbccbaf..8262443ac89a 100644
--- a/sound/soc/sof/stream-ipc.c
+++ b/sound/soc/sof/stream-ipc.c
@@ -43,7 +43,7 @@ int sof_ipc_msg_data(struct snd_sof_dev *sdev,
                return -ESTRPIPE;

            posn_offset = stream->posn_offset;
-       } else {
+       } else if (sps->cstream) {

            struct sof_compr_stream *sstream = sps->cstream->runtime->private_data;

@@ -51,6 +51,10 @@ int sof_ipc_msg_data(struct snd_sof_dev *sdev,
                return -ESTRPIPE;

            posn_offset = sstream->posn_offset;
+
+       } else {
+           dev_err(sdev->dev, "%s: No stream opened\n", __func__);
+           return -EINVAL;
        }

        snd_sof_dsp_mailbox_read(sdev, posn_offset, p, sz);

I don't have the setup to reproduce this issue.

lgirdwood commented 3 weeks ago

I personally think this is the proper fix, also friendly ping for reviews since CI failures are in unrelated modules to the change.

Have you by any chance tested my proposal?

@ujfalusi can you add more context why this proposal is better than the current PR, i.e. what is your proposal checking that the current PR does not. This will help us align quickly since we have a crash.

ujfalusi commented 3 weeks ago

The sps->cstream is cleared to NULL on free while the sps->stream is left to contain a stale pointer. I think it is better to align the PCM and compressed handling by clearing the corresponding stream/cstream pointer and also add the missing sps->cstream nullity check in sof_ipc_msg_data() - which is again aligning with the rest of the code, sof_set_stream_data_offset() is checking this.

cujomalainey commented 3 weeks ago

The sps->cstream is cleared to NULL on free while the sps->stream is left to contain a stale pointer. I think it is better to align the PCM and compressed handling by clearing the corresponding stream/cstream pointer and also add the missing sps->cstream nullity check in sof_ipc_msg_data() - which is again aligning with the rest of the code, sof_set_stream_data_offset() is checking this.

Technically viable, but this is changing lifetypes of a pointer, I think a simple and safe fix to stop the bleeding to should go out first. If you want to make a longer term fix then be my guest but that should not block a fix for a bug in the field.

ujfalusi commented 3 weeks ago

Technically viable, but this is changing lifetypes of a pointer, I think a simple and safe fix to stop the bleeding to should go out first. If you want to make a longer term fix then be my guest but that should not block a fix for a bug in the field.

The bug is that the lifetime of sps->stream is not managed. The other bug is that the sps->cstream nullity is not checked in sof_ipc_msg_data() and it assumes that if sps->stream is NULL then sps->cstream->runtime must be valid, which is not true asthe sps->cstream lifetime is managed.

if for any reason the firmware sends a position notification before the first playback is started (sps->stream/cstream is still NULL) then we will have NULL dereference with the sps->cstream, but not in sof_set_stream_data_offset(), which handles things correctly.

cujomalainey commented 3 weeks ago

Technically viable, but this is changing lifetypes of a pointer, I think a simple and safe fix to stop the bleeding to should go out first. If you want to make a longer term fix then be my guest but that should not block a fix for a bug in the field.

The bug is that the lifetime of sps->stream is not managed. The other bug is that the sps->cstream nullity is not checked in sof_ipc_msg_data() and it assumes that if sps->stream is NULL then sps->cstream->runtime must be valid, which is not true asthe sps->cstream lifetime is managed.

if for any reason the firmware sends a position notification before the first playback is started (sps->stream/cstream is still NULL) then we will have NULL dereference with the sps->cstream, but not in sof_set_stream_data_offset(), which handles things correctly.

Fair point, I was viewing it as substream existed before PCM open, if that is not the case then yours makes sense.