thesofproject / sof

Sound Open Firmware
Other
547 stars 313 forks source link

[BUG] IPC4 fails to set up stream metadata correctly #8639

Open andyross opened 10 months ago

andyross commented 10 months ago

Component stream state in IPC4 doesn't seem to be set up reliably. When the sof-mtl-max98357a-rt5682-ssp2-ssp0 topology (built form the mtl-007-drop-stable branch), the Google AEC component is being presented with a reference/playback input source that claims 4 channels, but in fact is only providing enough data for two.

Similarly, the computed "alignment constants" on a buffer's audio stream start out with (I think) garbage, and require that something somewhere call source/sink_set_alignment_constants() with arbitrary (and even incorrect, e.g. a byte alignment of 1 was being used!) arguments just to get the fields initialized.

The component has always had code present (see this commit for the current form: https://github.com/thesofproject/sof/pull/8571/commits/bab124dda81aac088fa789a6e1167403b288df90) to fix this at prepare() time. But it's fragile, undocumented, got dropped during refactoring and took me DAYS to figure out; the symptoms were just ring buffer overruns in the downstream components that produced "mild distortion".

This just isn't right. Components don't own their sinks/sources and shouldn't ever be responsible for configuring them. The pipeline/IPC layer needs to do this correctly from topology, and in the event that the data is wrong or mismatched, it needs to fail gracefully. We shouldn't ever be in a situation like this where a source promises 8 byte frames per sample period but delivers 4.

lgirdwood commented 10 months ago

@marcinszkudlinski can you comment on a solution ?

marcinszkudlinski commented 10 months ago

Yes, I know there are inconsistences at number of channels. Other parameters are set properly

that's why I introduced a WA to mtl 007 branch:

https://github.com/thesofproject/sof/blob/6d3061c76f23f5e92ec89578476006f6fe3b4cd4/src/audio/google/google_rtc_audio_processing.c#L96

problems are being investigated

andyross commented 10 months ago

Add @andrula-song as assignee given expert-seeming comments in #8571

lgirdwood commented 10 months ago

@ujfalusi can you give this some priority. Thanks. @plbossart fyi, driver may be sending incorrect IPC4 data, we are not sure yet.

ujfalusi commented 10 months ago

@andyross, can you attach kernel and firmware logs with pointers where things go south? I'm not sure if I follow the description. If the sof-mtl-max98357a-rt5682-ssp2-ssp0 represents the Google AEC with 4 channels and it should not be like that then the issue is in topology?

pipelines are prepared in sequence which means that one side of the module likely will not prepared when it is prepared. I'm not sure what is the direction in this case.

The code for src/audio/google/google_rtc_audio_processing.c differs quite a bit between main, mtl-007-drop-stable and wherever https://github.com/thesofproject/sof/commit/bab124dda81aac088fa789a6e1167403b288df90 is.

lgirdwood commented 10 months ago

Adding @cujomalainey @johnylin76 since @andyross out till Friday.

cujomalainey commented 10 months ago

@lkoenig is the other person working on this whom might have logs.

plbossart commented 10 months ago

I am afraid there is nothing in the kernel that deals with the number of channels. All information on configurations are provided by topology and sent to firmware as is.

I do remember talking with @cujomalainey on the AEC/DMIC integration. The premise for DMICs since GLK is that we ALWAYS deal with 4 channels, and userspace/UCM extract the useful channels based on platform-specific channel maps.

But if the DMIC input is provided to AEC, then how would the firmware know which channels are to be dealt with? Conversely the echo reference is two channels, so this really begs the question on how the echo reference is mapped to which input channels for processing.

So yes there's clearly a 2:4 mismatch in the AEC integration...

lgirdwood commented 10 months ago

@marcinszkudlinski @plbossart it sounds like we could add a topology tuple(s) for AEC DMIC channel mapping. i.e. AEC echo ref is channels mask 0x3

andyross commented 10 months ago

To clarify: the DMIC source to AEC does indeed represent itself correctly as 4 channels. And that's indeed a problem that needs solving (right now AEC simply drops the second two). But at the source/sink metadata level, everything is correct. It's promising us 4x s16 @48kHz and delivering the expected 8 bytes per 1/48000 seconds.

This bug is that the reference input (demuxed from the playback pipeline by the copier) is also listing 4 channels, even though it's clearly only providing two. And AFAICT the topology shows two there. I don't know where the "4" is coming from.

cujomalainey commented 10 months ago

But if the DMIC input is provided to AEC, then how would the firmware know which channels are to be dealt with? Conversely the echo reference is two channels, so this really begs the question on how the echo reference is mapped to which input channels for processing.

@johnylin76 has been doing some excellent work in CRAS to enable easy dynamic configuration of SOF. The main target is DRC/EQ configs from CRAS format. That being said, it would not be hard to extend that to a channel map which we have on hand already and write that down to the firmware. We could even update it for WF vs UF mics.

andyross commented 10 months ago

@andyross, can you attach kernel and firmware logs with pointers where things go south?

Actually I can't! :) The symptom if you're missing the alignment_constants calls is just ring buffer madness downstream of the component. The only symptom is that it sounds like junk. The proximate cause there seems to be (from code reading, I won't have a chance to try this until I get back to the box on Friday night) just uninitialized data in the struct. The constructor for the audio_stream in the buffer never sets the values, leaving them either at the calloc() zeros or heap garbage, and neither is correct somehow. (The actual implementation of set_alignment_constants computes values from the arguments, it can't just be left at a zero). I'm hopeful that just making sure the initialization code calls this with reasonable values is going to be enough. Then we can go around and remove all the seemingly-stubbed calls to this function[1] from all the components I see have had them added. :)

The symptom from the incorrect channel count is just short bytes. The AEC falls behind trying to read 10ms of data from the reference pipe for every 10ms of capture, so it iterates only once in 20ms, producing a flood of errors at the upstream DAI and otherwise being easier to diagnose.

[1] Most of them pass "1" as the needed alignment, which I can guarantee is incorrect for existing formats as Xtensa won't do a misaligned 16/32 bit load except in a handful of instructions with special addressing.

andyross commented 10 months ago

it sounds like we could add a topology tuple(s) for AEC DMIC channel mapping. i.e. AEC echo ref is channels mask 0x3

That would work, and/or we could trivially do that internal to the component too with a control (somewhat oddly tuples seem harder to get to in component code than controls, or I just haven't found the right API yet).

I think the bigger question is policy: there's a software constraint (AEC can only process two channels) and a hardware behavior (there are four mic channels). We need a policy decision as to which channels to process and how to expose that to host code via PCMs. Bluntly: why are there four microphones and what are we supposed to do with them?

yongzhi1 commented 10 months ago

FYI, for this issue "the Google AEC component is being presented with a reference/playback input source that claims 4 channels, but in fact is only providing enough data for two."

Here is the orginal fix https://github.com/thesofproject/sof/pull/8452 and the channel work around was removed by https://github.com/thesofproject/sof/commit/e6a3538edbf71c6ad18b09b1c679cb2fe8f1c4a4 on mtl-007-drop-stable.

andyross commented 10 months ago

Here is the orginal fix https://github.com/thesofproject/sof/pull/8452 and the channel work around was removed by https://github.com/thesofproject/sof/commit/e6a3538edbf71c6ad18b09b1c679cb2fe8f1c4a4 on mtl-007-drop-stable.

That's just a(nother![1]) workaround at the component level though. The component isn't responsible for configuring its input sources, right? The copier/DAI for the reference input clearly is two channel, the sof_source created from it should show the same metadata. How do we fix this so the pipeline code does it correctly?

[1] I'll absolutely pull this in to the current workaround patch, obviously. It's cleaner than detecting the mismatch and calling source_set_channels(), as at least it's sourced from the module base_cfg (which I assume comes out of topology?)

andyross commented 10 months ago

That's just a(nother![1]) workaround at the component level though.

Just to restate my perspective here: the incorrect channel count and wrong alignment data are fields in a struct audio_stream stream inside a struct comp_buffer. Those objects are not "owned" by the component code (and can't be, since they connect two different components). They are created by the IPC/pipeline layer on the components' behalf, based on topology. They need to be initialized correctly.

andyross commented 10 months ago

I spent most of today reverse-engineering the changes in the most recent mtl-007-drop-stable and can duplicate results on my tree. Honestly, this bug has to stay open. Here's the isolated workaround patch on top of my temporary tree (I'll get it rebased on main, retested on mt8195, and submitted tomorrow): https://github.com/andyross/sof/commit/e3345b7a5924f2fab950ae6e1d668c20d1788593

First: please read the commit message (duplicated in comments) carefully. The details here do seem extremely complicated.

Second: please check my work. It's certainly possible I've missed something.

The root causes here were, in full:

  1. IPC4 doesn't initialize the audio_streams in comp_buffers completely. Where IPC3 components seem to have code to call init_alignment_constants() themselves, newer components don't. Really I think this is an old bug in audio_stream: the fields in question can't be set to zero, they need working defaults at creation time. I'll submit a patch.

  2. IPC4 doesn't set module stream data at prepare() time, or does it incompletely. Components are responsible for calling source/sink_update_audio_format() themselves.

  3. After initialization, IPC4 throws away (!) the data needed to determine IBS/OBS values on components with more than one source or sink. Components need to make a copy themselves if they want the pipeline setup to use it, which is... very weird. This breaks the AEC reference stream, which has half the channels of the capture stream and thus needs smaller buffers. This was the source of the persistent falling behind I reported in #8640

My position is that none of that code should be required to be present in a working component. Those tasks are all the job of the IPC/module/pipeline setup code. If I'm wrong, are any of those requirements documented? This all seems ridiculous to me, though the workarounds are real and do seem to work.

kv2019i commented 9 months ago

In addition to #8672 @andyross submitted, there's https://github.com/thesofproject/sof/pull/8575 to address point (3) in upstream.

andyross commented 9 months ago

Been working on that, actually. See #8685 for IMHO a much cleaner take on the "base_cfg_ext" problem that doesn't require module code do weird gymnastics managing memory in the core module layers. It is, admittedly, pretty opinionated. But it's a lot smaller, much simpler for component code, and as mentioned I do feel like we've gotten stuck with a really bad API paradigm here.

With that, the big workaround patch in the AEC series linked above is down to just one line that looks like:

#ifdef CONFIG_IPC_MAJOR_4
    ipc4_update_source_format(sources[0], &mod->priv.cfg.input_pins[1].audio_fmt);
#endif

And I'm pretty sure I see where the problem is there and hope to have a fix out tomorrow.

abonislawski commented 9 months ago

@andyross is this issue good to close now?

andyross commented 8 months ago

Let's leave this open for another week or two while I get the AEC rework rebased. The actual stream format for multi-output copier objects is still being set incorrectly, because the copier doesn't get the cfg_ext data that would contain the correct value, because it has its own config struct in the protocol where cfg_ext would go. It's a protocol bug basically, the topology isn't transmitted to the firmware in a single format or a single place, every component is rolling its own.

Basically I need to write up the protocol stuff in a new bug before closing this.