thesofproject / sof

Sound Open Firmware
Other
562 stars 319 forks source link

zephyr: enable HIFI_SHARING if DP scheduler is enabled with HIFI #9644

Open kv2019i opened 2 weeks ago

kv2019i commented 2 weeks ago

If data processing (DP) scheduler is enabled on Xtensa HiFi platforms, CONFIG_HIFI_SHARING should be enabled as well. Otherwise modules running via DP scheduler cannot use HiFi extensions (as the HiFi register state is not saved and restored on context switches).

lgirdwood commented 2 weeks ago

@wszypelt @lrudyX good to merge ?

kv2019i commented 2 weeks ago

@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.

lgirdwood commented 1 week ago

SOFCI TEST

lgirdwood commented 1 week ago

@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.

Lets run CI again to be sure.

lgirdwood commented 1 week ago

@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.

Lets run CI again to be sure.

Results as expected so looks fine. @wszypelt @lrudyX good to merge (CI still pending) ?

wszypelt commented 1 week ago

@lgirdwood @kv2019i first there was a problem with one of the machines, after fix there is a issue IPC error 6 in updownmixer for PTL, I still need some time to check it out

marcinszkudlinski commented 1 week ago

According to @wszypelt findings

This change in fact is enabling HiFi preemption, developed by Zephyr some time ago. I was under impression it had been enabled in SOF right after Zephyr change was merged, but looks like it had not.

Such change at kernel level may of course cause a random and hard to track crashes because of races etc. and the result above suggest we do have such issue :(

Required test - start a pipeline with an LL module using HiFI and a DP module (on the same core. also using hifi) with a long processing time (10ms). This would enforce HiFI preemption. Run such test several times.

So at the moment DON"T MERGE IT, we need to prove that HiFI preemption is a problem and go back to Zephyr team for investigation

lgirdwood commented 1 week ago

@rljordan @nashif fyi - looks like we need to deep dive Zephyr HiFi context switching as 1st step.

lgirdwood commented 3 days ago

@kv2019i @marcinszkudlinski any updates ?

Have we tried rerunning CI and downgrade HIFI version to see if this changes result ? i.e it could be a HiFi4 only issue that would shine the light on any new Hifi4 logic/registers.

Also HiFi has audio processing instructions (which would probably glitch/distort) and also some general purpose instructions/registers (e.g. loops, loads, stores) that may more likely be the culprit here if we are seeing unexpected IPC results.

One other thing is that I dont think the HiFi version Kconfig has impact outside of modules today (since it only for intrinsics), i.e. general purpose code would still use HiFi version that's baked into CC.

marcinszkudlinski commented 3 days ago

@lgirdwood @kv2019i we DO need hifi preemption @wszypelt pls provide test

Debugging won't be easy :(

wszypelt commented 2 days ago

@kv2019i @marcinszkudlinski sure thing, we are during creating new test, but we still need some time