Open kv2019i opened 2 weeks ago
@wszypelt @lrudyX good to merge ?
@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.
SOFCI TEST
@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.
Lets run CI again to be sure.
@wszypelt @lgirdwood It seems the fail is a System.BadImageFormatException on TGL only, so could be an issue with test infra and not the PR. In general, this is a rather invasive change, so regressions are also possible.
Lets run CI again to be sure.
Results as expected so looks fine. @wszypelt @lrudyX good to merge (CI still pending) ?
@lgirdwood @kv2019i first there was a problem with one of the machines, after fix there is a issue IPC error 6 in updownmixer for PTL, I still need some time to check it out
According to @wszypelt findings
This change in fact is enabling HiFi preemption, developed by Zephyr some time ago. I was under impression it had been enabled in SOF right after Zephyr change was merged, but looks like it had not.
Such change at kernel level may of course cause a random and hard to track crashes because of races etc. and the result above suggest we do have such issue :(
Required test - start a pipeline with an LL module using HiFI and a DP module (on the same core. also using hifi) with a long processing time (10ms). This would enforce HiFI preemption. Run such test several times.
So at the moment DON"T MERGE IT, we need to prove that HiFI preemption is a problem and go back to Zephyr team for investigation
@rljordan @nashif fyi - looks like we need to deep dive Zephyr HiFi context switching as 1st step.
@kv2019i @marcinszkudlinski any updates ?
Have we tried rerunning CI and downgrade HIFI version to see if this changes result ? i.e it could be a HiFi4 only issue that would shine the light on any new Hifi4 logic/registers.
Also HiFi has audio processing instructions (which would probably glitch/distort) and also some general purpose instructions/registers (e.g. loops, loads, stores) that may more likely be the culprit here if we are seeing unexpected IPC results.
One other thing is that I dont think the HiFi version Kconfig has impact outside of modules today (since it only for intrinsics), i.e. general purpose code would still use HiFi version that's baked into CC.
@lgirdwood @kv2019i we DO need hifi preemption @wszypelt pls provide test
Debugging won't be easy :(
@kv2019i @marcinszkudlinski sure thing, we are during creating new test, but we still need some time
If data processing (DP) scheduler is enabled on Xtensa HiFi platforms, CONFIG_HIFI_SHARING should be enabled as well. Otherwise modules running via DP scheduler cannot use HiFi extensions (as the HiFi register state is not saved and restored on context switches).