thesofproject / sof

Sound Open Firmware

perf_cnt: improve avg/peak accuracy for component perf measurements #9664

Closed kv2019i closed 4 days ago

kv2019i commented 1 week ago

Build on top of #9661 and improve the PERFORMANCE_COUNTERS infrastructure:

Marking as draft as this requires #9661 to be merged first.

kv2019i commented 1 week ago

With all patches applied, results look like this:

[1539781.868583] <inf> component: comp_copy: comp:1 0x3 perf comp_copy samples 48 period 1000 cpu avg 1237 peak 1250 3
[1539781.868605] <inf> component: comp_copy: comp:1 0x10006 perf comp_copy samples 48 period 1000 cpu avg 1752 peak 2068 65
[1539781.868626] <inf> component: comp_copy: comp:1 0x10004 perf comp_copy samples 48 period 1000 cpu avg 5588 peak 5640 478
[1539781.869616] <inf> component: comp_copy: comp:0 0x4 perf comp_copy samples 48 period 1000 cpu avg 2903 peak 2968 221
[1539781.869628] <inf> component: comp_copy: comp:0 0x6 perf comp_copy samples 48 period 1000 cpu avg 1605 peak 1892 795
[1539781.869658] <inf> component: comp_copy: comp:0 0x2 perf comp_copy samples 48 period 1000 cpu avg 1996 peak 2070 1023

With logging overhead and HDA ISRs removed, peaks are now in hundreds of DSP cycles.
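
For anyone who wants to pull the numbers out of these lines by hand, a minimal sketch of a parser follows. It is not the code used by any SOF tool; the regular expression is derived only from the six sample lines above, and the group names are made up for illustration. In particular, `id_prefix` (the number after `comp:`) and `extra` (the trailing number) are placeholders, since their meaning is not stated in this thread.

```python
import re

# Pattern derived from the sample "perf comp_copy" lines above.
# Group names are hypothetical; "id_prefix" and "extra" are placeholders
# for fields whose meaning is not spelled out in this thread.
PERF_RE = re.compile(
    r"comp:(?P<id_prefix>\d+) (?P<comp>0x[0-9a-fA-F]+) perf comp_copy "
    r"samples (?P<samples>\d+) period (?P<period>\d+) "
    r"cpu avg (?P<avg>\d+) peak (?P<peak>\d+) (?P<extra>\d+)"
)


def parse_perf_line(line):
    """Return a dict of integer fields, or None if the line does not match."""
    m = PERF_RE.search(line)
    if m is None:
        return None
    # All fields are decimal except the component ID, which is hexadecimal.
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "comp"}
    fields["comp"] = int(m.group("comp"), 16)
    return fields


line = ("[1539781.868605] <inf> component: comp_copy: comp:1 0x10006 "
        "perf comp_copy samples 48 period 1000 cpu avg 1752 peak 2068 65")
rec = parse_perf_line(line)
print(f"{rec['id_prefix']}-0x{rec['comp']:06x} avg={rec['avg']} peak={rec['peak']}")
```

Lines that do not carry a perf record simply return `None`, so the function can be run over a whole log capture and the misses filtered out.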

lgirdwood commented 1 week ago

One thing: can we align the comp IDs so that every field lines up across lines? That makes the output easier to parse. Btw, why can't we print the module name for humans to read?

kv2019i commented 1 week ago

@lgirdwood wrote:

One thing: can we align the comp IDs so that every field lines up across lines? That makes the output easier to parse. Btw, why can't we print the module name for humans to read?

Probably best not to touch the formatting, as these lines are already parsed by sof-test/tools/sof_perf_analyzer.py, which also adds the module names (using info from the kernel log) and makes the output more human-readable. E.g.:

# run test case, capture kernel (dmesg.txt) and FW (mtrace.txt) logs
$ sof-test/tools/sof_perf_analyzer.py --kmsg dmesg.txt mtrace.txt
     COMP_ID                   COMP_NAME CPU_AVG(MIN) CPU_AVG(AVG)  \
0  0-0x000002                   mixin.0.1        1.991        1.996   
1  0-0x000004      host-copier.0.playback        2.903        2.904   
2  0-0x000006                    gain.0.1        1.605        1.615   
3  1-0x000003                  mixout.1.1        1.230        1.235   
4  1-0x010004  alh-copier.SDW0-Playback.0        5.586        5.588   
5  1-0x010006                    gain.1.1        1.746        1.761   

  CPU_AVG(MAX) CPU_PEAK(MIN) CPU_PEAK(AVG) CPU_PEAK(MAX) PEAK(MAX)/AVG(AVG)  \
0        2.004         2.062         2.068         2.070              1.037   
1        2.905         2.968         3.285         4.551              1.567   
2        1.655         1.884         2.410         4.444              2.751   
3        1.237         1.250         1.355         1.742              1.411   
4        5.592         5.624         5.647         5.692              1.019   
5        1.802         2.056         2.599         4.680              2.658   

   MODULE_CPC  
0        2993  
1        4356  
2        2422  
3        1852  
4        8381  
5        2640  
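
As a reading aid for the table above, the PEAK(MAX)/AVG(AVG) column looks like the plain ratio of the CPU_PEAK(MAX) and CPU_AVG(AVG) columns. The sketch below checks that reading against the printed rows. The function name is hypothetical and this is not taken from sof_perf_analyzer.py. Small differences in the last digit are expected because the inputs here are the already-rounded table values.

```python
def peak_to_avg(peak_max, avg_avg):
    """Ratio of the worst observed peak to the mean of the averages."""
    return peak_max / avg_avg


# (COMP_NAME, CPU_AVG(AVG), CPU_PEAK(MAX), printed ratio), copied from the table above.
rows = [
    ("mixin.0.1", 1.996, 2.070, 1.037),
    ("host-copier.0.playback", 2.904, 4.551, 1.567),
    ("gain.0.1", 1.615, 4.444, 2.751),
    ("mixout.1.1", 1.235, 1.742, 1.411),
    ("alh-copier.SDW0-Playback.0", 5.588, 5.692, 1.019),
    ("gain.1.1", 1.761, 4.680, 2.658),
]

for name, avg_avg, peak_max, printed in rows:
    ratio = peak_to_avg(peak_max, avg_avg)
    print(f"{name:28s} computed {ratio:.3f}  table {printed:.3f}")
```

A ratio close to 1 (the two copiers' neighbours aside, e.g. alh-copier at about 1.02) means the peak barely exceeds the average, while the gain components show peaks more than 2.6 times their average.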

kv2019i commented 1 week ago

https://github.com/thesofproject/sof/pull/9661 merged, rebased this PR, marking ready for review.