joular / joularjx

JoularJX is a Java-based agent for software power monitoring at the source code level.
https://www.noureddine.org/research/joular/joularjx
GNU General Public License v3.0

Mac support: Program consumed 0 joules #54

Closed (alamers closed this issue 10 months ago)

alamers commented 10 months ago

Hi!

I was trying out JoularJX on a Spring Boot application on a MacBook Pro (Intel), and I can't seem to get it working. All measured values are 0.

I'm running the app as root, just to make sure I don't have any permission problems:

sudo java -javaagent:/Users/arjanl/tmp/joularjx/target/joularjx-2.8.1.jar --add-opens java.base/java.lang=ALL-UNNAMED -jar target/about-payments-web-6.0.0-SNAPSHOT.jar com.aboutpayments.AboutPaymentsApplication --spring.profiles.active=local

JoularJX seems to start up fine:

15/01/2024 10:21:02.449 - [INFO] - +---------------------------------+
15/01/2024 10:21:02.450 - [INFO] - | JoularJX Agent Version 2.8.1    |
15/01/2024 10:21:02.450 - [INFO] - +---------------------------------+
15/01/2024 10:21:02.474 - [INFO] - Results will be stored in joularjx-result/51590-1705310462470/
15/01/2024 10:21:02.486 - [INFO] - Initializing for platform: 'mac os x' running on architecture: 'x86_64'
15/01/2024 10:21:02.489 - [INFO] - Please wait while initializing JoularJX...
15/01/2024 10:21:03.543 - [INFO] - Initialization finished
15/01/2024 10:21:03.543 - [INFO] - Started monitoring application with ID 51590

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::               (v2.7.14)

2024-01-15 10:21:05,632 [JoularJX Agent Thread][:] INFO  c.a.AboutPaymentsApplication Starting AboutPaymentsApplication v6.0.0-SNAPSHOT using Java 17.0.7 on sune.local with PID 51590 (/Users/arjanl/workspaces/workspace-aboutpayments/about-payments/about-payments-web/target/about-payments-web-6.0.0-SNAPSHOT.jar started by root in /Users/arjanl/workspaces/workspace-aboutpayments/about-payments/about-payments-web)
2024-01-15 10:21:05,638 [JoularJX Agent Thread][:] DEBUG c.a.AboutPaymentsApplication Running with Spring Boot v2.7.14, Spring v5.3.29

But when I terminate the program:

15/01/2024 10:22:22.026 - [INFO] - JoularJX finished monitoring application with ID 51590
15/01/2024 10:22:22.026 - [INFO] - Program consumed 0 joules

Needless to say, all output files are empty or contain 0 values.

Am I missing something?

adelnoureddine commented 10 months ago

Hi, could you run the powermetrics command with sudo in a terminal and share its output?

alamers commented 10 months ago

Hi,

This is the output (on stdout) that I get with sudo powermetrics -n 1. (I've truncated the process list a bit for privacy reasons; let me know if that's a problem.)

Note that I do get a message on stderr: proc_pidpath 1460 failed(0), where 1460 is the PID of MS Teams (which I happen to have running).

Machine model: MacBookPro15,1
SMC version: Unknown
EFI version: 2020.1.0
OS version: 23C71
Boot arguments: 
Boot time: Thu Dec 21 08:07:54 2023

*** Sampled system activity (Mon Jan 15 12:58:51 2024 +0100) (5027.27ms elapsed) ***

*** Running tasks ***

Name                               ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)
idea                               68365  123.36    67.93  872.41  1.39               1047.41 554.43            
java                               25330  3.12      66.35  0.00    0.00               46.53   21.08             
ALL_TASKS                          -2     812.00    57.80  1191.10 141.23             3346.15 1359.78           

**** Battery and backlight usage ****

Battery: percent_charge: 6064
Backlight level: 894 (range 0-1024)
Keyboard Backlight level: 0 (off 0 on range 32-512)

**** Network activity ****

out: 7.16 packets/s, 609.87 bytes/s
in:  24.67 packets/s, 9028.55 bytes/s

**** Disk activity ****

read: 11.34 ops/s 247.69 KBytes/s
write: 48.54 ops/s 865.27 KBytes/s

****  Interrupt distribution ****

CPU 0:
    Vector 0x46(SMC): 18.90 interrupts/sec
    Vector 0x54(URT0): 7.56 interrupts/sec
    Vector 0x72(XHC1): 0.40 interrupts/sec
    Vector 0x78(XHC2): 55.30 interrupts/sec
    Vector 0x79(ANS2): 29.44 interrupts/sec
    Vector 0x7a(ARPT): 48.14 interrupts/sec
    Vector 0x7e(IGPU): 52.71 interrupts/sec
    Vector 0x8c(IOBC): 224.38 interrupts/sec
    Vector 0xd6(): 5.57 interrupts/sec
    Vector 0xdd(TMR): 775.37 interrupts/sec
    Vector 0xde(IPI): 194.14 interrupts/sec
    Vector 0xdf(PMI): 0.20 interrupts/sec
CPU 1:
    Vector 0xdd(TMR): 14.72 interrupts/sec
    Vector 0xde(IPI): 112.98 interrupts/sec
CPU 2:
    Vector 0xd6(): 5.97 interrupts/sec
    Vector 0xdd(TMR): 265.15 interrupts/sec
    Vector 0xde(IPI): 242.28 interrupts/sec
    Vector 0xdf(PMI): 0.20 interrupts/sec
CPU 3:
    Vector 0xd6(): 0.40 interrupts/sec
    Vector 0xdd(TMR): 12.13 interrupts/sec
    Vector 0xde(IPI): 86.93 interrupts/sec
CPU 4:
    Vector 0xd6(): 2.98 interrupts/sec
    Vector 0xdd(TMR): 189.17 interrupts/sec
    Vector 0xde(IPI): 144.21 interrupts/sec
CPU 5:
    Vector 0xd6(): 0.40 interrupts/sec
    Vector 0xdd(TMR): 8.95 interrupts/sec
    Vector 0xde(IPI): 106.82 interrupts/sec
CPU 6:
    Vector 0xd6(): 1.79 interrupts/sec
    Vector 0xdd(TMR): 120.94 interrupts/sec
    Vector 0xde(IPI): 110.60 interrupts/sec
CPU 7:
    Vector 0xd6(): 0.20 interrupts/sec
    Vector 0xdd(TMR): 12.33 interrupts/sec
    Vector 0xde(IPI): 57.09 interrupts/sec
CPU 8:
    Vector 0xd6(): 2.19 interrupts/sec
    Vector 0xdd(TMR): 99.86 interrupts/sec
    Vector 0xde(IPI): 110.99 interrupts/sec
CPU 9:
    Vector 0xdd(TMR): 8.75 interrupts/sec
    Vector 0xde(IPI): 52.12 interrupts/sec
CPU 10:
    Vector 0xd6(): 2.19 interrupts/sec
    Vector 0xdd(TMR): 71.41 interrupts/sec
    Vector 0xde(IPI): 83.35 interrupts/sec
CPU 11:
    Vector 0xdd(TMR): 8.75 interrupts/sec
    Vector 0xde(IPI): 29.84 interrupts/sec

**** Processor usage ****

Intel energy model derived package power (CPUs+GT+SA): 5.91W

LLC flushed residency: 38.3%

System Average frequency as fraction of nominal: 88.84% (2309.71 Mhz)
Package 0 C-state residency: 42.10% (C2: 28.46% C3: 13.64% C6: 0.00% C7: 0.00% C8: 0.00% C9: 0.00% C10: 0.00% )

Performance Limited Due to:
CPU LIMIT ICCMAX/PL4/OTHER
CPU LIMIT MAX_TURBO_LIMIT
CPU LIMIT TURBO_ATTENUATION
GPU LIMIT ICCMAX/PL4/OTHER
CPU/GPU Overlap: 3.08%
Cores Active: 49.90%
GPU Active: 5.28%
Avg Num of Cores Active: 0.93

Core 0 C-state residency: 65.69% (C3: 0.92% C6: 0.00% C7: 64.77% )

CPU 0 duty cycles/s: active/idle [< 16 us: 356.65/216.02] [< 32 us: 203.49/24.47] [< 64 us: 193.15/250.43] [< 128 us: 388.48/229.95] [< 256 us: 201.30/135.06] [< 512 us: 126.71/198.72] [< 1024 us: 37.79/213.44] [< 2048 us: 15.32/221.99] [< 4096 us: 10.34/42.77] [< 8192 us: 3.78/6.37] [< 16384 us: 1.79/0.00] [< 32768 us: 0.40/0.00] 
CPU Average frequency as fraction of nominal: 85.24% (2216.21 Mhz)

CPU 1 duty cycles/s: active/idle [< 16 us: 1611.21/176.24] [< 32 us: 19.10/130.89] [< 64 us: 21.88/206.87] [< 128 us: 20.29/245.26] [< 256 us: 5.77/193.15] [< 512 us: 0.20/185.59] [< 1024 us: 0.40/181.41] [< 2048 us: 0.00/252.82] [< 4096 us: 0.00/86.53] [< 8192 us: 0.00/18.10] [< 16384 us: 0.00/1.59] [< 32768 us: 0.00/0.40] 
CPU Average frequency as fraction of nominal: 83.49% (2170.86 Mhz)

Core 1 C-state residency: 74.55% (C3: 1.98% C6: 0.00% C7: 72.57% )

CPU 2 duty cycles/s: active/idle [< 16 us: 608.88/294.99] [< 32 us: 227.96/26.85] [< 64 us: 241.08/230.34] [< 128 us: 266.35/255.61] [< 256 us: 136.26/158.54] [< 512 us: 69.02/142.22] [< 1024 us: 24.27/177.23] [< 2048 us: 9.75/226.56] [< 4096 us: 5.17/68.03] [< 8192 us: 3.98/13.53] [< 16384 us: 0.99/0.40] [< 32768 us: 0.40/0.00] 
CPU Average frequency as fraction of nominal: 88.04% (2289.01 Mhz)

CPU 3 duty cycles/s: active/idle [< 16 us: 1381.07/277.29] [< 32 us: 16.31/154.76] [< 64 us: 18.90/173.85] [< 128 us: 18.90/164.70] [< 256 us: 5.57/132.88] [< 512 us: 0.40/117.16] [< 1024 us: 0.40/115.37] [< 2048 us: 0.00/145.21] [< 4096 us: 0.20/110.99] [< 8192 us: 0.00/46.55] [< 16384 us: 0.00/2.98] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 90.39% (2350.02 Mhz)

Core 2 C-state residency: 80.18% (C3: 3.05% C6: 0.00% C7: 77.13% )

CPU 4 duty cycles/s: active/idle [< 16 us: 721.66/268.54] [< 32 us: 174.65/48.34] [< 64 us: 192.15/201.70] [< 128 us: 177.83/241.88] [< 256 us: 125.12/153.76] [< 512 us: 56.69/118.75] [< 1024 us: 17.70/149.78] [< 2048 us: 6.96/176.24] [< 4096 us: 2.19/92.69] [< 8192 us: 3.38/26.26] [< 16384 us: 0.20/0.80] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 90.14% (2343.72 Mhz)

CPU 5 duty cycles/s: active/idle [< 16 us: 1435.97/277.69] [< 32 us: 17.50/162.12] [< 64 us: 18.30/167.09] [< 128 us: 15.71/174.85] [< 256 us: 5.37/149.39] [< 512 us: 0.80/120.34] [< 1024 us: 0.40/123.73] [< 2048 us: 0.20/171.07] [< 4096 us: 0.20/102.64] [< 8192 us: 0.00/42.17] [< 16384 us: 0.00/3.38] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 94.98% (2469.40 Mhz)

Core 3 C-state residency: 83.67% (C3: 1.34% C6: 0.00% C7: 82.33% )

CPU 6 duty cycles/s: active/idle [< 16 us: 659.60/194.54] [< 32 us: 99.66/36.40] [< 64 us: 141.03/172.46] [< 128 us: 115.57/176.44] [< 256 us: 98.86/119.95] [< 512 us: 47.14/95.28] [< 1024 us: 14.32/109.60] [< 2048 us: 5.17/139.24] [< 4096 us: 2.59/102.84] [< 8192 us: 3.18/38.79] [< 16384 us: 0.40/1.99] [< 32768 us: 0.20/0.00] 
CPU Average frequency as fraction of nominal: 88.77% (2307.98 Mhz)

CPU 7 duty cycles/s: active/idle [< 16 us: 310.90/41.97] [< 32 us: 12.73/17.90] [< 64 us: 15.32/26.06] [< 128 us: 12.13/40.78] [< 256 us: 3.78/37.79] [< 512 us: 1.19/25.26] [< 1024 us: 0.60/23.87] [< 2048 us: 0.00/30.63] [< 4096 us: 0.00/32.82] [< 8192 us: 0.00/41.18] [< 16384 us: 0.00/28.05] [< 32768 us: 0.20/9.75] 
CPU Average frequency as fraction of nominal: 89.25% (2320.50 Mhz)

Core 4 C-state residency: 89.04% (C3: 0.46% C6: 0.00% C7: 88.57% )

CPU 8 duty cycles/s: active/idle [< 16 us: 431.65/136.26] [< 32 us: 79.76/10.14] [< 64 us: 102.84/95.88] [< 128 us: 96.47/118.75] [< 256 us: 64.05/78.37] [< 512 us: 28.05/57.69] [< 1024 us: 12.93/74.79] [< 2048 us: 3.38/98.86] [< 4096 us: 1.59/86.33] [< 8192 us: 1.79/56.89] [< 16384 us: 0.00/7.76] [< 32768 us: 0.20/0.80] 
CPU Average frequency as fraction of nominal: 91.12% (2369.08 Mhz)

CPU 9 duty cycles/s: active/idle [< 16 us: 651.25/104.63] [< 32 us: 9.75/84.74] [< 64 us: 14.32/90.31] [< 128 us: 13.33/81.95] [< 256 us: 3.98/66.04] [< 512 us: 0.60/43.16] [< 1024 us: 0.80/39.19] [< 2048 us: 0.00/52.51] [< 4096 us: 0.00/52.51] [< 8192 us: 0.00/49.93] [< 16384 us: 0.00/22.88] [< 32768 us: 0.00/5.77] 
CPU Average frequency as fraction of nominal: 90.66% (2357.23 Mhz)

Core 5 C-state residency: 90.42% (C3: 0.19% C6: 0.00% C7: 90.22% )

CPU 10 duty cycles/s: active/idle [< 16 us: 392.26/123.53] [< 32 us: 50.13/28.84] [< 64 us: 77.58/82.15] [< 128 us: 71.01/87.12] [< 256 us: 46.55/46.94] [< 512 us: 24.67/41.77] [< 1024 us: 9.95/50.13] [< 2048 us: 4.58/75.59] [< 4096 us: 0.99/69.82] [< 8192 us: 1.19/54.90] [< 16384 us: 0.20/16.71] [< 32768 us: 0.20/1.99] 
CPU Average frequency as fraction of nominal: 99.29% (2581.63 Mhz)

CPU 11 duty cycles/s: active/idle [< 16 us: 49.73/3.38] [< 32 us: 8.16/1.19] [< 64 us: 12.53/5.57] [< 128 us: 8.55/5.37] [< 256 us: 2.78/3.78] [< 512 us: 0.60/3.18] [< 1024 us: 0.40/3.98] [< 2048 us: 0.20/7.56] [< 4096 us: 0.00/6.96] [< 8192 us: 0.00/8.75] [< 16384 us: 0.00/11.54] [< 32768 us: 0.00/12.13] 
CPU Average frequency as fraction of nominal: 92.18% (2396.78 Mhz)

**** GPU usage ****

GPU 0 name IntelIG
GPU 0 C-state residency: 95.27% (0.66%, 94.60%)
GPU 0 P-state residency: 1150MHz: 0.00%, 1100MHz: 0.00%, 1050MHz: 0.00%, 1000MHz: 0.00%, 950MHz: 0.00%, 900MHz: 0.00%, 850MHz: 0.00%, 800MHz: 0.00%, 750MHz: 0.00%, 700MHz: 0.00%, 650MHz: 0.00%, 600MHz: 0.00%, 550MHz: 0.00%, 500MHz: 0.00%, 450MHz: 0.00%, 400MHz: 0.00%, 350MHz: 4.73%
GPU 0 average active frequency as fraction of nominal (350.00Mhz): 100.00% (350.00Mhz)
GPU 0 HW average active frequency   : 0.00%
GPU 0 GPU Busy                      : 4.73%
GPU 0 DC6 Residency                 : 0.00%
GPU 0 [PSR] GPU + TCON are Off      : 0.00%
GPU 0 [PSR] Only GPU is On          : 100.00%
GPU 0 [PSR] Only TCON is On         : 0.00%
GPU 0 [PSR] GPU + TCON are On       : 0.00%
GPU 0 [PSR] StateMachine Bypass     : 100.00%
GPU 0 [PSR] StateMachine FIFO       : 0.00%
GPU 0 [PSR] StateMachine Others     : 0.00%
GPU 0 DPB Strong On                 : 0.00%
GPU 0 DPB Weak On                   : 0.00%
GPU 0 PPFM on                       : 0.00%
GPU 0 Throttle High Priority(%): 0
GPU 0 Throttle NormalHi Priority(%): 0
GPU 0 Throttle Normal Priority(%): 0
GPU 0 Throttle Low Priority(%): 0
GPU 0 Slice switch                  : 0 (0.00/second)
GPU 0 DC6 Exit Reason - Flip: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Register: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Gamma: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Interrupt: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Cursor: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Render: 0 (0.00/second)
GPU 0 [INT] VBLANK_A           : 0 (0.00/second)
GPU 0 [INT] VBLANK_B           : 0 (0.00/second)
GPU 0 [INT] VBLANK_C           : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_A     : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_B     : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_C     : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_A      : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_B      : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_C      : 0 (0.00/second)
GPU 0 [INT] VIDEO_USER_1       : 0 (0.00/second)
GPU 0 [INT] VIDEO_USER_2       : 0 (0.00/second)
GPU 0 [INT] VEBOX_USER         : 0 (0.00/second)
GPU 0 [INT] RENDOR_USER        : 0 (0.00/second)
GPU 0 [INT] BLITTER_USER       : 0 (0.00/second)
GPU 0 [INT] GPU_PARSER         : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_A       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_B       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_C       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_D       : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_A    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_B    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_C    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_D    : 0 (0.00/second)
GPU 0 [INT] UP_THRESHOLD       : 0 (0.00/second)
GPU 0 [INT] DOWN_THRESHOLD     : 0 (0.00/second)
GPU 0 [INT] SRD_INTERRUPT      : 0 (0.00/second)
GPU 0 [INT] PSR_EXIT_TRIGGER   : 0 (0.00/second)
GPU 0 [INT] BCS_CONTEXT_SWITCH : 0 (0.00/second)
GPU 0 [INT] CS_CONTEXT_SWITCH  : 0 (0.00/second)
GPU 0 [INT] VCS1_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] VCS2_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] VECS_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] PIPEA_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] PIPEB_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] PIPEC_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] RCS_ERROR          : 0 (0.00/second)
GPU 0 [INT] BCS_ERROR          : 0 (0.00/second)
GPU 0 [INT] VCS1_ERROR         : 0 (0.00/second)
GPU 0 [INT] VCS2_ERROR         : 0 (0.00/second)
GPU 0 [INT] VECS_ERROR         : 0 (0.00/second)
GPU 0 [INT] RCS_FLUSH_NOTIFY   : 107 (21.28/second)
GPU 0 [INT] BCS_FLUSH_NOTIFY   : 126 (25.06/second)
GPU 0 [INT] VCS1_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] VCS2_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] VECS_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] GUC_TO_HOST_INT    : 0 (0.00/second)
GPU 0 [INT] SW_INT_6           : 31 (6.16/second)
GPU 0 FB Test Case 0

**** AGPM Stats ****

Accelerator Type: IntelAccelerator
Plimit: 0.00 (GPU Frequency headroom limited on average by 0 MHz)

Accelerator Type: AMDRadeonX4000_AMDBaffinGraphicsAccelerator
Plimit: 0.00 (GPU Frequency headroom N/A)

**** SMC sensors ****

CPU Thermal level: 0
GPU Thermal level: 0
IO Thermal level: 0
Fan: 2006.01 rpm
CPU die temperature: 58.15 C
GPU die temperature: 51.00 C
CPU Plimit: 0.00
GPU Plimit (Int): 0.00 
GPU2 Plimit (Ext1): 0.00 
Number of prochots: 0

adelnoureddine commented 10 months ago

Thanks. Could you run this command instead? sudo powermetrics --samplers cpu_power -i 1000 -n 1

adelnoureddine commented 10 months ago

On recent ARM Macs, we usually look for the CPU, GPU, or combined power lines, which I don't see in the output you gave. I guess the output is slightly different on Intel Macs, since I see "Intel energy model derived package power" instead. In that case, JoularJX needs to search for and read that line.
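
For illustration only, here is a minimal sketch of what reading that line could look like. It assumes the Intel line keeps the exact format seen in the output above ("Intel energy model derived package power (CPUs+GT+SA): 5.91W"). The class and method names are hypothetical, and this is not JoularJX's actual implementation.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JoularJX: extracts the package power
// reported by powermetrics on an Intel Mac.
public class IntelPackagePowerParser {

    // Assumed line format, copied from the powermetrics output in this thread:
    //   Intel energy model derived package power (CPUs+GT+SA): 5.91W
    private static final Pattern PACKAGE_POWER = Pattern.compile(
            "Intel energy model derived package power \\(CPUs\\+GT\\+SA\\):\\s*([0-9]+(?:\\.[0-9]+)?)W");

    // Returns the package power in watts from the first matching line,
    // or an empty value if no line matches.
    public static OptionalDouble parsePackagePowerWatts(List<String> lines) {
        for (String line : lines) {
            Matcher m = PACKAGE_POWER.matcher(line.trim());
            if (m.matches()) {
                return OptionalDouble.of(Double.parseDouble(m.group(1)));
            }
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Sample lines taken from the powermetrics output posted in this issue.
        List<String> sample = List.of(
                "**** Processor usage ****",
                "",
                "Intel energy model derived package power (CPUs+GT+SA): 5.91W",
                "",
                "LLC flushed residency: 38.3%");

        OptionalDouble watts = parsePackagePowerWatts(sample);
        System.out.println(watts.isPresent()
                ? "Package power: " + watts.getAsDouble() + " W"
                : "No Intel package power line found");
    }
}
```

Returning an empty value when the line is missing, rather than 0, would let the caller tell "no power line found" apart from a real zero reading, which is the symptom reported here.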

alamers commented 10 months ago

I ran: sudo powermetrics --samplers cpu_power -i 1000 -n 1:

Machine model: MacBookPro15,1
SMC version: Unknown
EFI version: 2020.1.0
OS version: 23C71
Boot arguments: 
Boot time: Thu Dec 21 08:07:54 2023

*** Sampled system activity (Mon Jan 15 18:21:32 2024 +0100) (1002.38ms elapsed) ***

**** Processor usage ****

Intel energy model derived package power (CPUs+GT+SA): 8.53W

LLC flushed residency: 26.3%

System Average frequency as fraction of nominal: 96.10% (2498.72 Mhz)
Package 0 C-state residency: 32.95% (C2: 24.47% C3: 8.48% C6: 0.00% C7: 0.00% C8: 0.00% C9: 0.00% C10: 0.00% )

Performance Limited Due to:
CPU LIMIT MAX_TURBO_LIMIT
CPU LIMIT TURBO_ATTENUATION
CPU/GPU Overlap: 6.67%
Cores Active: 57.32%
GPU Active: 8.24%
Avg Num of Cores Active: 1.17

Core 0 C-state residency: 56.98% (C3: 0.77% C6: 0.00% C7: 56.21% )

CPU 0 duty cycles/s: active/idle [< 16 us: 525.75/399.05] [< 32 us: 362.14/40.90] [< 64 us: 320.24/389.07] [< 128 us: 425.99/318.24] [< 256 us: 185.56/189.55] [< 512 us: 113.73/264.37] [< 1024 us: 52.87/228.46] [< 2048 us: 28.93/190.55] [< 4096 us: 20.95/23.94] [< 8192 us: 6.98/1.00] [< 16384 us: 1.00/0.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 92.36% (2401.36 Mhz)

CPU 1 duty cycles/s: active/idle [< 16 us: 2227.70/424.99] [< 32 us: 21.95/207.51] [< 64 us: 11.97/274.35] [< 128 us: 6.98/318.24] [< 256 us: 0.00/255.39] [< 512 us: 0.00/239.43] [< 1024 us: 0.00/209.50] [< 2048 us: 0.00/243.42] [< 4096 us: 0.00/78.81] [< 8192 us: 0.00/14.96] [< 16384 us: 0.00/2.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 92.70% (2410.30 Mhz)

Core 1 C-state residency: 69.86% (C3: 3.76% C6: 0.00% C7: 66.10% )

CPU 2 duty cycles/s: active/idle [< 16 us: 896.87/214.49] [< 32 us: 185.56/23.94] [< 64 us: 279.34/363.14] [< 128 us: 228.46/413.02] [< 256 us: 151.64/227.46] [< 512 us: 93.78/206.51] [< 1024 us: 34.92/212.49] [< 2048 us: 14.96/170.59] [< 4096 us: 12.97/63.85] [< 8192 us: 5.99/7.98] [< 16384 us: 0.00/0.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 97.43% (2533.19 Mhz)

CPU 3 duty cycles/s: active/idle [< 16 us: 1571.26/187.55] [< 32 us: 36.91/107.74] [< 64 us: 21.95/204.51] [< 128 us: 9.98/237.43] [< 256 us: 2.00/226.46] [< 512 us: 0.00/214.49] [< 1024 us: 0.00/169.60] [< 2048 us: 0.00/154.63] [< 4096 us: 0.00/105.75] [< 8192 us: 0.00/31.92] [< 16384 us: 0.00/2.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 92.97% (2417.26 Mhz)

Core 2 C-state residency: 72.83% (C3: 9.76% C6: 0.00% C7: 63.08% )

CPU 4 duty cycles/s: active/idle [< 16 us: 1075.44/409.03] [< 32 us: 245.42/35.91] [< 64 us: 254.39/378.10] [< 128 us: 206.51/374.11] [< 256 us: 132.68/240.43] [< 512 us: 86.79/196.53] [< 1024 us: 40.90/166.60] [< 2048 us: 11.97/181.57] [< 4096 us: 6.98/62.85] [< 8192 us: 2.99/18.95] [< 16384 us: 0.00/0.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 95.96% (2495.08 Mhz)

CPU 5 duty cycles/s: active/idle [< 16 us: 2164.85/478.86] [< 32 us: 18.95/251.40] [< 64 us: 21.95/276.34] [< 128 us: 5.99/252.40] [< 256 us: 2.99/248.41] [< 512 us: 1.00/217.48] [< 1024 us: 1.00/198.53] [< 2048 us: 0.00/165.61] [< 4096 us: 0.00/95.77] [< 8192 us: 0.00/29.93] [< 16384 us: 0.00/2.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 96.28% (2503.22 Mhz)

Core 3 C-state residency: 78.20% (C3: 12.12% C6: 0.00% C7: 66.08% )

CPU 6 duty cycles/s: active/idle [< 16 us: 1069.45/235.44] [< 32 us: 138.67/112.73] [< 64 us: 203.52/335.20] [< 128 us: 156.63/311.26] [< 256 us: 103.75/192.54] [< 512 us: 52.87/160.62] [< 1024 us: 24.94/173.59] [< 2048 us: 10.97/141.66] [< 4096 us: 6.98/79.81] [< 8192 us: 2.99/27.93] [< 16384 us: 1.00/1.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 98.06% (2549.61 Mhz)

CPU 7 duty cycles/s: active/idle [< 16 us: 484.85/25.94] [< 32 us: 19.95/26.94] [< 64 us: 17.96/49.88] [< 128 us: 6.98/51.88] [< 256 us: 2.00/62.85] [< 512 us: 1.00/65.84] [< 1024 us: 0.00/59.86] [< 2048 us: 0.00/49.88] [< 4096 us: 0.00/62.85] [< 8192 us: 0.00/47.89] [< 16384 us: 0.00/25.94] [< 32768 us: 0.00/2.99] 
CPU Average frequency as fraction of nominal: 95.57% (2484.86 Mhz)

Core 4 C-state residency: 84.42% (C3: 5.33% C6: 0.00% C7: 79.09% )

CPU 8 duty cycles/s: active/idle [< 16 us: 785.13/145.65] [< 32 us: 112.73/11.97] [< 64 us: 146.65/202.52] [< 128 us: 123.71/274.35] [< 256 us: 78.81/161.62] [< 512 us: 53.87/139.67] [< 1024 us: 27.93/138.67] [< 2048 us: 5.99/135.68] [< 4096 us: 1.00/90.78] [< 8192 us: 2.00/32.92] [< 16384 us: 0.00/4.99] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 95.59% (2485.23 Mhz)

CPU 9 duty cycles/s: active/idle [< 16 us: 947.74/86.79] [< 32 us: 16.96/82.80] [< 64 us: 14.96/135.68] [< 128 us: 13.97/124.70] [< 256 us: 2.99/124.70] [< 512 us: 0.00/97.77] [< 1024 us: 0.00/106.75] [< 2048 us: 0.00/96.77] [< 4096 us: 0.00/67.84] [< 8192 us: 0.00/54.87] [< 16384 us: 0.00/16.96] [< 32768 us: 0.00/1.00] 
CPU Average frequency as fraction of nominal: 94.94% (2468.49 Mhz)

Core 5 C-state residency: 85.97% (C3: 2.32% C6: 0.00% C7: 83.65% )

CPU 10 duty cycles/s: active/idle [< 16 us: 679.38/94.77] [< 32 us: 61.85/3.99] [< 64 us: 117.72/141.66] [< 128 us: 88.79/205.51] [< 256 us: 46.89/116.72] [< 512 us: 30.93/126.70] [< 1024 us: 13.97/110.74] [< 2048 us: 7.98/119.72] [< 4096 us: 2.00/80.81] [< 8192 us: 3.99/48.88] [< 16384 us: 1.00/3.99] [< 32768 us: 0.00/1.00] 
CPU Average frequency as fraction of nominal: 103.96% (2702.93 Mhz)

CPU 11 duty cycles/s: active/idle [< 16 us: 114.73/6.98] [< 32 us: 13.97/2.00] [< 64 us: 12.97/7.98] [< 128 us: 9.98/16.96] [< 256 us: 4.99/13.97] [< 512 us: 1.00/5.99] [< 1024 us: 0.00/12.97] [< 2048 us: 0.00/9.98] [< 4096 us: 0.00/17.96] [< 8192 us: 0.00/23.94] [< 16384 us: 0.00/21.95] [< 32768 us: 0.00/11.97] 
CPU Average frequency as fraction of nominal: 96.93% (2520.05 Mhz)

Hope it helps, and let me know if I can help :)

alamers commented 10 months ago

I think that output was already in the first run as well, so just to be complete, I ran sudo powermetrics --samplers all -i 1000 -n 1:

Machine model: MacBookPro15,1
SMC version: Unknown
EFI version: 2020.1.0
OS version: 23C71
Boot arguments: 
Boot time: Thu Dec 21 08:07:54 2023

proc_pidpath 1460 failed(0)
proc_pidpath 1460 failed(0)

*** Sampled system activity (Mon Jan 15 18:30:35 2024 +0100) (1027.49ms elapsed) ***

*** Running tasks ***

Name                               ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)
WindowServer                       563    286.11    76.20  39.88   0.97               109.92  75.87             
idea                               68365  189.70    70.62  890.04  1.95               1095.28 489.28            
ALL_TASKS                          -2     911.59    59.39  1105.61 55.48              3374.26 1341.14           

**** Battery and backlight usage ****

Battery: percent_charge: 6325
Backlight level: 910 (range 0-1024)
Keyboard Backlight level: 104 (off 0 on range 32-512)

**** Network activity ****

out: 3.89 packets/s, 375.67 bytes/s
in:  22.38 packets/s, 9496.97 bytes/s

**** Disk activity ****

read: 0.00 ops/s 0.00 KBytes/s
write: 0.97 ops/s 310.94 KBytes/s

****  Interrupt distribution ****

CPU 0:
    Vector 0x46(SMC): 21.41 interrupts/sec
    Vector 0x54(URT0): 0.97 interrupts/sec
    Vector 0x72(XHC1): 0.97 interrupts/sec
    Vector 0x78(XHC2): 60.34 interrupts/sec
    Vector 0x79(ANS2): 3.89 interrupts/sec
    Vector 0x7a(ARPT): 59.37 interrupts/sec
    Vector 0x7e(IGPU): 196.60 interrupts/sec
    Vector 0x8c(IOBC): 171.29 interrupts/sec
    Vector 0xd6(): 8.76 interrupts/sec
    Vector 0xdd(TMR): 830.18 interrupts/sec
    Vector 0xde(IPI): 347.45 interrupts/sec
CPU 1:
    Vector 0xdd(TMR): 1.95 interrupts/sec
    Vector 0xde(IPI): 74.94 interrupts/sec
CPU 2:
    Vector 0xd6(): 1.95 interrupts/sec
    Vector 0xdd(TMR): 252.07 interrupts/sec
    Vector 0xde(IPI): 327.01 interrupts/sec
CPU 3:
    Vector 0xd6(): 0.97 interrupts/sec
    Vector 0xdd(TMR): 1.95 interrupts/sec
    Vector 0xde(IPI): 92.46 interrupts/sec
CPU 4:
    Vector 0xd6(): 3.89 interrupts/sec
    Vector 0xdd(TMR): 189.78 interrupts/sec
    Vector 0xde(IPI): 320.20 interrupts/sec
CPU 5:
    Vector 0xdd(TMR): 8.76 interrupts/sec
    Vector 0xde(IPI): 86.62 interrupts/sec
CPU 6:
    Vector 0xd6(): 0.97 interrupts/sec
    Vector 0xdd(TMR): 179.08 interrupts/sec
    Vector 0xde(IPI): 239.42 interrupts/sec
CPU 7:
    Vector 0xdd(TMR): 0.97 interrupts/sec
    Vector 0xde(IPI): 51.58 interrupts/sec
CPU 8:
    Vector 0xd6(): 1.95 interrupts/sec
    Vector 0xdd(TMR): 91.49 interrupts/sec
    Vector 0xde(IPI): 175.18 interrupts/sec
CPU 9:
    Vector 0xd6(): 1.95 interrupts/sec
    Vector 0xdd(TMR): 7.79 interrupts/sec
    Vector 0xde(IPI): 77.86 interrupts/sec
CPU 10:
    Vector 0xd6(): 1.95 interrupts/sec
    Vector 0xdd(TMR): 68.13 interrupts/sec
    Vector 0xde(IPI): 131.39 interrupts/sec
CPU 11:
    Vector 0xdd(TMR): 5.84 interrupts/sec
    Vector 0xde(IPI): 38.93 interrupts/sec

**** Processor usage ****

Intel energy model derived package power (CPUs+GT+SA): 7.38W

LLC flushed residency: 23.7%

System Average frequency as fraction of nominal: 87.69% (2279.94 Mhz)
Package 0 C-state residency: 35.85% (C2: 25.40% C3: 10.45% C6: 0.00% C7: 0.00% C8: 0.00% C9: 0.00% C10: 0.00% )
CPU/GPU Overlap: 14.04%
Cores Active: 53.45%
GPU Active: 17.07%
Avg Num of Cores Active: 1.08

Core 0 C-state residency: 57.72% (C3: 2.11% C6: 0.00% C7: 55.61% )

CPU 0 duty cycles/s: active/idle [< 16 us: 462.29/296.84] [< 32 us: 217.03/27.25] [< 64 us: 274.46/330.90] [< 128 us: 351.34/260.83] [< 256 us: 190.76/177.13] [< 512 us: 109.00/224.82] [< 1024 us: 54.50/173.24] [< 2048 us: 17.52/174.21] [< 4096 us: 19.46/41.85] [< 8192 us: 14.60/4.87] [< 16384 us: 0.00/0.00] [< 32768 us: 0.97/0.00] 
CPU Average frequency as fraction of nominal: 87.26% (2268.76 Mhz)

CPU 1 duty cycles/s: active/idle [< 16 us: 1890.05/327.01] [< 32 us: 19.46/123.60] [< 64 us: 12.65/243.31] [< 128 us: 3.89/299.76] [< 256 us: 0.00/217.03] [< 512 us: 1.95/213.14] [< 1024 us: 0.00/178.10] [< 2048 us: 0.00/202.44] [< 4096 us: 0.00/99.27] [< 8192 us: 0.00/22.38] [< 16384 us: 0.00/0.97] [< 32768 us: 0.00/0.97] 
CPU Average frequency as fraction of nominal: 86.51% (2249.25 Mhz)

Core 1 C-state residency: 72.79% (C3: 17.11% C6: 0.00% C7: 55.67% )

CPU 2 duty cycles/s: active/idle [< 16 us: 961.57/283.22] [< 32 us: 271.54/24.33] [< 64 us: 323.12/451.59] [< 128 us: 308.52/437.96] [< 256 us: 175.18/250.13] [< 512 us: 86.62/279.32] [< 1024 us: 40.88/200.49] [< 2048 us: 9.73/176.16] [< 4096 us: 5.84/71.05] [< 8192 us: 0.00/9.73] [< 16384 us: 0.00/0.00] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 85.99% (2235.64 Mhz)

CPU 3 duty cycles/s: active/idle [< 16 us: 1862.80/251.10] [< 32 us: 32.12/161.56] [< 64 us: 16.55/275.43] [< 128 us: 6.81/286.14] [< 256 us: 3.89/255.96] [< 512 us: 0.97/246.23] [< 1024 us: 0.00/176.16] [< 2048 us: 0.00/135.28] [< 4096 us: 0.00/95.38] [< 8192 us: 0.00/37.96] [< 16384 us: 0.00/1.95] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 85.73% (2228.87 Mhz)

Core 2 C-state residency: 73.03% (C3: 12.38% C6: 0.00% C7: 60.64% )

CPU 4 duty cycles/s: active/idle [< 16 us: 958.65/256.94] [< 32 us: 203.41/45.74] [< 64 us: 287.11/346.48] [< 128 us: 235.53/398.06] [< 256 us: 148.91/256.94] [< 512 us: 72.99/223.85] [< 1024 us: 31.14/160.59] [< 2048 us: 1.95/171.29] [< 4096 us: 2.92/72.99] [< 8192 us: 4.87/15.57] [< 16384 us: 1.95/0.97] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 89.56% (2328.50 Mhz)

CPU 5 duty cycles/s: active/idle [< 16 us: 1963.04/268.62] [< 32 us: 36.01/204.38] [< 64 us: 14.60/271.54] [< 128 us: 3.89/313.39] [< 256 us: 3.89/267.64] [< 512 us: 0.97/228.71] [< 1024 us: 0.00/189.78] [< 2048 us: 0.00/158.64] [< 4096 us: 0.00/82.73] [< 8192 us: 0.00/34.06] [< 16384 us: 0.00/2.92] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 85.64% (2226.74 Mhz)

Core 3 C-state residency: 79.39% (C3: 11.61% C6: 0.00% C7: 67.78% )

CPU 6 duty cycles/s: active/idle [< 16 us: 960.60/288.08] [< 32 us: 191.73/42.82] [< 64 us: 240.39/317.28] [< 128 us: 182.97/324.09] [< 256 us: 109.00/204.38] [< 512 us: 76.89/194.65] [< 1024 us: 24.33/156.69] [< 2048 us: 5.84/157.67] [< 4096 us: 4.87/90.51] [< 8192 us: 0.97/20.44] [< 16384 us: 0.00/1.95] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 85.82% (2231.39 Mhz)

CPU 7 duty cycles/s: active/idle [< 16 us: 479.81/41.85] [< 32 us: 18.49/30.17] [< 64 us: 8.76/49.64] [< 128 us: 3.89/55.48] [< 256 us: 0.97/42.82] [< 512 us: 0.97/55.48] [< 1024 us: 0.00/52.56] [< 2048 us: 0.00/49.64] [< 4096 us: 0.00/62.29] [< 8192 us: 0.00/43.80] [< 16384 us: 0.00/22.38] [< 32768 us: 0.00/6.81] 
CPU Average frequency as fraction of nominal: 85.91% (2233.62 Mhz)

Core 4 C-state residency: 86.71% (C3: 2.65% C6: 0.00% C7: 84.07% )

CPU 8 duty cycles/s: active/idle [< 16 us: 715.34/115.82] [< 32 us: 93.43/7.79] [< 64 us: 145.99/171.29] [< 128 us: 124.58/227.74] [< 256 us: 72.02/152.80] [< 512 us: 49.64/153.77] [< 1024 us: 25.30/129.44] [< 2048 us: 0.00/127.50] [< 4096 us: 0.97/95.38] [< 8192 us: 0.00/43.80] [< 16384 us: 0.00/2.92] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 88.70% (2306.31 Mhz)

CPU 9 duty cycles/s: active/idle [< 16 us: 767.89/64.23] [< 32 us: 16.55/70.07] [< 64 us: 21.41/86.62] [< 128 us: 4.87/121.66] [< 256 us: 0.97/94.41] [< 512 us: 1.95/82.73] [< 1024 us: 0.97/83.70] [< 2048 us: 0.00/73.97] [< 4096 us: 0.97/64.23] [< 8192 us: 0.00/49.64] [< 16384 us: 0.00/21.41] [< 32768 us: 0.00/2.92] 
CPU Average frequency as fraction of nominal: 95.97% (2495.30 Mhz)

Core 5 C-state residency: 89.57% (C3: 0.09% C6: 0.00% C7: 89.47% )

CPU 10 duty cycles/s: active/idle [< 16 us: 630.67/94.41] [< 32 us: 76.89/3.89] [< 64 us: 98.30/118.74] [< 128 us: 93.43/185.89] [< 256 us: 59.37/114.84] [< 512 us: 34.06/132.36] [< 1024 us: 14.60/114.84] [< 2048 us: 1.95/103.16] [< 4096 us: 0.97/84.67] [< 8192 us: 0.00/50.61] [< 16384 us: 0.97/7.79] [< 32768 us: 0.00/0.00] 
CPU Average frequency as fraction of nominal: 91.46% (2377.93 Mhz)

CPU 11 duty cycles/s: active/idle [< 16 us: 79.81/3.89] [< 32 us: 16.55/2.92] [< 64 us: 10.71/8.76] [< 128 us: 3.89/10.71] [< 256 us: 0.00/3.89] [< 512 us: 0.97/6.81] [< 1024 us: 0.00/6.81] [< 2048 us: 0.00/3.89] [< 4096 us: 0.00/8.76] [< 8192 us: 0.00/17.52] [< 16384 us: 0.00/19.46] [< 32768 us: 0.00/11.68] 
CPU Average frequency as fraction of nominal: 84.96% (2208.94 Mhz)

**** Thermal pressure ****

Current pressure level: Nominal

**** Selective Forced Idle ****

Selective Forced Idle window:     50000us

**** GPU usage ****

GPU 0 name IntelIG
GPU 0 C-state residency: 84.26% (1.71%, 82.55%)
GPU 0 P-state residency: 1150MHz: 0.00%, 1100MHz: 0.00%, 1050MHz: 0.00%, 1000MHz: 0.00%, 950MHz: 0.00%, 900MHz: 0.00%, 850MHz: 0.00%, 800MHz: 0.00%, 750MHz: 0.00%, 700MHz: 0.00%, 650MHz: 0.00%, 600MHz: 0.00%, 550MHz: 0.00%, 500MHz: 0.00%, 450MHz: 0.00%, 400MHz: 0.00%, 350MHz: 15.74%
GPU 0 average active frequency as fraction of nominal (350.00Mhz): 100.00% (350.00Mhz)
GPU 0 HW average active frequency   : 0.00%
GPU 0 GPU Busy                      : 15.74%
GPU 0 DC6 Residency                 : 0.00%
GPU 0 [PSR] GPU + TCON are Off      : 0.00%
GPU 0 [PSR] Only GPU is On          : 100.00%
GPU 0 [PSR] Only TCON is On         : 0.00%
GPU 0 [PSR] GPU + TCON are On       : 0.00%
GPU 0 [PSR] StateMachine Bypass     : 100.00%
GPU 0 [PSR] StateMachine FIFO       : 0.00%
GPU 0 [PSR] StateMachine Others     : 0.00%
GPU 0 DPB Strong On                 : 0.00%
GPU 0 DPB Weak On                   : 0.00%
GPU 0 PPFM on                       : 0.00%
GPU 0 Throttle High Priority(%): 0
GPU 0 Throttle NormalHi Priority(%): 0
GPU 0 Throttle Normal Priority(%): 0
GPU 0 Throttle Low Priority(%): 0
GPU 0 Slice switch                  : 0 (0.00/second)
GPU 0 DC6 Exit Reason - Flip: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Register: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Gamma: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Interrupt: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Cursor: 0 (0.00/second)
GPU 0 DC6 Exit Reason - Render: 0 (0.00/second)
GPU 0 [INT] VBLANK_A           : 0 (0.00/second)
GPU 0 [INT] VBLANK_B           : 0 (0.00/second)
GPU 0 [INT] VBLANK_C           : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_A     : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_B     : 0 (0.00/second)
GPU 0 [INT] PRIMARY_FLIP_C     : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_A      : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_B      : 0 (0.00/second)
GPU 0 [INT] SPRITE_FLIP_C      : 0 (0.00/second)
GPU 0 [INT] VIDEO_USER_1       : 0 (0.00/second)
GPU 0 [INT] VIDEO_USER_2       : 0 (0.00/second)
GPU 0 [INT] VEBOX_USER         : 0 (0.00/second)
GPU 0 [INT] RENDOR_USER        : 0 (0.00/second)
GPU 0 [INT] BLITTER_USER       : 0 (0.00/second)
GPU 0 [INT] GPU_PARSER         : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_A       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_B       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_C       : 0 (0.00/second)
GPU 0 [INT] HOTPLUG_DP_D       : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_A    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_B    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_C    : 0 (0.00/second)
GPU 0 [INT] SHORTPULSE_DP_D    : 0 (0.00/second)
GPU 0 [INT] UP_THRESHOLD       : 0 (0.00/second)
GPU 0 [INT] DOWN_THRESHOLD     : 0 (0.00/second)
GPU 0 [INT] SRD_INTERRUPT      : 0 (0.00/second)
GPU 0 [INT] PSR_EXIT_TRIGGER   : 0 (0.00/second)
GPU 0 [INT] BCS_CONTEXT_SWITCH : 0 (0.00/second)
GPU 0 [INT] CS_CONTEXT_SWITCH  : 0 (0.00/second)
GPU 0 [INT] VCS1_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] VCS2_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] VECS_CONTEXT_SWITCH: 0 (0.00/second)
GPU 0 [INT] PIPEA_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] PIPEB_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] PIPEC_UNDERRUN     : 0 (0.00/second)
GPU 0 [INT] RCS_ERROR          : 0 (0.00/second)
GPU 0 [INT] BCS_ERROR          : 0 (0.00/second)
GPU 0 [INT] VCS1_ERROR         : 0 (0.00/second)
GPU 0 [INT] VCS2_ERROR         : 0 (0.00/second)
GPU 0 [INT] VECS_ERROR         : 0 (0.00/second)
GPU 0 [INT] RCS_FLUSH_NOTIFY   : 115 (112.04/second)
GPU 0 [INT] BCS_FLUSH_NOTIFY   : 74 (72.09/second)
GPU 0 [INT] VCS1_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] VCS2_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] VECS_FLUSH_NOTIFY  : 0 (0.00/second)
GPU 0 [INT] GUC_TO_HOST_INT    : 0 (0.00/second)
GPU 0 [INT] SW_INT_6           : 14 (13.64/second)
GPU 0 FB Test Case 0

**** AGPM Stats ****

Accelerator Type: IntelAccelerator
Plimit: 0.00 (GPU Frequency headroom limited on average by 0 MHz)

Accelerator Type: AMDRadeonX4000_AMDBaffinGraphicsAccelerator
Plimit: 0.00 (GPU Frequency headroom N/A)

**** SMC sensors ****

CPU Thermal level: 0
GPU Thermal level: 0
IO Thermal level: 0
Fan: 1994.39 rpm
CPU die temperature: 68.92 C
GPU die temperature: 61.00 C
CPU Plimit: 0.00
GPU Plimit (Int): 0.00 
GPU2 Plimit (Ext1): 0.00 
Number of prochots: 0

**** NVMe Power-state Residency ****
          (null):        0.000 s (  0.0%)
          (null):        0.000 s (  0.0%)
          (null):        0.000 s (  0.0%)
          (null):        1.022 s (100.0%)

**** I/O Throttling ****
Tier0 Throttle Time:        0.000 s (  0.0%)
Tier1 Throttle Time:        0.000 s (  0.0%)
Tier2 Throttle Time:        0.000 s (  0.0%)
Tier3 Throttle Time:        0.000 s (  0.0%)

adelnoureddine commented 10 months ago

Thanks. The line to read is "Intel energy model derived package power (CPUs+GT+SA)"; M1/M2 chips use a different name for it (as seen in issue #51). I don't have a Mac to test a fix, but the change belongs in the PowermetricsMacOS.java file. If you can modify it to support your output, I'll be happy to accept a PR. macOS support was contributed by @metacosm, so he might be able to provide some insights.
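
A minimal sketch of how that Intel line could be picked out of the powermetrics output. It is not the code in PowermetricsMacOS.java: the class and method names are hypothetical, and the exact line shape (label, colon, a decimal value, then "W") is an assumption that should be checked against a real dump.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JoularJX.
class IntelPackagePowerSketch {
    // Assumed line shape:
    //   Intel energy model derived package power (CPUs+GT+SA): 2.72W
    private static final Pattern INTEL_PACKAGE_POWER = Pattern.compile(
        "^Intel energy model derived package power \\(CPUs\\+GT\\+SA\\):"
            + "\\s*([0-9]+(?:\\.[0-9]+)?)\\s*W$");

    // Returns the power in watts if the line matches, or empty otherwise.
    static OptionalDouble parseWatts(String line) {
        Matcher m = INTEL_PACKAGE_POWER.matcher(line.trim());
        return m.matches()
            ? OptionalDouble.of(Double.parseDouble(m.group(1)))
            : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // The first line uses the assumed format; the second is an unrelated line from the dump.
        System.out.println(parseWatts("Intel energy model derived package power (CPUs+GT+SA): 2.72W"));
        System.out.println(parseWatts("CPU Average frequency as fraction of nominal: 95.97% (2495.30 Mhz)"));
    }
}
```

Returning an empty value for non-matching lines lets the caller skip the rest of the dump (duty cycles, GPU counters, SMC sensors) without special cases.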

alamers commented 10 months ago

I'll see if I can create a patch.

metacosm commented 10 months ago

I can take care of it, if you haven't started already, @alamers. It is weird that there is so much variability between outputs for different models, specifically for Intel CPUs.

alamers commented 10 months ago

Hi @metacosm, I just created pull request #55.

It was a bit more involved than I thought, and I had to refactor a bit. It seemed that unread lines could be left in the buffered reader, which caused it to miss some messages.

I reworked the reader to simply check whether data is available with ready(), which should be enough to get intermittent results. I also added some test cases for these scenarios. Since I don't have an M1/M2 chip, I used some example data from abhimanbhau.github.io, but it may be best to replace that with other data.
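
A rough sketch of the ready()-based approach described above, not the code from the pull request. The class and method names are hypothetical. One caveat: BufferedReader.ready() only says that some input is available, so readLine() can still wait if the last line has not been terminated yet.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the PR's implementation.
class ReadyDrainSketch {
    // Reads every line that is currently available and returns as soon as
    // the reader reports that no more input is ready.
    static List<String> drainAvailableLines(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        while (reader.ready()) {
            String line = reader.readLine();
            if (line == null) {
                break; // end of stream
            }
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the powermetrics process output.
        BufferedReader reader = new BufferedReader(
            new StringReader("CPU Thermal level: 0\nGPU Thermal level: 0\n"));
        System.out.println(drainAvailableLines(reader));
    }
}
```

Draining everything that is available on each poll means no lines stay behind in the buffer between two measurements, which is the failure mode described in the comment.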

metacosm commented 10 months ago

@alamers Great job! I do have a few comments and suggestions, see the PR review. Note that some of these improvements are available in a modified form in a project I created that simply re-publishes the power information on a REST endpoint for easier downstream consumption (https://github.com/metacosm/power-server), without any of the analysis that JoularJX is performing.

alamers commented 10 months ago

Hi, I hope to find some time next week to look at the suggestions.

alamers commented 10 months ago

Hi @metacosm and @adelnoureddine, I just had a look at the comments.

If I understand correctly, @metacosm is OK with the pull request. My only concern is the test data that I scraped from a website: although it is just output from a tool, it may be better to replace it with content we have approval to use. If so, @metacosm, would you be able to provide a dump from machines that you have access to?

The improvements as mentioned, as well as the relation with power-server, could be addressed in a separate ticket, since they will require a bit more work. Looking at the code in power-server, if I read it correctly, it also takes other running processes into account and assigns energy usage based on relative CPU/GPU usage. That seems like a more precise approach?

Also, is there a convenient way of reusing the code from power-server? I don't see a license for power-server, so I am unsure how to proceed :) It would be perfect if the internal API could be published as a jar; I don't think it is convenient for JoularJX to use the REST API. The alternative would be a copy/paste action?

metacosm commented 10 months ago

> Hi @metacosm and @adelnoureddine, I just had a look at the comments.

> If I understand correctly, @metacosm is OK with the pull request. My only concern is the test data that I scraped from a website: although it is just output from a tool, it may be better to replace it with content we have approval to use. If so, @metacosm, would you be able to provide a dump from machines that you have access to?

Yes, I can certainly provide actual data from an M1 mac. Here's one such output: https://github.com/metacosm/power-server/blob/main/server/src/test/resources/sonoma-m1max.txt

> The improvements as mentioned, as well as the relation with power-server, could be addressed in a separate ticket, since they will require a bit more work. Looking at the code in power-server, if I read it correctly, it also takes other running processes into account and assigns energy usage based on relative CPU/GPU usage. That seems like a more precise approach?

I don't think there's anything to do in relation to power-server: I simply mentioned it because I implemented there some of the changes I mentioned in the comment (e.g. parsing the header of the powermetrics output to quickly decide which version to handle in the rest of the parsing without having to support all the cases there).

But, yes, power-server can report energy of any pid, not simply the current one, which makes it a little more flexible, imo, but also has the advantage of not influencing the measure as much (since running the power measurement code in the same process as the one you're measuring would impact the measure itself).

> Also, is there a convenient way of reusing the code from power-server? I don't see a license for power-server, so I am unsure how to proceed :) It would be perfect if the internal API could be published as a jar; I don't think it is convenient for JoularJX to use the REST API. The alternative would be a copy/paste action?

That's an intriguing question, one I hadn't considered. power-server was born out of a need to get power data out of process, but also without any kind of analysis such as the one performed by JoularJX. It is still in an experimental phase and I haven't thought about licensing yet, but I'm hesitating between ASL and GPL. Not sure if @adelnoureddine would be interested in reusing that code, though.

adelnoureddine commented 10 months ago

Thanks @metacosm. I'll have to check your code in more detail, but at a quick glance, I think there might be things useful for a future version of JoularJX. Though it seems your tool is closer to PowerJoular, which monitors applications at the PID level rather than at the source code level.

adelnoureddine commented 10 months ago

Thanks @alamers. We can generate test data and store it in files ourselves. If you both agree (@metacosm too), you can generate such data on your machines with powermetrics and submit it in the PR.

@alamers, @metacosm left additional review comments on your code in the PR. Could you check them when you have some time, and reply to them, mark them resolved, or implement them if you want?

metacosm commented 10 months ago

> Thanks @metacosm. I'll have to check your code in more detail, but at a quick glance, I think there might be things useful for a future version of JoularJX. Though it seems your tool is closer to PowerJoular, which monitors applications at the PID level rather than at the source code level.

Indeed. I remember looking at it a while ago but then somewhat forgot about it. 😓 That said, I'm really interested in having a solution that would work on all major OSes (even though I haven't looked at Windows support at all, yet) and that's minimally intrusive.

alamers commented 10 months ago

> @alamers, @metacosm left additional review comments on your code in the PR. Could you check them when you have some time, and reply to them, mark them resolved, or implement them if you want?

I've resolved all of the comments; performance-wise it should be on par again.

One remark remains: the current implementation measures the whole system; it does not distribute the energy consumption over all PIDs. I am unsure if this is the expected behaviour?

Fixing that would involve a lot more, such as parsing the process list.

This is existing behaviour, so unrelated to this ticket, in any case.

metacosm commented 10 months ago

@alamers the per-process attribution is done by the code calling the power measure (using JMX to retrieve the share of CPU consumed by the current process). This is similar to what's done with the RAPL implementation, where you don't get any per-process granularity at all: you just get a µJ counter that is periodically read for each monitored system component (CPU, package, etc.).
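
A minimal sketch of this kind of attribution, assuming the process's share is simply its CPU load divided by the system CPU load. This is an illustration, not JoularJX's actual code: the class name, the attribute method, and the 12.5 W placeholder are all hypothetical. The JMX calls come from com.sun.management.OperatingSystemMXBean.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration only; not JoularJX's implementation.
class ProcessPowerShareSketch {
    // Scales a system-wide power reading by the process's share of CPU load.
    // Loads are fractions in [0, 1]; JMX reports negative values when unavailable.
    static double attribute(double systemWatts, double systemCpuLoad, double processCpuLoad) {
        if (systemCpuLoad <= 0.0 || processCpuLoad <= 0.0) {
            return 0.0;
        }
        double share = Math.min(1.0, processCpuLoad / systemCpuLoad);
        return systemWatts * share;
    }

    @SuppressWarnings("deprecation") // getSystemCpuLoad() is deprecated in newer JDKs in favour of getCpuLoad()
    public static void main(String[] args) {
        com.sun.management.OperatingSystemMXBean os =
            ManagementFactory.getPlatformMXBean(com.sun.management.OperatingSystemMXBean.class);
        double systemLoad = os.getSystemCpuLoad();
        double processLoad = os.getProcessCpuLoad();
        double systemWatts = 12.5; // placeholder for a value read from powermetrics or RAPL
        System.out.println("Attributed power (W): " + attribute(systemWatts, systemLoad, processLoad));
    }
}
```

Under this assumption, a process using a quarter of a 50% system load would be attributed half of the measured power.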

alamers commented 10 months ago

> @alamers the per-process attribution is done by the code calling the power measure

Ah, I missed that part :)

adelnoureddine commented 10 months ago

Yes, JoularJX gets the CPU energy, then calculates the application's energy (JVM), then each thread's, and finally each method's and branch's. We have an overview description in our documentation site: https://joular.github.io/joularjx/ref/how_it_works.html
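
The later steps of that cascade (JVM to threads, threads to methods) can be sketched as a repeated proportional split. This is only an illustration, not JoularJX's code: it assumes threads are weighted by CPU time and methods by sample counts, and the class name, method name, and numbers are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not JoularJX's implementation.
class EnergySplitSketch {
    // Splits `total` across the keys in proportion to their weights.
    static <K> Map<K, Double> splitByWeight(double total, Map<K, Long> weights) {
        long sum = 0;
        for (long w : weights.values()) {
            sum += w;
        }
        Map<K, Double> result = new LinkedHashMap<>();
        for (Map.Entry<K, Long> e : weights.entrySet()) {
            double part = (sum == 0) ? 0.0 : total * e.getValue() / (double) sum;
            result.put(e.getKey(), part);
        }
        return result;
    }

    public static void main(String[] args) {
        // Step 1: split the JVM's energy over threads by CPU time (ms), an assumed weight.
        Map<String, Long> threadCpuMs = new LinkedHashMap<>();
        threadCpuMs.put("main", 300L);
        threadCpuMs.put("worker", 100L);
        Map<String, Double> perThread = splitByWeight(8.0, threadCpuMs);

        // Step 2: split one thread's energy over methods by sample count, an assumed weight.
        Map<String, Long> methodSamples = new LinkedHashMap<>();
        methodSamples.put("foo", 2L);
        methodSamples.put("bar", 1L);
        Map<String, Double> perMethod = splitByWeight(perThread.get("main"), methodSamples);

        System.out.println(perThread);
        System.out.println(perMethod);
    }
}
```

Because each level only redistributes the energy of the level above, the per-method values of a thread add up to that thread's energy, and the per-thread values add up to the JVM's.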