Open qa-florian-wende opened 7 months ago
I think EKS is still on cgroup v1. In that case, can you add -enable-cgroup-id=false to the Kepler arguments? You can run kubectl edit daemonset/kepler-exporter -n kepler and change the args there.
If this works on your end, a kepler-doc PR is welcome!
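As a rough illustration of the suggested change, the sketch below appends the flag to a DaemonSet manifest that has already been loaded as a Python dict. The function name with_flag and the choice of container index 0 are hypothetical, and the sketch assumes the container's args are a list with one flag per element; if the manifest passes a single shell string instead, it does not apply as written.

```python
import copy

# Flag taken from the comment above.
FLAG = "-enable-cgroup-id=false"


def with_flag(daemonset, flag=FLAG, container_index=0):
    """Return a copy of a DaemonSet manifest with `flag` set on one container.

    Assumption: `args` is a list holding one flag per element.
    """
    patched = copy.deepcopy(daemonset)
    container = patched["spec"]["template"]["spec"]["containers"][container_index]
    args = container.setdefault("args", [])
    name = flag.split("=", 1)[0]
    # Drop any earlier setting of the same flag so it is not passed twice.
    args[:] = [a for a in args if a.split("=", 1)[0] != name]
    args.append(flag)
    return patched
```

The input dict could come from the JSON printed by kubectl get daemonset/kepler-exporter -n kepler -o json; the original manifest is left unmodified so the two can be compared before anything is applied.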
Thank you for the hint. Unfortunately, this didn't seem to solve the problem just yet:
Once again, after about 90 minutes, the exporters seemed to stop exporting metrics although the pods and containers had no restarts and there were no obvious errors in the log (I saved the logs but I haven't had time yet to look at them more closely).
The dashboard is not well tested. There is a similar case here: https://github.com/sustainable-computing-io/kepler/issues/1321
Can you check the metrics, e.g. sum(rate(kepler_container_joules_total[1m])), in the Prometheus UI?
The problem could be somewhere in the connection between Grafana and Prometheus (e.g. token expiration).
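The same check can also be run outside the UI. The sketch below sends the query from the comment above to the Prometheus HTTP API (/api/v1/query) and parses the instant-vector response. The helper names and the base URL in the usage note are assumptions, not details from this issue.

```python
import json
import urllib.parse
import urllib.request

# Query taken from the comment above.
QUERY = "sum(rate(kepler_container_joules_total[1m]))"


def query_url(base_url, expr=QUERY):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def parse_vector(body):
    """Extract (labels, value) pairs from an instant-query response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError("query failed: " + str(payload.get("error")))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got " + data["resultType"])
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def run_query(base_url, expr=QUERY):
    """Run the query against a reachable Prometheus server."""
    with urllib.request.urlopen(query_url(base_url, expr), timeout=10) as resp:
        return parse_vector(resp.read().decode("utf-8"))
```

With Prometheus reachable at, say, http://localhost:9090 (an assumed address, for example behind a port-forward), run_query("http://localhost:9090") returns the current samples. If the result here matches what Grafana shows, the Grafana connection is less likely to be the cause.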
It's definitely not just the dashboard; the graph in Prometheus shows exactly the same. What's interesting is that the metric seems to fail one node after the other:
So it appears unlikely that it's just a problem with Prometheus either, because there is only one Prometheus instance in my setup. The Kepler exporters never crashed, though: when I look at the /metrics endpoint on any of the Kepler exporter pods, I still see the value at which the metrics froze, e.g.
# HELP kepler_container_package_joules_total Aggregated value in package value from trained_power_model
# TYPE kepler_container_package_joules_total counter
kepler_container_package_joules_total{container_id="system_processes",container_name="kernel_processes",container_namespace="kernel",mode="dynamic",pod_name="kernel_processes",source="trained_power_model"} 19443.159
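One way to confirm a freeze like this without going through Prometheus is to scrape the same /metrics endpoint twice and compare the samples. The sketch below is a minimal version of that idea; the function names are hypothetical, and it assumes sample lines carry no trailing timestamp, as in the excerpt above.

```python
def parse_exposition(text):
    """Map each sample (metric name plus labels) to its float value.

    Assumption: lines look like `name{labels} value` with no trailing
    timestamp, as in the excerpt above.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Label values may contain spaces, so split on the last space only.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def frozen_series(earlier, later):
    """Return the series whose value is identical in both scrapes."""
    before = parse_exposition(earlier)
    after = parse_exposition(later)
    return sorted(s for s in before if s in after and after[s] == before[s])
```

Feeding it two scrapes taken a minute or so apart would list the series that did not move. An unchanged counter is only a hint, since an idle container can legitimately report the same total twice, but a node-level or kernel_processes series that stays constant would match the symptom described here.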
When I look at the log of the exporters, I can watch the totals increase every few seconds:
I0415 15:16:09.688498 1 metric_collector.go:116] Collector Update elapsed time: 16.01673ms I0415 15:16:12.674285 1 libbpf_attacher.go:384] successfully get data with batch get and delete with 106 pids in 353.221µs I0415 15:16:12.687665 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688033 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688142 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688244 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688327 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688433 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688513 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.688648 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:12.689356 1 metric_collector.go:337] energy from pod/container: name: karpenter-866b96f7c4-8p8p4/controller namespace: karpenter containerid:202ac0e53a5941a875d812e138ff1cb01bee8d9ed4490359ff99a49a74cf3cb9 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) 
cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.689628 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-operator-85f86ff6f6-lddhx/kube-prometheus-stack namespace: monitoring containerid:82102415adffb1fa15f1971f09e3dac0abda82ef84dc34dc78c10176ce350079 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:8874 (87181948) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (24403968) cgroupfs_system_cpu_usage_us:6944 (35541810) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:1929 (51640137)] Dyn ePkg (mJ): 0 (702) (eCore: 0 (702) eDram: 0 (198) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (150) Idle ePkg (mJ): 21281310 (21281310) (eCore: 21281310 (21281310) eDram: 14349432 (14349432) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 65031 (65031) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (165) bpf_net_rx_irq:0 (179) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (10214)] I0415 15:16:12.689861 1 metric_collector.go:337] energy from pod/container: name: kepler-exporter-v6wk2/kepler-exporter namespace: kepler containerid:e9002a5a368e50995dc05ef0bcfdf02539532861e745a1092fda41686c4fc6a9 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:18761 (166135607) 
cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:12288 (26558464) cgroupfs_system_cpu_usage_us:4041 (33699465) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:14719 (132436141)] Dyn ePkg (mJ): 0 (2430) (eCore: 0 (2430) eDram: 0 (699) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (558) Idle ePkg (mJ): 84753783 (84753783) (eCore: 84753783 (84753783) eDram: 57146124 (57146124) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 261915 (261915) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (571) bpf_net_rx_irq:0 (1052) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:16 (89998)] I0415 15:16:12.690070 1 metric_collector.go:337] energy from pod/container: name: kube-proxy-4nwp4/kube-proxy namespace: kube-system containerid:c18fe46f676f395271ba5774b282ae48747e24488fef3d2c50ae9d981460eb76 cgroupMetrics: map[block_devices_used:9295 (9295) cgroupfs_cpu_usage_us:335 (24394097) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:38072320 (38072320) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (17412096) cgroupfs_system_cpu_usage_us:74 (5331825) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:262 (19062272)] Dyn ePkg (mJ): 0 (123) (eCore: 0 (123) eDram: 0 (45) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (36) Idle ePkg (mJ): 21828759 (21828759) (eCore: 21828759 (21828759) eDram: 14718204 (14718204) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 67260 (67260) ResUsage: 
map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (26) bpf_net_rx_irq:0 (129) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (11364)] I0415 15:16:12.690682 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-kube-state-metrics-776c898f6-h6lhd/kube-state-metrics namespace: monitoring containerid:9ace1c9bee44202ddf3d5f034edc27b30fdcae9264cfb3daf79fa4e3daf2bb1c cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.690900 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/init-config-reloader namespace: 
monitoring containerid:ff45721cb08a47bf888d13d817880642bf634223d59149bf976413acbd869b0e cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.691631 1 metric_collector.go:337] energy from pod/container: name: kernel_processes/kernel_processes namespace: kernel containerid:system_processes cgroupMetrics: map[block_devices_used:46555 (46555) cgroupfs_cpu_usage_us:736348 (6454240879) cgroupfs_ioread_bytes:7434405647872 (7434405647872) cgroupfs_iowrite_bytes:44259409312256 (44259409312256) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:1675264 (4990038016) cgroupfs_system_cpu_usage_us:126091 (1052383452) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:389111 (3240653554)] Dyn ePkg (mJ): 0 (18733347) (eCore: 0 (18733347) eDram: 0 (3033054) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (46827) Idle ePkg (mJ): 1698068400 (1698068400) (eCore: 1698068400 (1698068400) eDram: 1144955373 (1144955373) 
eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 5246634 (5246634) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (4993158) bpf_net_rx_irq:152 (1406184) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:7002 (43637854)] I0415 15:16:12.691867 1 metric_collector.go:337] energy from pod/container: name: aws-node-7l5wr/aws-eks-nodeagent namespace: kube-system containerid:02af5de4b46289519bc03e37c3b7f3539429abc0bc922fb13487739646b5b01a cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.692025 1 metric_collector.go:337] energy from pod/container: name: 
aws-node-7l5wr/aws-vpc-cni-init namespace: kube-system containerid:3ccd8483bbe890b912b99591d0fcae3cacbf8ed6b25e1e89fad0ef6245d41eea cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.692218 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-prometheus-node-exporter-nxdfd/node-exporter namespace: monitoring containerid:cbc2c3372e4a251cb177ef2a962d6ff700d8639f65d5435ed49f2657c20a8d2a cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:549 (74927878) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:73728 (9990144) cgroupfs_system_cpu_usage_us:187 (25509996) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:362 (49417882)] Dyn ePkg (mJ): 0 (870) (eCore: 0 (870) eDram: 0 (198) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (102) Idle ePkg (mJ): 11789583 (11789583) (eCore: 11789583 (11789583) eDram: 
7949535 (7949535) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 35727 (35727) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (221) bpf_net_rx_irq:0 (639) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (56154)] I0415 15:16:12.692721 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/prometheus namespace: monitoring containerid:0047ecf760f8f4059df6a641d25ce75884583c7ab8a6822e1fcb75cd1bf6ef59 cgroupMetrics: map[block_devices_used:9311 (9311) cgroupfs_cpu_usage_us:204459 (1676929670) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:5817731665920 (5817731665920) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:270336 (884203520) cgroupfs_system_cpu_usage_us:250 (87082106) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:204459 (1589847564)] Dyn ePkg (mJ): 0 (64317) (eCore: 0 (64317) eDram: 0 (11634) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (2439) Idle ePkg (mJ): 108110547 (108110547) (eCore: 108110547 (108110547) eDram: 72894954 (72894954) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 334254 (334254) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (16859) bpf_net_rx_irq:7 (13128) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 
(0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:398 (737160)] I0415 15:16:12.693045 1 metric_collector.go:337] energy from pod/container: name: aws-node-7l5wr/aws-node namespace: kube-system containerid:93fd849565fe577d7121dcc10b8c35190282d836e5a0ae17243ed739fb02d91b cgroupMetrics: map[block_devices_used:12116 (12116) cgroupfs_cpu_usage_us:141 (88045539) cgroupfs_ioread_bytes:49627136 (49627136) cgroupfs_iowrite_bytes:34051883008 (34051883008) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:40960 (32296960) cgroupfs_system_cpu_usage_us:76 (46999925) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:65 (41045613)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 4113 (4113) (eCore: 4113 (4113) eDram: 2775 (2775) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 12 (12) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.693231 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/config-reloader namespace: monitoring containerid:deaae1b28894565849f999b96fc58c8e744a58f7b51af38d6e0ca353251fd4f8 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:1923 (7392973) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:36864 (15806464) cgroupfs_system_cpu_usage_us:336 (1509533) 
cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:1923 (5883440)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 2487723 (2487723) (eCore: 2487723 (2487723) eDram: 1677396 (1677396) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 7512 (7512) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (18) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:12.693396 1 metric_collector.go:337] energy from pod/container: name: rust-hoverfly-d988dcfc6-wm888/rust-hoverfly namespace: rust containerid:5a25115b0739fd175a6feb3200c759e8af6d7eac9234a3770ccf0519f5c1a8b2 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:847 (74616376) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (30597120) cgroupfs_system_cpu_usage_us:188 (16545484) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:660 (58070892)] Dyn ePkg (mJ): 0 (6132) (eCore: 0 (6132) eDram: 0 (1137) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (306) Idle ePkg (mJ): 4820538 (4820538) (eCore: 4820538 (4820538) eDram: 3250314 (3250314) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 14826 (14826) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (1605) bpf_net_rx_irq:0 (793) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) 
cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (30600)] I0415 15:16:12.693639 1 metric_collector.go:340] node energy (mJ): Dyn ePkg (mJ): 0 (18794913) (eCore: 0 (18794913) eDram: 0 (3030690) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (26274) Idle ePkg (mJ): 209751 (1952991561) (eCore: 209751 (1952991561) eDram: 141426 (1316817486) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 660 (6145260) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (5012624) bpf_net_rx_irq:159 (1422126) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:7416 (44573398)] I0415 15:16:12.694072 1 metric_collector.go:116] Collector Update elapsed time: 20.702195ms I0415 15:16:15.696221 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696270 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696318 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696377 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696420 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696468 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696509 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696586 1 stats.go:251] Unknown node feature: 
gpu_compute_util, adding 0 value I0415 15:16:15.696661 1 stats.go:251] Unknown node feature: gpu_compute_util, adding 0 value I0415 15:16:15.696975 1 metric_collector.go:337] energy from pod/container: name: aws-node-7l5wr/aws-eks-nodeagent namespace: kube-system containerid:02af5de4b46289519bc03e37c3b7f3539429abc0bc922fb13487739646b5b01a cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.697498 1 metric_collector.go:337] energy from pod/container: name: aws-node-7l5wr/aws-vpc-cni-init namespace: kube-system containerid:3ccd8483bbe890b912b99591d0fcae3cacbf8ed6b25e1e89fad0ef6245d41eea cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) 
eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.698249 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-prometheus-node-exporter-nxdfd/node-exporter namespace: monitoring containerid:cbc2c3372e4a251cb177ef2a962d6ff700d8639f65d5435ed49f2657c20a8d2a cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:549 (74927878) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:73728 (9990144) cgroupfs_system_cpu_usage_us:187 (25509996) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:362 (49417882)] Dyn ePkg (mJ): 0 (870) (eCore: 0 (870) eDram: 0 (198) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (102) Idle ePkg (mJ): 11789583 (11789583) (eCore: 11789583 (11789583) eDram: 7949535 (7949535) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 35727 (35727) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (221) bpf_net_rx_irq:0 (639) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) 
cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (56154)] I0415 15:16:15.699552 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/prometheus namespace: monitoring containerid:0047ecf760f8f4059df6a641d25ce75884583c7ab8a6822e1fcb75cd1bf6ef59 cgroupMetrics: map[block_devices_used:9312 (9312) cgroupfs_cpu_usage_us:32658 (1676962328) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:5819054809088 (5819054809088) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:20480 (884224000) cgroupfs_system_cpu_usage_us:250 (87082106) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:32658 (1589880222)] Dyn ePkg (mJ): 0 (64317) (eCore: 0 (64317) eDram: 0 (11634) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (2439) Idle ePkg (mJ): 108124533 (108124533) (eCore: 108124533 (108124533) eDram: 72904383 (72904383) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 334296 (334296) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (16859) bpf_net_rx_irq:0 (13128) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:46 (737206)] I0415 15:16:15.699742 1 metric_collector.go:337] energy from pod/container: name: aws-node-7l5wr/aws-node namespace: kube-system containerid:93fd849565fe577d7121dcc10b8c35190282d836e5a0ae17243ed739fb02d91b cgroupMetrics: map[block_devices_used:12119 (12119) cgroupfs_cpu_usage_us:13882 (88059421) cgroupfs_ioread_bytes:49639424 (49639424) cgroupfs_iowrite_bytes:34058948608 (34058948608) 
cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:40960 (32296960) cgroupfs_system_cpu_usage_us:2922 (47002847) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:10960 (41056573)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 4113 (4113) (eCore: 4113 (4113) eDram: 2775 (2775) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 12 (12) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.700031 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/config-reloader namespace: monitoring containerid:deaae1b28894565849f999b96fc58c8e744a58f7b51af38d6e0ca353251fd4f8 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:1923 (7392973) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:36864 (15806464) cgroupfs_system_cpu_usage_us:336 (1509533) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:1923 (5883440)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 2487723 (2487723) (eCore: 2487723 (2487723) eDram: 1677396 (1677396) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 7512 (7512) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (18) bpf_net_tx_irq:0 (0) 
bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.700253 1 metric_collector.go:337] energy from pod/container: name: rust-hoverfly-d988dcfc6-wm888/rust-hoverfly namespace: rust containerid:5a25115b0739fd175a6feb3200c759e8af6d7eac9234a3770ccf0519f5c1a8b2 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:784 (74617160) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (30597120) cgroupfs_system_cpu_usage_us:174 (16545658) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:610 (58071502)] Dyn ePkg (mJ): 0 (6132) (eCore: 0 (6132) eDram: 0 (1137) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (306) Idle ePkg (mJ): 4820538 (4820538) (eCore: 4820538 (4820538) eDram: 3250314 (3250314) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 14826 (14826) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (1605) bpf_net_rx_irq:0 (793) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (30600)] I0415 15:16:15.701063 1 metric_collector.go:337] energy from pod/container: name: karpenter-866b96f7c4-8p8p4/controller namespace: karpenter containerid:202ac0e53a5941a875d812e138ff1cb01bee8d9ed4490359ff99a49a74cf3cb9 
cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.701774 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-operator-85f86ff6f6-lddhx/kube-prometheus-stack namespace: monitoring containerid:82102415adffb1fa15f1971f09e3dac0abda82ef84dc34dc78c10176ce350079 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:7635 (87189583) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (24408064) cgroupfs_system_cpu_usage_us:7635 (35549445) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:1929 (51640137)] Dyn ePkg (mJ): 0 (702) (eCore: 0 (702) eDram: 0 (198) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (150) Idle ePkg (mJ): 21281310 (21281310) (eCore: 21281310 (21281310) eDram: 14349432 (14349432) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 65031 (65031) ResUsage: 
map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (165) bpf_net_rx_irq:0 (179) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (10214)] I0415 15:16:15.702051 1 metric_collector.go:337] energy from pod/container: name: kepler-exporter-v6wk2/kepler-exporter namespace: kepler containerid:e9002a5a368e50995dc05ef0bcfdf02539532861e745a1092fda41686c4fc6a9 cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:19182 (166154789) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:622592 (27181056) cgroupfs_system_cpu_usage_us:8218 (33707683) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:10964 (132447105)] Dyn ePkg (mJ): 0 (2430) (eCore: 0 (2430) eDram: 0 (699) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (558) Idle ePkg (mJ): 84753783 (84753783) (eCore: 84753783 (84753783) eDram: 57146124 (57146124) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 261915 (261915) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (571) bpf_net_rx_irq:0 (1052) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (89998)] I0415 15:16:15.703493 1 metric_collector.go:337] energy from pod/container: name: 
kube-proxy-4nwp4/kube-proxy namespace: kube-system containerid:c18fe46f676f395271ba5774b282ae48747e24488fef3d2c50ae9d981460eb76 cgroupMetrics: map[block_devices_used:9296 (9296) cgroupfs_cpu_usage_us:146 (24394243) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:38076416 (38076416) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:4096 (17412096) cgroupfs_system_cpu_usage_us:32 (5331857) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:114 (19062386)] Dyn ePkg (mJ): 0 (123) (eCore: 0 (123) eDram: 0 (45) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (36) Idle ePkg (mJ): 21828759 (21828759) (eCore: 21828759 (21828759) eDram: 14718204 (14718204) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 67260 (67260) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (26) bpf_net_rx_irq:0 (129) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (11364)] I0415 15:16:15.704722 1 metric_collector.go:337] energy from pod/container: name: kube-prometheus-stack-kube-state-metrics-776c898f6-h6lhd/kube-state-metrics namespace: monitoring containerid:9ace1c9bee44202ddf3d5f034edc27b30fdcae9264cfb3daf79fa4e3daf2bb1c cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 
(0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.704959 1 metric_collector.go:337] energy from pod/container: name: prometheus-kube-prometheus-stack-prometheus-0/init-config-reloader namespace: monitoring containerid:ff45721cb08a47bf888d13d817880642bf634223d59149bf976413acbd869b0e cgroupMetrics: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)] Dyn ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (0) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (0) bpf_net_rx_irq:0 (0) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:0 (0)] I0415 15:16:15.705932 1 
metric_collector.go:337] energy from pod/container: name: kernel_processes/kernel_processes namespace: kernel containerid:system_processes cgroupMetrics: map[block_devices_used:46560 (46560) cgroupfs_cpu_usage_us:407047 (6454647926) cgroupfs_ioread_bytes:7435207573504 (7435207573504) cgroupfs_iowrite_bytes:44265488673280 (44265488673280) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:344064 (4990382080) cgroupfs_system_cpu_usage_us:50884 (1052434336) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:185785 (3240839339)] Dyn ePkg (mJ): 0 (18733347) (eCore: 0 (18733347) eDram: 0 (3033054) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (46827) Idle ePkg (mJ): 1698264204 (1698264204) (eCore: 1698264204 (1698264204) eDram: 1145087379 (1145087379) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 5247222 (5247222) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (4993158) bpf_net_rx_irq:157 (1406341) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:5710 (43643564)] I0415 15:16:15.706336 1 metric_collector.go:340] node energy (mJ): Dyn ePkg (mJ): 0 (18794913) (eCore: 0 (18794913) eDram: 0 (3030690) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 0 (26274) Idle ePkg (mJ): 209751 (1953201312) (eCore: 209751 (1953201312) eDram: 141426 (1316958912) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0) platform (mJ): 660 (6145920) ResUsage: map[block_devices_used:0 (0) bpf_block_irq:0 (0) bpf_cpu_time_ms:0 (5012624) bpf_net_rx_irq:157 (1422283) bpf_net_tx_irq:0 (0) bpf_page_cache_hit:0 (0) cache_miss:0 (0) 
cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0) cpu_cycles:0 (0) cpu_instructions:0 (0) cpu_ref_cycles:0 (0) task_clock_ms:5756 (44579154)] I0415 15:16:15.707477 1 metric_collector.go:116] Collector Update elapsed time: 34.273537ms
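The log above is hard to scan by eye, so here is a rough sketch of how the package energy pairs could be pulled out of each entry. It assumes that in a pair such as `Dyn ePkg (mJ): 0 (18733347)` the first number is the change in the last interval and the number in parentheses is the running total; that reading is my interpretation of the output, not something confirmed by the Kepler docs. The function names are made up for illustration.

```python
import re

# Matches pairs such as "Dyn ePkg (mJ): 0 (18733347)".
# Assumption: first number = per-interval delta, second = running total.
PAIR = re.compile(r"(Dyn|Idle) ePkg \(mJ\): (\d+) \((\d+)\)")
NAME = re.compile(r"energy from pod/container: name: (\S+)")


def package_energy(entry: str) -> dict:
    """Return {"Dyn": (delta, total), "Idle": (delta, total)} for one log entry."""
    return {kind: (int(delta), int(total))
            for kind, delta, total in PAIR.findall(entry)}


def stalled_dynamic(entries) -> list:
    """Names of containers whose dynamic delta is 0 although the total is non-zero."""
    names = []
    for entry in entries:
        name = NAME.search(entry)
        dyn = package_energy(entry).get("Dyn")
        if name and dyn and dyn[0] == 0 and dyn[1] > 0:
            names.append(name.group(1))
    return names
```

Feeding it the entries from the log, split at each `energy from pod/container:` marker, would list the containers that had dynamic energy at some point but report none in the current interval.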
What happened?
We were testing different deployments in an AWS EKS cluster to compare how much energy each of them uses. Although we could see individual pods and containers in the Kepler Dashboard, all values were zero except for those of the system processes.
We also checked the metrics endpoint of the Kepler exporters directly, so it isn't just an issue with the Kepler Dashboard. We ran some load tests on our deployments, so it isn't just a matter of values being rounded down to zero. We repeated our tests with various AWS EC2 instance families, instance sizes and Kubernetes versions (using eksctl to deploy the cluster). Whenever one combination of parameters worked, it no longer worked after we re-deployed the cluster with the same parameters. Once it even stopped working while the cluster continued to run.
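For anyone who wants to repeat the endpoint check, this is a minimal sketch of comparing two scrapes of an exporter's /metrics output to see which energy counters did not move. It only handles plain `name{labels} value` lines without a trailing timestamp, which is an assumption about what the exporter emits, and the helper names are hypothetical.

```python
def parse_samples(text: str) -> dict:
    """Map each 'name{labels}' series string to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        # Label values may contain spaces, so split on the last space only.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # not a plain sample line; ignore it
    return samples


def frozen_counters(before: str, after: str, suffix: str = "_joules_total") -> list:
    """Series ending in `suffix` whose value did not increase between scrapes."""
    old, new = parse_samples(before), parse_samples(after)
    return sorted(
        series for series, value in new.items()
        if series.split("{", 1)[0].endswith(suffix)
        and series in old
        and value <= old[series]
    )
```

The two text snapshots could be saved a minute or so apart, for example with `curl` against the pod, and passed in as strings. An idle container will legitimately show up as unchanged, so the result is only meaningful while a load test is running.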
What did you expect to happen?
I expected to see measurements all the time.
How can we reproduce it (as minimally and precisely as possible)?
Unfortunately, we are not able to reproduce the bug consistently at the moment. The setup that stopped working while the cluster was running used the Bottlerocket AMI family (which seems to be the only Linux distribution that is both supported by eksctl and supports cgroup v2) on t3.large instances in eu-north-1 with Kubernetes version 1.28.
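Since the cgroup version came up as a possible factor, here is a small sketch of how it could be checked from anywhere the host's cgroup filesystem is visible. It relies on the fact that a cgroup v2 unified hierarchy exposes a `cgroup.controllers` file at its root; the function name and the "v1 or hybrid" label are just illustrative.

```python
from pathlib import Path


def cgroup_version(root: str = "/sys/fs/cgroup") -> str:
    """Guess the cgroup hierarchy mounted at `root`."""
    base = Path(root)
    # The unified (v2) hierarchy has cgroup.controllers at its root.
    if (base / "cgroup.controllers").is_file():
        return "v2"
    # A v1 or hybrid setup mounts per-controller directories here instead.
    if base.is_dir():
        return "v1 or hybrid"
    return "unknown"


if __name__ == "__main__":
    print(cgroup_version())
```

Inside a container the result reflects whatever is mounted at that path, so it only describes the node if the host's `/sys/fs/cgroup` is mounted into the pod.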
Anything else we need to know?
No response
Kepler image tag
Kubernetes version
We witnessed the behaviour on 1.24, 1.28 and 1.29.
Cloud provider or bare metal
AWS
OS version
Install tools
eksctl and helm
Kepler deployment config
Container runtime (CRI) and version (if applicable)
No response
Related plugins (CNI, CSI, ...) and versions (if applicable)
No response