Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the estimates as Prometheus metrics.
What happened?
When the latest Kepler is deployed on a VM together with the estimator and model-server (release-0.7.11), Kepler fails to unmarshal the estimator's response (a JSON array where it expects a map) while creating the power models for Platform and Component power.
Below are the logs from Kepler for reference:
kepler-1 | I0802 07:49:04.992245 36250 model.go:95] Using Power Model Ratio
kepler-1 | I0802 07:49:04.992250 36250 process_energy.go:124] Using the Ratio/DynPower Power Model to estimate Process Component Power
kepler-1 | I0802 07:49:04.992256 36250 process_energy.go:125] Process feature names: [bpf_cpu_time_ms bpf_cpu_time_ms bpf_cpu_time_ms gpu_compute_util]
kepler-1 | I0802 07:49:04.992264 36250 model.go:178] Model Config NODE_TOTAL: {ModelType:EstimatorSidecar ModelOutputType:AbsPower TrainerName:SGDRegressorTrainer EnergySource:acpi SelectFilter: InitModelURL:https://raw.githubusercontent.com/sustainable-computing-io/kepler-model-db/main/models/v0.7/specpower/acpi/AbsPower/BPFOnly/GradientBoostingRegressorTrainer_0.zip InitModelFilepath: IsNodePowerModel:true ProcessFeatureNames:[] NodeFeatureNames:[] SystemMetaDataFeatureNames:[] SystemMetaDataFeatureValues:[]}
kepler-1 | I0802 07:49:05.993525 36250 estimate.go:139] estimator unmarshal error: json: cannot unmarshal array into Go struct field ComponentPowerResponse.powers of type map[string][]float64 ({"powers": [], "msg": "'NoneType' object has no attribute 'predict'\n"})
kepler-1 | I0802 07:49:05.993905 36250 node_platform_energy.go:54] Failed to create EstimatorSidecar/AbsPower Power Model to estimate Node Platform Power: json: cannot unmarshal array into Go struct field ComponentPowerResponse.powers of type map[string][]float64
kepler-1 | I0802 07:49:05.993944 36250 model.go:178] Model Config NODE_COMPONENTS: {ModelType:EstimatorSidecar ModelOutputType:AbsPower TrainerName:SGDRegressorTrainer EnergySource:intel_rapl SelectFilter: InitModelURL:https://raw.githubusercontent.com/sustainable-computing-io/kepler-model-db/main/models/v0.7/ec2-0.7.11/rapl-sysfs/AbsPower/BPFOnly/GradientBoostingRegressorTrainer_0.zip InitModelFilepath: IsNodePowerModel:true ProcessFeatureNames:[] NodeFeatureNames:[] SystemMetaDataFeatureNames:[] SystemMetaDataFeatureValues:[]}
kepler-1 | I0802 07:49:06.384985 36250 estimate.go:139] estimator unmarshal error: json: cannot unmarshal array into Go struct field ComponentPowerResponse.powers of type map[string][]float64 ({"powers": [], "msg": "'NoneType' object has no attribute 'predict'\n"})
kepler-1 | I0802 07:49:06.385064 36250 node_component_energy.go:58] Failed to create EstimatorSidecar/AbsPower Power Model to estimate Node Component Power: json: cannot unmarshal array into Go struct field ComponentPowerResponse.powers of type map[string][]float64
What did you expect to happen?
Kepler should be able to use the latest models to estimate Platform and Component power on the VM.
How can we reproduce it (as minimally and precisely as possible)?
Deploy Kepler on a VM using the vm compose manifests with the following changes:
Changes to compose.yaml:
Change the Python version from python3.8 to python3.10 for model-server and estimator.
Bump the image to quay.io/sustainable_computing_io/kepler_model_server:v0.7.11 for model-server and estimator.
Update NODE_COMPONENTS_INIT_URL in vm/kepler/etc/kepler/kepler.config/MODEL_CONFIG to point to the latest ec2-0.7.11 URL: https://raw.githubusercontent.com/sustainable-computing-io/kepler-model-db/main/models/v0.7/ec2-0.7.11/rapl-sysfs/AbsPower/BPFOnly/GradientBoostingRegressorTrainer_0.zip
Anything else we need to know?
No response
Kepler image tag
latest
Kubernetes version
```console
$ kubectl version
# paste output here
```
Cloud provider or bare metal
VM
OS version
```console
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here
# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
```
Install tools
Kepler deployment config
For Kubernetes:
```console
$ KEPLER_NAMESPACE=kepler
# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here
# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}
```
For standalone:
# put your Kepler command argument here
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)