Power per container - Githubissues

green-coding-solutions / green-metrics-tool

Measure energy and carbon consumption of software

https://metrics.green-coding.berlin

GNU Affero General Public License v3.0


Power per container #795

Closed ArneTR closed 1 month ago

ArneTR commented 1 month ago

This PR brings a power estimation feature per container to the GMT.

The approach is similar in idea to how Scaphandre does it:

- We use the CPU utilization to derive each container's share of the total CPU utilization of the system.
- Unlike Scaphandre, we subtract the idle power and thus only show the "overhead" that the containers add to the host system.
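
The two points above can be sketched as follows. This is a minimal illustration of the idea, not GMT's actual code: the function name, its parameters and the example numbers are all assumptions made for this sketch.

```python
def estimate_container_power(machine_power_w, idle_power_w,
                             system_cpu_util, container_cpu_util):
    """Attribute the above-idle machine power to containers by CPU share.

    Hypothetical helper for illustration only (not a GMT function).
    system_cpu_util: total CPU utilization of the system, in percent.
    container_cpu_util: dict of container name -> CPU utilization in percent.
    """
    # Only the power above idle (the "overhead") is distributed.
    overhead_w = max(machine_power_w - idle_power_w, 0.0)
    if system_cpu_util <= 0:
        # No CPU activity at all: nothing to attribute.
        return {name: 0.0 for name in container_cpu_util}
    return {
        name: overhead_w * util / system_cpu_util
        for name, util in container_cpu_util.items()
    }


# Illustrative numbers only, not measured values.
estimates = estimate_container_power(
    machine_power_w=12.0,
    idle_power_w=4.0,
    system_cpu_util=40.0,
    container_cpu_util={"web": 25.0, "db": 10.0},
)
```

In this sketch, the part of the overhead that belongs to CPU time outside the listed containers simply stays unattributed.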

@mrchrisadams - Is that what you were looking for?

@davidkopp - Also would love your feedback on this as we have talked about this before.

A bit more context: Initially, the philosophy of the GMT was not to include these values, as they provide limited insights. We have written a more detailed piece on this here:

As some of you know, we have been funded by Mozilla to build a kernel Energy Plugin - https://github.com/orgs/green-kernel/discussions/1 - Say hi in the thread if you like!

Since the plugin is coming soon, we wanted to have some comparable functionality in GMT already, to see how heavily the values will deviate. The given caveats still apply. We will also add some documentation on this soon.

Love your feedback on this!

Demo measurement

https://metrics.green-coding.io/stats.html?id=f3eaba7c-0338-4c61-932d-fd48475b641c

mrchrisadams commented 1 month ago

Wow @ArneTR that was fast - we only discussed it a few days ago! I'll look into this in detail tomorrow

ArneTR commented 1 month ago

Just to give a peek at how jittery these values are:

Here are 7 repeated runs on the same code: https://metrics.green-coding.io/compare.html?ids=e1b2173d-54d8-4bb5-9a40-220275ba2e83,165ad69c-cd36-4f33-a1ab-c362612d1571,7315a25d-7b54-4dcd-a1ff-87eeb081df64,b14c7be3-904d-4d7b-8612-02a1d24e7e48,504e126b-5132-40ba-a887-a1ba2420af6c,580e1fe9-a25f-41c5-8ad7-f49209a23b44,a0a0d6c1-9deb-4b7a-a374-ae95189cd4eb,7286b814-e4c6-4399-8f7d-e481f98e9832

Although there is some network I/O making everything a bit unpredictable in the first place, the total machine energy deviates by less than 1%.

The per-container values deviate by up to 30% though!

The reason is that the machine used is representative of a user system and thus has DVFS, TurboBoost and HyperThreading turned on. This makes CPU utilization quite flaky.

See also our case study here: https://www.green-coding.io/case-studies/cpu-utilization-usefulness/
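
One way to put a number on this kind of run-to-run jitter is the relative standard deviation over repeated runs. The thread does not say which spread metric was used, so the sample standard deviation below is an assumption, and the input values are made up for illustration (they are not taken from the linked runs).

```python
import statistics


def relative_stddev_percent(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Made-up illustrative numbers, NOT the values from the linked comparison.
machine_energy_j = [1500.0, 1504.0, 1498.0, 1502.0, 1496.0]
container_energy_j = [40.0, 52.0, 35.0, 47.0, 58.0]

machine_spread = relative_stddev_percent(machine_energy_j)
container_spread = relative_stddev_percent(container_energy_j)
print(f"machine: {machine_spread:.2f}%  container: {container_spread:.2f}%")
```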

github-actions[bot] commented 1 month ago

Eco-CI Output:

Label	🖥 avg. CPU utilization [%]	🔋 Total Energy [Joules]	🔌 avg. Power [Watts]	Duration [Seconds]
Total Run	22.4451	1515.15	3.45924	446
Measurement #1	22.6363	1515.15	3.45924	438

📈 Energy graph: [ASCII line chart omitted: Watts over time, y-axis ticks from 1.77 to 8.18]

🌳 CO2 Data:
City: Boydton, Lat: 36.677696, Lon: -78.37471
Carbon Intensity for this location: 362 gCO₂eq/kWh
SCI: 0.548484 gCO₂eq / pipeline run emitted
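
As a quick sanity check of the figures in the bot output, the average power and the operational emissions can be recomputed from the reported energy, duration and carbon intensity. This is only arithmetic on the numbers shown above; the variable names are chosen for this sketch, and which further inputs Eco-CI used for the SCI value is not shown in the output.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6e6 J

# Values copied from the Eco-CI output above.
energy_j = 1515.15
duration_s = 438                      # duration of "Measurement #1"
carbon_intensity_g_per_kwh = 362
reported_avg_power_w = 3.45924
reported_sci_g = 0.548484

# Average power is total energy divided by duration.
avg_power_w = energy_j / duration_s

# Operational emissions: energy in kWh times grid carbon intensity.
operational_g = energy_j / JOULES_PER_KWH * carbon_intensity_g_per_kwh

print(f"avg power: {avg_power_w:.5f} W")
print(f"operational emissions: {operational_g:.4f} gCO2eq")
```

The recomputed average power matches the reported one when the 438 s measurement duration is used. The operational emissions come out lower than the reported SCI; the SCI definition also contains an embodied-emissions term, which would be consistent with the larger reported figure.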

ArneTR commented 1 month ago

Another update on this: I tuned our servers and installed a new machine that is more suitable for this type of benchmarking, and I am now seeing values which I deem "acceptable".

I got the StdDev for repeated measurements down to ~5%

https://metrics.green-coding.io/compare.html?ids=330dd6da-08f1-4b35-85cd-c5de09137d06,174dbca2-747a-4055-a59a-c173e4957dc2,7e93e046-3c60-4db2-8931-d516f1d09db7,cc963667-46f9-41ed-9cea-82edb82e9fea,6ffeb80a-5691-48f6-b84a-27de80f49c09,d52223fa-c092-4761-8183-95b90a6f98c0,cbe7bef6-eb58-44b7-bde6-8f419f81136b,a3805d33-5e39-4a47-9a23-56c008fcdf17,4d5f1d27-d861-414d-b8d8-b849eaed0862

This was achieved by:

- Making the idle phase longer, since the offset is calculated on this phase. The longer the phase, the more stable the offset.
- Booting the kernel with a different frequency driver that allows for fixing the frequency
  - Documented machine configurations here: https://docs.green-coding.io/docs/installation/installation-cluster/
  - And here: https://docs.green-coding.io/docs/measuring/measurement-cluster/
- Turning off TurboBoost, HyperThreading and C-States below C0
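
Why a longer idle phase stabilises the offset can be illustrated with the standard error of the mean, which shrinks as the number of samples grows. The sketch below uses simulated idle power samples with independent Gaussian noise; the noise model, the 4 W level and all names are assumptions made for this illustration, and real power samples are usually correlated, so the effect in practice will be weaker.

```python
import math
import random
import statistics


def idle_offset(samples_w):
    """Return (mean idle power, standard error of that mean)."""
    mean = statistics.mean(samples_w)
    sem = statistics.stdev(samples_w) / math.sqrt(len(samples_w))
    return mean, sem


rng = random.Random(0)  # fixed seed so the sketch is repeatable


def simulate_idle_phase(n_samples):
    # Hypothetical idle power around 4 W with 0.2 W of noise per sample.
    return [rng.gauss(4.0, 0.2) for _ in range(n_samples)]


short_mean, short_sem = idle_offset(simulate_idle_phase(30))
long_mean, long_sem = idle_offset(simulate_idle_phase(300))
print(f"short idle phase: {short_mean:.3f} W +/- {short_sem:.3f}")
print(f"long idle phase:  {long_mean:.3f} W +/- {long_sem:.3f}")
```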

The machine is very representative of a classic shared server machine in the cloud now.

To validate, I also chose a load test with JMeter, which shows similar values:

https://metrics.green-coding.io/compare.html?ids=2d7f0b5c-5b05-4288-b3d1-ab373e27affa,5740e368-74f4-4371-b13a-5d469e84637c,84ec9ad7-e9c4-49e6-a35d-fbbf9249d228,efa5d3de-3bf3-47d5-a177-e4eecf84e19d,a2ccc511-6350-4f6c-8618-1f7bcf2f3bbb,044b1005-0f20-4f7f-b786-5749e9c074e6,6a0cf413-69c3-4ac5-b41d-11b352c411e1,86dc553e-32d4-4010-ad34-832171330a48

IMHO this PR is now ready to merge. @mrchrisadams let me know if you have any remarks, then I will also bring this functionality to the GitHub Codespace.

mrchrisadams commented 1 month ago

Excellent news - thanks @ArneTR !

Can I check which machine this is running on now?

I'm assuming it's this one, and that would be helpful for fielding any questions in the workshop on Friday.

CO2 Benchmarking (DVFS OFF, TB OFF, HT OFF) - TX1330 M2
Use Case: For benchmarking of a software where configuration is tuned for reproducibility
Vendor: Fujitsu TX1330 M2
OS: Ubuntu 24.04 (NOP Linux)
Type: Single-Tenant Server
CPU: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz
Cores: 4
Threads: 4
Hyper-Threading: Off
Turbo Boost: Off
DVFS: Off (Fixed to 2.1 GHz)
C-States: C0 only
Memory: 8 GB
Sample measurement with machine specs
Metrics Provider for Machine Power: MCP39F511N

mrchrisadams commented 1 month ago

Otherwise, I can see this PR being useful already, and would find it very helpful to have available. I'd be very happy for it to be merged in.

ArneTR commented 1 month ago

Correct, the "CO2 Benchmarking" machine is the one to choose. The feature is also activated on the "CO2 profiling" machine, but values will not be as reliable there (just in case you run into congestion because of too many submitted tests :) )

I have further reduced the boot-up time of our machines to < 5 minutes. Once they are up, they will pick up tests instantly.

davidkopp commented 1 month ago

Fantastic! Thanks Arne for implementing this 😀

Regarding the implementation, the following thoughts came to my mind:

- Container power is only calculated if machine power is available (either mcp or xgboost is required, right?). So RAPL is not sufficient, which is a difference to Scaphandre.
- Scaphandre calculates the power per process for each jiffy (small time period) and sums it up in the end. Your implementation calculates the power per container in the end using the total power consumption and the average CPU utilization over the whole time period. I would assume this makes a difference, e.g. because of energy proportionality.
- As you have already mentioned at other places, the usage of CPU instructions (like Kepler does it) instead of time / utilization would be preferable. Is the reason that you don't use CPU instructions the complexity and the effort needed to implement it?

One question regarding the frontend: Is it planned to add charts to the stats page of a single run that displays the results for container power and container energy?

btw: I also like the new naming and the description of the machines of the measurement cluster. Now it is easier to decide which of the machines is suitable for the measurement I want to run.

ArneTR commented 1 month ago

> Fantastic! Thanks Arne for implementing this 😀
>
> Regarding the implementation, the following thoughts came to my mind:
>
> Container power is only calculated if machine power is available (either mcp or xgboost is required, right?). So RAPL is not sufficient, which is a difference to Scaphandre.

Correct!

> Scaphandre calculates the power per process for each jiffy (small time period) and sums it up in the end. Your implementation calculates the power per container in the end using the total power consumption and the average CPU utilization over the whole time period. I would assume this makes a difference, e.g. because of energy proportionality.

In theory, if you could actually measure it, then yes. But since neither Scaphandre nor we account for energy proportionality, "integrating" or using the average would get you the same value. If you chained Cloud Energy with every CPU% value, then you would account for the energy proportionality, but you would have an estimation instead of a linearly attributed measurement.
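
One reading of this reply can be checked numerically: if the above-idle power is assumed to be strictly proportional to the total CPU utilization (a linear model, i.e. no energy proportionality effects), then attributing per interval and summing gives the same container energy as attributing once from the averages. The sketch below is an illustration under that assumption; all names and numbers are hypothetical and not taken from GMT or Scaphandre.

```python
def energy_per_interval(container_util, system_util, overhead_w, dt_s):
    """Attribute the overhead power in every interval, then sum up."""
    return sum(
        c / s * p * dt_s
        for c, s, p in zip(container_util, system_util, overhead_w)
        if s > 0  # skip intervals without any CPU activity
    )


def energy_from_averages(container_util, system_util, overhead_w, dt_s):
    """Attribute once, using averages over the whole period."""
    n = len(container_util)
    mean_c = sum(container_util) / n
    mean_s = sum(system_util) / n
    mean_p = sum(overhead_w) / n
    return mean_c / mean_s * mean_p * n * dt_s


# Illustrative utilization traces in percent, one sample per second.
system_util = [10.0, 40.0, 80.0, 25.0]
container_util = [5.0, 30.0, 20.0, 25.0]

# Assumed linear model: 0.2 W of overhead per percent of system utilization.
linear_overhead_w = [0.2 * u for u in system_util]
linear_summed = energy_per_interval(container_util, system_util, linear_overhead_w, 1.0)
linear_averaged = energy_from_averages(container_util, system_util, linear_overhead_w, 1.0)

# Assumed non-linear model (square root) to show where the two diverge.
curved_overhead_w = [u ** 0.5 for u in system_util]
curved_summed = energy_per_interval(container_util, system_util, curved_overhead_w, 1.0)
curved_averaged = energy_from_averages(container_util, system_util, curved_overhead_w, 1.0)
```

With the linear model both approaches agree. With the curved model they differ, which is the case where per-interval accounting would start to matter.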

> As you have already mentioned at other places, the usage of CPU instructions (like Kepler does it) instead of time / utilization would be preferable. Is the reason that you don't use CPU instructions the complexity and the effort needed to implement it?

Yep, that is considerably more complex, but doable. We will try to use the Green Kernel Plugin that we are creating to obtain this value. Happy if you say hello in the repository and also raise questions / wishes there if you have any! https://github.com/green-kernel

> One question regarding the frontend: Is it planned to add charts to the stats page of a single run that displays the results for container power and container energy?

Not planned atm. The reason is that the value is not the best metric to work with in the first place, and for smaller chunks of time it becomes even more bogus. It can give a nice orientation as an average value, but for small time chunks I foresee it not being really usable, as kernel time tracking is also not guaranteed to have such a high resolution. Having said that: I am open to bringing it in after a proper analysis, but this will take some time.

> btw: I also like the new naming and the description of the machines of the measurement cluster. Now it is easier to decide which of the machines is suitable for the measurement I want to run.

ty!