Closed ArneTR closed 1 month ago
Wow @ArneTR that was fast - we only discussed it a few days ago! I'll look into this in detail tomorrow
Just to give a peek at how jittery these values are:
Although there is some network I/O making everything a bit unpredictable in the first place, the total machine energy deviates by less than 1%.
The containers deviate by up to 30%, though!
The reason is that the machine used is representative of a user system and thus has DVFS, TurboBoost, and HyperThreading turned on. This makes CPU utilization quite flaky.
See also our case study here: https://www.green-coding.io/case-studies/cpu-utilization-usefulness/
Eco-CI Output:

| Label | avg. CPU utilization [%] | Total Energy [Joules] | avg. Power [Watts] | Duration [Seconds] |
|---|---|---|---|---|
| Total Run | 22.4451 | 1515.15 | 3.45924 | 446 |
| Measurement #1 | 22.6363 | 1515.15 | 3.45924 | 438 |
Energy graph: [ASCII chart of Watts over time; the y-axis spans roughly 1.77 W to 8.18 W]
CO2 Data: City: Boydton, Lat: 36.677696, Lon: -78.37471
Carbon Intensity for this location: 362 gCO₂eq/kWh
SCI: 0.548484 gCO₂eq / pipeline run emitted
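As a quick sanity check on the Eco-CI numbers above, here is a minimal sketch that recomputes the average power and the operational emissions from the reported energy, duration, and carbon intensity. The variable names are illustrative and not part of Eco-CI; the only inputs are values taken from the output above.

```python
# Values copied from the Eco-CI output above.
total_energy_j = 1515.15      # Total Energy [Joules]
duration_s = 438              # Duration of Measurement #1 [Seconds]
carbon_intensity = 362        # gCO2eq/kWh for the reported location

JOULES_PER_KWH = 3_600_000    # 1 kWh = 3.6e6 J

# Average power is energy divided by time.
avg_power_w = total_energy_j / duration_s

# Operational emissions: energy in kWh times grid carbon intensity.
energy_kwh = total_energy_j / JOULES_PER_KWH
operational_g = energy_kwh * carbon_intensity

print(f"avg. power: {avg_power_w:.4f} W")
print(f"energy: {energy_kwh:.6f} kWh")
print(f"operational emissions: {operational_g:.4f} gCO2eq")
```

The average power agrees with the table when the 438 s duration is used. The operational part comes out at roughly 0.152 gCO₂eq, which is lower than the reported SCI of 0.548484; presumably the SCI value includes a further term such as embodied emissions, but that is an assumption and not something the output itself states.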
Another update on this: I tuned our servers and installed a new machine that is more suitable for this type of benchmarking, and I am now seeing values which I deem "acceptable".
I got the StdDev for repeated measurements down to ~5%
This was achieved by:
The machine is very representative of a classic shared server machine in the cloud now.
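For readers who want to run the same kind of check on their own repeated measurements, here is a minimal sketch. It assumes the ~5% figure refers to the sample standard deviation relative to the mean; the function name and the sample values are hypothetical and are not data from this PR.

```python
from statistics import mean, stdev


def relative_stddev_percent(values):
    """Return the sample standard deviation as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    return stdev(values) / mean(values) * 100.0


# Hypothetical container energy readings in Joules from repeated runs
# (illustrative only, not real measurements).
runs = [100.0, 104.0, 96.0, 102.0, 98.0]
print(f"relative StdDev: {relative_stddev_percent(runs):.2f} %")
```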
To validate, I also chose a load test with JMeter, which shows similar values:
IMHO this PR is now ready to merge. @mrchrisadams let me know if you have any remarks, then I will bring this functionality also to the Github Codespace.
Excellent news - thanks @ArneTR !
Can I check which machine this is running on now?
I'm assuming it's this one, and that would be helpful for fielding any questions in the workshop on Friday.
CO2 Benchmarking (DVFS OFF, TB OFF, HT OFF) - TX1330 M2
Use Case: For benchmarking of software where the configuration is tuned for reproducibility
Vendor: Fujitsu TX1330 M2
OS: Ubuntu 24.04 (NOP Linux)
Type: Single-Tenant Server
CPU: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz
Cores: 4
Threads: 4
Hyper-Threading: Off
Turbo Boost: Off
DVFS: Off (Fixed to 2.1 GHz)
C-States: C0 only
Memory: 8 GB
Sample measurement with machine specs
Metrics Provider for Machine Power: MCP39F511N
Otherwise, I can see this PR being useful already, and would find it very helpful to have available. I'd be very happy for it to be merged in.
Correct, the "CO2 Benchmarking" machine is the one to choose. The feature is also activated on the "CO2 profiling" machine, but the values there will not be as reliable (just in case you run into congestion because too many tests were submitted :) )
I have further reduced the boot up time of our machines to < 5 Minutes. Once they are up they will pick up tests instantly.
Fantastic! Thanks Arne for implementing this
Regarding the implementation, the following thoughts came to my mind:
One question regarding the frontend: Is it planned to add charts to the stats page of a single run that displays the results for container power and container energy?
btw: I also like the new naming and the description of the machines of the measurement cluster. Now it is easier to decide which of the machines is suitable for the measurement I want to run.
> Fantastic! Thanks Arne for implementing this
> Regarding the implementation, the following thoughts came to my mind:
> - Container power is only calculated, if machine power is available (either mcp or xgboost is required, right?). So RAPL is not sufficient, a difference to Scaphandre.

Correct!

> - Scaphandre calculates the power per process for each jiffy (small time period) and sums it up in the end. Your implementation calculates the power per container in the end using the total power consumption and the average CPU utilization over the whole time period. I would assume this makes a difference, e.g. because of energy proportionality.

In theory, if you could actually measure it, then yes. But since neither Scaphandre nor we account for energy proportionality, "integrating" or using the average would get you the same value. If you chained Cloud Energy with every CPU% value, you would account for energy proportionality, but you would have an estimation instead of a linearly attributed measurement.
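To illustrate the point about linear attribution, here is a minimal sketch comparing the two approaches. It is not the GMT or Scaphandre implementation: the function names and sample values are hypothetical, and it assumes constant machine power across equal-length intervals and a container utilization expressed as a fraction of the whole machine.

```python
import math


def energy_from_average(machine_energy_j, utilization_samples):
    """Attribute energy using the average CPU utilization over the whole run."""
    avg_util = sum(utilization_samples) / len(utilization_samples)
    return machine_energy_j * avg_util


def energy_per_interval(machine_power_w, interval_s, utilization_samples):
    """Attribute energy interval by interval and sum up the parts."""
    return sum(machine_power_w * interval_s * u for u in utilization_samples)


# Hypothetical values (not measurements from this PR).
utilization = [0.10, 0.40, 0.25, 0.05]  # container share of machine CPU per interval
power_w = 4.0                           # assumed constant machine power
interval_s = 1.0                        # assumed equal-length sampling intervals
machine_energy_j = power_w * interval_s * len(utilization)

from_average = energy_from_average(machine_energy_j, utilization)
from_intervals = energy_per_interval(power_w, interval_s, utilization)
print(from_average, from_intervals, math.isclose(from_average, from_intervals))
```

Under these assumptions both functions attribute the same energy to the container, because the attribution is linear in utilization.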
> - As you have already mentioned at other places, the usage of CPU instructions (like Kepler does it) instead of time / utilization would be preferable. Is the reason that you don't use CPU instructions the complexity and the effort needed to implement it?

Yep, that is quite a bit more complex. But doable. We will try to use the Green Kernel Plugin that we are creating to procure this value. Happy if you say hello in the repository and also raise questions / wishes there if you have any! https://github.com/green-kernel

> One question regarding the frontend: Is it planned to add charts to the stats page of a single run that displays the results for container power and container energy?

Not planned atm. The reason is that the value is not the best metric to work with in the first place, and for smaller chunks of time it becomes even more bogus. It can give a nice orientation as an average value, but for small time chunks I foresee it not being really usable, as kernel time tracking is also not guaranteed to have such a high resolution. Having said that: I am open to bringing it in after a proper analysis, but this will take some time.

> btw: I also like the new naming and the description of the machines of the measurement cluster. Now it is easier to decide which of the machines is suitable for the measurement I want to run.

ty!
This PR brings a per-container power estimation feature to the GMT.
The approach is similar in its idea to how Scaphandre does it:
@mrchrisadams - Is that what you were looking for?
@davidkopp - Also would love your feedback on this as we have talked about this before.
A bit more context: Initially the philosophy of the GMT was not to have these values in, as they provide limited insights. We have written a more detailed piece on this here:
As some of you know, we have been funded by Mozilla to make a kernel Energy Plugin - https://github.com/orgs/green-kernel/discussions/1 - Say hi in the thread if you like!
Since the plugin is coming soon, we wanted to have some comparative functionality in GMT already to see how heavily the values will deviate. Still, the given caveats apply. We will also add some documentation on this soon.
Would love your feedback on this!
Demo measurement