Closed samuelrince closed 2 years ago
After some more digging on the data here is what I found out.
First, consumption profiles from the "same type" of instances look similar (at least for c5 instances), so the instance type is useful information to collect.
Second, consumption profiles from different servers with the same memory capacity also look similar.
In the last one, there is a difference of roughly 40 W between the max and min.
There are consumption profiles that are almost constant given any workload:
Are these outliers?
We also see a big difference between using the CPU stress test and other types of stress tests (that put more pressure on the memory).
product_name | ramwatt_cpu_stress_100 | ramwatt_vmstress_100 | diff_vmstress_percent | ramwatt_maximize_100 | diff_maximize_percent |
---|---|---|---|---|---|
c5n.metal | 85.0 | 153.0 | +80% | 169.0 | +99% |
c5.metal | 90.0 | 210.0 | +133% | 210.0 | +133% |
c5.metal* | 94.0 | 214.0 | +128% | 210.0 | +123% |
r5.metal | 277.0 | 510.0 | +84% | 510.0 | +84% |
m5.metal | 132.0 | 360.0 | +173% | 384.0 | +191% |
z1d.metal | 111.0 | 256.0 | +131% | 230.0 | +107% |
m5zn.metal | 89.0 | 184.0 | +107% | 160.0 | +80% |
i3.metal | 22.0 | 52.0 | +136% | 57.0 | +159% |
GP-BM1-S | 2.6 | 5.4 | +108% | 3.8 | +46% |
HC-BM1-XS | 28.0 | 68.0 | +143% | 62.0 | +121% |
HC-BM1-L | 104.0 | 216.0 | +108% | 196.0 | +88% |
Lenovo ST550 | 72.0 | 131.0 | +82% | 108.0 | +50% |
c3.small.x86 | 3.0 | 6.1 | +103% | 5.4 | +80% |
s3.xlarge.x86 | 55.0 | 142.0 | +158% | 110.0 | +100% |
n2.xlarge.x86 | 99.0 | 173.0 | +75% | 230.0 | +132% |
2xIntelGold6230R | 43.0 | 151.0 | +251% | 143.0 | +233% |
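The percentage columns above can be reproduced from the raw wattages. The snippet below is a minimal sketch that assumes each `diff_*_percent` value is the relative increase over `ramwatt_cpu_stress_100`, rounded to a whole percent. The helper name `percent_increase` is made up for illustration, and only three rows of the table are copied in.

```python
def percent_increase(baseline_watts: float, stressed_watts: float) -> int:
    """Relative increase over the CPU-stress baseline, rounded to a whole percent."""
    return round((stressed_watts - baseline_watts) / baseline_watts * 100)


# (product_name, ramwatt_cpu_stress_100, ramwatt_vmstress_100, ramwatt_maximize_100)
# Values copied from the table above.
rows = [
    ("c5n.metal", 85.0, 153.0, 169.0),
    ("r5.metal", 277.0, 510.0, 510.0),
    ("2xIntelGold6230R", 43.0, 151.0, 143.0),
]

for name, cpu, vmstress, maximize in rows:
    print(
        f"{name}: vmstress {percent_increase(cpu, vmstress):+d}%, "
        f"maximize {percent_increase(cpu, maximize):+d}%"
    )
```

Under that assumption the recomputed values match the table for these rows.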
This makes me question the relevance of estimating the RAM consumption profile from CPU workload. I guess this may work most of the time for average processing workloads, but if we use a server for a more specific type of workload, like databases or VMs, the profile we estimate will be way off the underlying reality.
So at first glance we can say that:
Also, I haven't found a lot of data on the memory itself (manufacturer, launch year, etc.). There is some information in the spreadsheet, but not enough to estimate anything based on these potential variables.
@github-benjamin-davy I haven't found anything on the type of memory bank used in cloud instances so if you have information on that I am interested.
Some graphs on the potential outliers:
Consumption at 0% is weirdly high compared to others.
Hello @samuelrince, thanks a lot for this work! I fully agree with your conclusions: memory consumption will vary a lot depending on the type of workload, and there is some form of efficiency gain (consumption per GB of memory) on newer machines with dense memory DIMMs.
The outliers we see could be related to architectures that didn't properly support RAM consumption reporting with RAPL, so on my side I didn't consider them. I also think that the idle measurement might have some limitations with RAPL (@bpetit do you have feedback on this?). Ideally, we would need more measurements on other hardware, especially with wattmeters.
Regarding the memory bank info for cloud hardware, I started to collect it in the spreadsheet you linked, using the `dmidecode -t memory` command on bare metal machines. The number of DIMMs should be consistent per generation and memory quantity, I guess.
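To turn that command's output into a DIMM count and total capacity, something like the sketch below could work. It assumes each populated slot reports a line of the form `Size: <n> MB` or `Size: <n> GB`, and that empty slots report a non-numeric size. The sample text is a hand-written excerpt for illustration, not a real capture, and `populated_dimms_gb` is a made-up helper name.

```python
import re

# Hand-written excerpt in the style of `dmidecode -t memory` output.
# Illustrative only: this is NOT a capture from a real machine.
SAMPLE = """\
Memory Device
\tSize: 16 GB
\tType: DDR4
Memory Device
\tSize: 16384 MB
\tType: DDR4
Memory Device
\tSize: No Module Installed
"""

# Match only lines that start with "Size:" followed by a number and a unit,
# so empty slots ("No Module Installed") are skipped.
SIZE_RE = re.compile(r"^\s*Size:\s*(\d+)\s*(MB|GB)\s*$", re.MULTILINE)


def populated_dimms_gb(text: str) -> list[float]:
    """Return the size in GB of every populated DIMM found in the text."""
    sizes = []
    for value, unit in SIZE_RE.findall(text):
        sizes.append(int(value) / 1024 if unit == "MB" else float(value))
    return sizes


dimms = populated_dimms_gb(SAMPLE)
print(f"{len(dimms)} DIMMs, {sum(dimms):.0f} GB total")
```

On a real machine the text would come from running the command with root privileges, and the exact fields may differ between `dmidecode` versions.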
Thank you for your work @samuelrince and your feedback @github-benjamin-davy.
@samuelrince : Do you think that we could implement a first "dumb" consumption profile for RAM that generates/uses a profile based only on the RAM quantity (based on the RAM stress test?). Or at least a fixed factor proportional to the RAM quantity?
@github-benjamin-davy : To collect more measurements on a variety of hardware, we should let the community repeat the work that you have done on their own servers. Why not adapt your code during a hackathon to write to an open database?
I was thinking that the feature we are developing could interest different international organizations such as Cloud Carbon Footprint, the SDIA and the GSF. They could also make a call to their own communities.
Great idea @da-ekchajzer! Centralizing these measures and having some dataviz to compare several tests could be helpful as well. Ideally, it would also be nice to add support for AMD machines.
Thanks @da-ekchajzer, I'll make a prototype as you said and in the future if we gather more data on RAM consumption profiles (with categorical features) we will update the feature accordingly.
Following what we discussed, I've finished the implementation of the RAM consumption profile as a simple constant function (independent of the workload). The model is determined using RAM capacity only. Let's review this feature when we have more data!
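For readers who want to picture this, here is a minimal sketch of what a constant, capacity-only profile could look like. It is not the code that was merged: the class name `ConstantRamProfile`, the `watts_per_gb` parameter, and the value used below are all hypothetical, and a real factor would have to come from the measurements discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstantRamProfile:
    """Workload-independent RAM power model driven by capacity only (illustrative)."""

    capacity_gb: float
    watts_per_gb: float  # hypothetical factor; must be derived from measurements

    def power(self, workload: float) -> float:
        """Return the estimated power in watts for a workload in [0, 1]."""
        if not 0.0 <= workload <= 1.0:
            raise ValueError("workload must be in [0, 1]")
        # Constant model: the workload is validated but does not change the result.
        return self.capacity_gb * self.watts_per_gb


# Arbitrary placeholder factor, NOT a measured value.
PLACEHOLDER_WATTS_PER_GB = 1.0

profile = ConstantRamProfile(capacity_gb=64, watts_per_gb=PLACEHOLDER_WATTS_PER_GB)
print(profile.power(0.0), profile.power(0.5), profile.power(1.0))
```

Keeping `workload` in the signature means a workload-dependent model could replace this one later without changing callers.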
Can we close this issue @samuelrince ?
Problem
The electrical consumption of memory (RAM) components depends on the workload. We want to estimate the following function:
$$ CP_{ram} : \mathnormal{workload} \in [0,1] \mapsto \mathnormal{power} \in \mathbb{R}^{+} $$
In our context we can assess the following hypotheses:
This issue is related to #86 where we solved the same problem for CPU consumption profiles.
Solution
Implement a new consumption profile model dedicated to RAM components that will infer the parameters of the consumption profile function described above. It should take into account any relevant features to estimate these parameters.
Possible features:
Using workload data, we can apply a regression algorithm using log-like functions for instance. The categorical features can help select fine-tuned initial parameters for the regression. If no workload is provided, we can use the fine-tuned model as an output.
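As a rough illustration of the regression idea, the sketch below fits one possible log-like shape, `power = a * ln(1 + b * workload) + c`, using only the standard library. The functional form, the grid over `b`, and the helper names are assumptions made for this example, not the model chosen in this issue. The data points are synthetic, generated from known parameters to check that the fit recovers them. They are not measurements.

```python
import math


def fit_linear(zs, ys):
    """Ordinary least squares for y = a * z + c; returns (a, c)."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    var_z = sum((z - mean_z) ** 2 for z in zs)
    a = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / var_z
    return a, mean_y - a * mean_z


def fit_log_profile(workloads, powers, b_grid=None):
    """Fit power = a * ln(1 + b * workload) + c.

    For each candidate b the problem is linear in (a, c), so we solve it in
    closed form and keep the b with the smallest sum of squared errors.
    """
    if b_grid is None:
        b_grid = [0.5 * k for k in range(1, 201)]  # 0.5, 1.0, ..., 100.0
    best = None
    for b in b_grid:
        zs = [math.log1p(b * w) for w in workloads]
        a, c = fit_linear(zs, powers)
        sse = sum((a * z + c - y) ** 2 for z, y in zip(zs, powers))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


# Synthetic data generated from a=10, b=20, c=5 (not real measurements).
workloads = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
powers = [10 * math.log1p(20 * w) + 5 for w in workloads]

a, b, c = fit_log_profile(workloads, powers)
print(f"a={a:.3f}, b={b:.1f}, c={c:.3f}")
```

In this sketch, categorical features could be used to narrow `b_grid` or to supply default `(a, b, c)` values when no workload data is available.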
Additional context or elements