Boavizta / boaviztapi

🛠 Giving access to BOAVIZTA reference data and methodologies through a RESTful API
GNU Affero General Public License v3.0

Estimate RAM consumption profile from discrete workload data points #115

Closed samuelrince closed 2 years ago

samuelrince commented 2 years ago

Problem

The electrical consumption of memory (RAM) components depends on the workload. We want to estimate the following function:

$$ CP_{ram} : \mathnormal{workload} \in [0,1] \mapsto \mathnormal{power} \in \mathbb{R}^{+} $$

In our context we can make the following assumptions:

This issue is related to #86 where we solved the same problem for CPU consumption profiles.

Solution

Implement a new consumption profile model dedicated to RAM components that will infer the parameters of the consumption profile function described above. It should take into account any relevant features to estimate these parameters.

Possible features:

Using workload data, we can apply a regression algorithm using log-like functions for instance. The categorical features can help select fine-tuned initial parameters for the regression. If no workload is provided, we can use the fine-tuned model as an output.
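As an illustration of the regression idea, here is a minimal sketch, not the project's implementation. It assumes one particular log-like shape, `power = a * ln(1 + k * workload) + b`, with `k` fixed in advance so that the fit reduces to ordinary least squares on `a` and `b`. The function names, the choice of shape and the value of `k` are all hypothetical.

```python
import math


def fit_log_profile(workloads, powers, k=100.0):
    """Least-squares fit of power ~= a * ln(1 + k * workload) + b.

    k is a fixed, assumed shape parameter. With k fixed, the model is
    linear in (a, b), so the least-squares solution has a closed form.
    """
    if len(workloads) != len(powers) or len(workloads) < 2:
        raise ValueError("need at least two (workload, power) points")
    # Transform the workload axis so the model becomes a straight line.
    xs = [math.log(1.0 + k * w) for w in workloads]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(powers) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("workload points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, powers))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(workload, a, b, k=100.0):
    """Evaluate the fitted profile at a workload in [0, 1]."""
    return a * math.log(1.0 + k * workload) + b


# Synthetic points generated from known parameters (not measured data).
true_a, true_b = 10.0, 5.0
points = [0.0, 0.1, 0.5, 1.0]
watts = [predict(w, true_a, true_b) for w in points]
a, b = fit_log_profile(points, watts)
```

When no workload points are available, the same `predict` function could be called with default `(a, b)` values chosen from the categorical features, which is the fallback described above.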

Additional context or elements

samuelrince commented 2 years ago

After some more digging on the data here is what I found out.

First, consumption profiles from the "same type" of instances look similar (at least for c5 instances), so the instance type is useful information to collect.

[Image: Screen Shot 2022-09-01 at 7 19 00 PM]

Second, consumption profiles from different servers with the same memory capacity also look similar.

[Image: Screen Shot 2022-09-01 at 7 21 25 PM]
[Image: Screen Shot 2022-09-01 at 7 21 10 PM]

In the last one, there is a difference of roughly 40 watts between the max and min.

There are consumption profiles that are almost constant given any workload:

Are these outliers?

We also see a big difference between using the CPU stress test and other types of stress tests (that put more pressure on the memory).

| product_name | ramwatt_cpu_stress_100 | ramwatt_vmstress_100 | diff_vmstress_percent | ramwatt_maximize_100 | diff_maximize_percent |
|---|---|---|---|---|---|
| c5n.metal | 85.0 | 153.0 | +80% | 169.0 | +99% |
| c5.metal | 90.0 | 210.0 | +133% | 210.0 | +133% |
| c5.metal* | 94.0 | 214.0 | +128% | 210.0 | +123% |
| r5.metal | 277.0 | 510.0 | +84% | 510.0 | +84% |
| m5.metal | 132.0 | 360.0 | +173% | 384.0 | +191% |
| z1d.metal | 111.0 | 256.0 | +131% | 230.0 | +107% |
| m5zn.metal | 89.0 | 184.0 | +107% | 160.0 | +80% |
| i3.metal | 22.0 | 52.0 | +136% | 57.0 | +159% |
| GP-BM1-S | 2.6 | 5.4 | +108% | 3.8 | +46% |
| HC-BM1-XS | 28.0 | 68.0 | +143% | 62.0 | +121% |
| HC-BM1-L | 104.0 | 216.0 | +108% | 196.0 | +88% |
| Lenovo ST550 | 72.0 | 131.0 | +82% | 108.0 | +50% |
| c3.small.x86 | 3.0 | 6.1 | +103% | 5.4 | +80% |
| s3.xlarge.x86 | 55.0 | 142.0 | +158% | 110.0 | +100% |
| n2.xlarge.x86 | 99.0 | 173.0 | +75% | 230.0 | +132% |
| 2xIntelGold6230R | 43.0 | 151.0 | +251% | 143.0 | +233% |
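The percentage columns appear to be the relative increase of each memory-oriented stress test over the CPU stress test at 100% load. A small sketch of that arithmetic, using two rows copied from the table (the helper name `percent_increase` is hypothetical):

```python
def percent_increase(baseline_watts, stressed_watts):
    """Relative increase of stressed_watts over baseline_watts, in percent."""
    if baseline_watts <= 0:
        raise ValueError("baseline must be positive")
    return (stressed_watts - baseline_watts) / baseline_watts * 100.0


# (cpu_stress, vmstress, maximize) RAM power in watts, copied from the table.
rows = {
    "c5n.metal": (85.0, 153.0, 169.0),
    "r5.metal": (277.0, 510.0, 510.0),
}

for name, (cpu, vmstress, maximize) in rows.items():
    print(
        f"{name}: vmstress {percent_increase(cpu, vmstress):+.0f}%, "
        f"maximize {percent_increase(cpu, maximize):+.0f}%"
    )
```

For c5n.metal this gives +80% and +99%, matching the diff_vmstress_percent and diff_maximize_percent columns.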

This makes me question the relevance of estimating the RAM consumption profile from CPU workload. I guess this may work most of the time for average processing workloads, but if we use a server for a more specific type of workload, like databases or VMs, the profile we estimate will be way off the underlying reality.

So at a first glance we can say that:

Also, I haven't found a lot of data on the memory itself (manufacturer, launch year, etc.). There is some information in the spreadsheet, but not enough to estimate anything based on these potential variables.

@github-benjamin-davy I haven't found anything on the type of memory bank used in cloud instances, so if you have information on that, I am interested.

samuelrince commented 2 years ago

Some graphs on the potential outliers:

Consumption at 0% is weirdly high compared to others.

[Image: Screen Shot 2022-09-01 at 8 07 43 PM]
[Image: Screen Shot 2022-09-01 at 8 10 30 PM]

github-benjamin-davy commented 2 years ago

Hello @samuelrince, thanks a lot for this work! I fully agree with your conclusions: memory consumption will vary a lot depending on the type of workload, and there is some form of efficiency (in consumption per GB of memory) with newer machines that use dense memory DIMMs. The outliers we see could be related to architectures that didn't properly support RAM consumption reporting with RAPL, so on my side I didn't consider them. I also think that the idle measurement might have some limitations with RAPL (@bpetit do you have feedback on this?). Ideally, we would need more measurements on other hardware, especially with wattmeters. Regarding the memory bank info for cloud hardware, I started to collect it in the spreadsheet you linked, using the dmidecode -t memory command on bare metal machines. I guess the number of DIMMs should be consistent per generation and memory quantity.

da-ekchajzer commented 2 years ago

Thank you for your work @samuelrince and your feedback @github-benjamin-davy.

@samuelrince: Do you think we could implement a first "dumb" consumption profile for RAM that generates/uses a profile based only on the RAM quantity (based on the RAM stress test?), or at least a fixed factor proportional to the RAM quantity?

@github-benjamin-davy: To collect more measurements on a variety of hardware, we should let the community repeat the work you have done on their own servers. Why not adapt your code during a hackathon so that it writes to an open database?

I was thinking that the feature we are developing could interest various international organizations such as Cloud Carbon Footprint, the SDIA and the GSF. They could also make a call to their own communities.

github-benjamin-davy commented 2 years ago

Great idea @da-ekchajzer! Centralizing these measurements and having some dataviz to compare several tests could be helpful as well. Ideally, it would also be nice to add support for AMD machines.

samuelrince commented 2 years ago

> @samuelrince: Do you think we could implement a first "dumb" consumption profile for RAM that generates/uses a profile based only on the RAM quantity (based on the RAM stress test?), or at least a fixed factor proportional to the RAM quantity?

Thanks @da-ekchajzer, I'll make a prototype as you said and in the future if we gather more data on RAM consumption profiles (with categorical features) we will update the feature accordingly.

samuelrince commented 2 years ago

As discussed, I've finished the implementation of the RAM consumption profile as a simple constant function (independent of the workload). The model is determined using RAM capacity only. Let's review this feature when we have more data!
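A constant profile of this kind can be sketched as follows. This is an illustration only, not the code that was merged: the class name, the assumption that power scales linearly with capacity, and the `watts_per_gb` value used in the example are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstantRamProfile:
    """RAM consumption profile that ignores the workload.

    Assumes power is proportional to capacity; watts_per_gb is a
    hypothetical coefficient that would have to come from measurements.
    """

    capacity_gb: float
    watts_per_gb: float

    def power(self, workload: float) -> float:
        """Return the estimated power in watts for a workload in [0, 1]."""
        if not 0.0 <= workload <= 1.0:
            raise ValueError("workload must be in [0, 1]")
        # The workload is validated but deliberately unused: the profile is flat.
        return self.capacity_gb * self.watts_per_gb


# Placeholder coefficient, chosen only to make the example concrete.
profile = ConstantRamProfile(capacity_gb=32, watts_per_gb=0.5)
idle_watts = profile.power(0.0)
full_watts = profile.power(1.0)
```

Keeping the `power(workload)` signature means a workload-dependent profile could later replace this one without changing the callers.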

da-ekchajzer commented 2 years ago

Can we close this issue, @samuelrince?