Closed. LedgeDash closed this issue 5 years ago.
There are larger cloudlab machines as well... particularly I think in the clemson cluster. If we need it.
@alevy would we need it do you think?
Depends on the numbers. Generally more is probably better in this case, but only if we have the workload to support it. Let's see how the first set of results play out.
I'm trying to implement this, but I'm not sure how to set the CPU share for each VM process and the number of vCPUs for each VM. If I understand correctly, there are two separate variables we can set:

1. the CPU share for the VM process, controlled through cgroups, and
2. `vcpu_count` in the `VmConfig` struct.

The two seem to be independent of each other (?).
My question is: for a machine with 164GB of memory and 40 cores, how do we set the cgroup CPU share such that a 1792MB VM gets the equivalent of one full vCPU (one vCPU-second of credits per second)? Does this mean its CPU share should be 1/40, i.e., one hyperthread? If so, what happens when there are more than 40 VMs of 1792MB running on the system, since there's enough memory to do this?
There are two choices we make:
Neither of these is really related to the size of the machine, though the size of the machine does sort of impact how many VMs we can run concurrently while still providing the CPU shares we're guaranteeing.
So first, high level math.
Assume a 164GB machine with 40 cores (80 hyperthreads). To make the math work out better, and also to reserve some memory for the rest of the system, let's actually use 160GB (which divides evenly both by 80 hyperthreads and by 128MB, and leaves us with a respectable 4GB for non-workload tasks, like the filesystem cache).
160GB / 80 hyperthreads = 2GB per hyperthread, so a VM with 2GB should get a full hyperthread's worth of CPU.
Note that this is also true for an 80GB machine with 40 hyperthreads. The important part is the ratio between memory and CPU on the machine, not the absolute values.
Other VM configurations should scale proportionally, so a 128MB VM should get 1/16th the CPU share of a 2GB VM.
Now for settings:
Firecracker VCPUs is a step function.
Any VM <= 2GB should get 1 VCPU; > 2GB but <= 4GB, 2 VCPUs; > 4GB but <= 6GB, 3 VCPUs; etc. (If we're only going up to ~3GB, then obviously we only need to worry about 1 vs. 2 VCPUs.)
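A minimal sketch of that step function (the helper name is mine, not from Firecracker or the project's codebase):

```rust
/// Hypothetical helper: number of Firecracker VCPUs for a VM with `mem_mb`
/// MB of memory, one VCPU per started 2GB.
fn vcpu_count(mem_mb: u64) -> u64 {
    (mem_mb + 2047) / 2048 // integer ceiling of mem_mb / 2048
}

fn main() {
    assert_eq!(vcpu_count(1792), 1); // <= 2GB: 1 VCPU
    assert_eq!(vcpu_count(2048), 1);
    assert_eq!(vcpu_count(3008), 2); // > 2GB, <= 4GB: 2 VCPUs
    assert_eq!(vcpu_count(6144), 3); // > 4GB, <= 6GB: 3 VCPUs
}
```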
Share of the CPU is relative to other processes, on a scale from 0-100, so it's sufficient to use fixed terms based on the smallest unit:
- a 128MB VM gets 1 CPU share
- a 256MB VM gets 2 CPU shares
- ...
- a 2GB VM gets 16 CPU shares
- ...
- a 6GB VM gets 48 CPU shares
If we want to support intermediate memory sizes, we can't use fractions, but we can just scale all the values up, since they just need to be relative.
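One way to sketch that computation, including the scaled-up variant for intermediate sizes (function names are mine, not from the codebase):

```rust
/// Sketch: cgroup CPU shares with 128MB as the smallest unit
/// (assumes memory sizes that are multiples of 128MB).
fn cpu_shares(mem_mb: u64) -> u64 {
    mem_mb / 128 // 128MB -> 1 share, 2GB -> 16, 6GB -> 48
}

/// For intermediate sizes (e.g. Lambda's 1792MB), scale every value up by
/// 128x, i.e. use MB directly; cgroups only cares about the ratios.
fn cpu_shares_scaled(mem_mb: u64) -> u64 {
    mem_mb
}

fn main() {
    assert_eq!(cpu_shares(128), 1);
    assert_eq!(cpu_shares(2048), 16);
    assert_eq!(cpu_shares(6144), 48);
    // The 2GB-to-128MB ratio is 16 in both schemes.
    assert_eq!(cpu_shares_scaled(2048) / cpu_shares_scaled(128), 16);
}
```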
Just to make sure I fully understand, a few follow up questions:
> At 1,792 MB, a function has the equivalent of 1 full vCPU (one vCPU-second of credits per second).

The "1 full vCPU" here means one hyperthread's worth of CPU share.
This means that on a 20-core (40-hyperthread) machine, such a VM should get 1/40 of the CPU share through cgroups. Is this understanding correct?
Initially I was thinking about using the total-memory-to-number-of-hyperthreads ratio to decide what size VM should get a full hyperthread's share of CPU (which, if I understand correctly, is what you described in the comment above). However, this would mean that we won't follow exactly what Lambda does, which is that a 1792MB VM gets a full hyperthread. Would this be a problem? I can't think of anything, but I'm also inexperienced in predicting what reviewers might say.
So if I understand correctly, the algorithm would go something like this:
Correct
The spirit is the same; I'm assuming that the machines Lambdas run on have a different ratio of memory to CPU (or of available memory to available CPU, if the memory and CPU are also used for non-Lambda tasks). Note that we're not very far off. Lambda's ratio suggests that they have machines with ratios along the lines of 80 hyperthreads and 140GB of memory.
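As a back-of-the-envelope check on that figure (assuming 1792MB per hyperthread, the size Lambda gives a full vCPU):

```rust
fn main() {
    // 1792MB per hyperthread times 80 hyperthreads:
    let total_mb = 1792u64 * 80;
    assert_eq!(total_mb, 143_360);   // 143,360MB
    assert_eq!(total_mb / 1024, 140); // i.e. 140GB of memory
}
```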
...
Yes
Technically yes, but that's not really a coherent way of expressing CPU share to Linux's cgroups subsystem. You express it as a number between 0-100. If two processes both have the same number, they get an equal CPU share (i.e., half each). If three have the same number, they get an equal CPU share (i.e., a third each). So, literally what I described is what we should do: a 128MB VM gets a CPU "share" of 1, a 2GB VM gets a CPU share of 16.
Yes
Currently, when a function's CPU field is set to 1, it gets one hyperthread. In other words, on a 20-core machine with hyperthreading, the total number of VMs is capped at 40.
We instead need to control CPU share based on the memory requirement. According to Lambda:
The cloudlab machine I have is 164GB with 20 cores (40 hyperthreads). At maximum, it can host 54 VMs of 3008MB memory. This is not an insignificant number of VMs, so I think we could just replicate exactly what Lambda does.
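For reference, the 54-VM figure follows from the memory budget (assuming ~160GB of the 164GB is usable for VMs, as in the reservation discussed above):

```rust
fn main() {
    let usable_mb = 160u64 * 1024; // ~160GB reserved for VM workloads
    let vm_mb = 3008u64;           // the largest Lambda memory size
    assert_eq!(usable_mb / vm_mb, 54); // max concurrent 3008MB VMs
}
```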