Closed kedmison closed 4 years ago
@pgoyal01 @karinesevilla @iangardner22 @CsatariGergely @tomkivlin @markshostak @ASawwaf
I'm raising this issue because the RM has not completed the action of setting the CPU allocation ratio to 1:1 for Basic, and the RA1 now has a pull request that contains both a Basic flavour with a 1:1 and a Basic flavour with a 4:1 CPU allocation ratio.
I would like to ensure we have agreement on going forward with moving basic to 1:1 allocation ratio as was discussed, and that the RM can then provide some clearer perspective to RA1 issue #1387.
We can take a simple vote: thumbs up = agree, thumbs down = disagree. And of course, comments and discussion are welcome.
Based on CH02, Basic is for VNFCs that can tolerate resource over-subscription and variable latency, and we list examples such as NMS and AAA, for which performance does not matter. So, to differentiate between Basic and Network Intensive, I will go with 4:1 for Basic.
I wasn't in Prague when the decision was made, and have not seen any reasonable justification for it, so I will not vote on this. Please remember that in addition to some VNFs, there is associated software (for example, service portals) that is perfectly OK with higher CPU allocation ratios.
May I suggest that the CNTT take the following position -- critique/alternate suggestions welcome:
BTW, nowhere do we address memory/storage over-allocation ratios. The OpenStack-suggested memory allocation ratio is 1.5:1.
PS: Basic doesn't replace Compute Intensive. If I remember correctly, Compute Intensive had CPU pinning, NUMA, huge pages, and also some network throughput requirements.
Can we not define both: Basic1 (1:1) and Basic4 (4:1)? I don't recall the rationale for removing one of the key benefits of virtualisation, unless predictable performance is an overarching mandate of the CNTT RM for all use cases. I get it for VNFs, but not for supporting services that spend a lot of time idle.
This is a good question. I'll try to explain why I think it's a bad idea. I believe this either strands resources or requires defining a new host group with a different amount of physical memory.

1) I may be missing something, but the only way I know of, in OpenStack, to apply an overcommit is at the compute-host level (i.e. to all VMs on that host). Basic1 and Basic4 could thus not co-exist on the same host, and so different host aggregates would need to be defined simply because of this differing host configuration.

2) The CNTT profiles use a ratio of 1 vCPU : 2 GB RAM. Assume a physical host of 2x24 cores, x2 threads/core: 96 threads total. Given the 1 vCPU : 2 GB ratio needed for Basic1, the Basic hosts would need 192 GB RAM. The 'small' variant is 1 vCPU to 2 GB RAM, so this host would be able (ignoring OpenStack-reserved CPUs for the moment) to host 96x Basic1.small VMs. But if we instead deploy all Basic4.small VMs on it, we would still run out of memory at 192/2 = 96 VMs... the same as Basic1. This happens because we're not overcommitting on memory. So a Basic4 host, which overcommits on vCPU but not on memory, would need 4x the RAM (768 GB) in order to fully deploy VMs at that vCPU ratio without also overbooking memory.

3) If we then sized the Basic hosts to support overbooking, and built them with 96 threads and 768 GB RAM, using them for Basic1 would leave 75% of the RAM unallocated.

4) If we had such a compute host that was half-populated with Basic4, there would be (96 x 4 x 50%) = 192 overbooked Basic4 instances competing for 96 cores. If we then placed a Basic1 VM on it, then, by virtue of the fact that it was also free-floating, it would be competing for the same 96 cores with the 192 vCPUs of the Basic4 VMs... and thus the Basic1 VM would not in fact be 1:1, but instead roughly 1:2 overbooked, due to the competition for vCPUs.
So, I'm arguing that Basic1 profiles and Basic4 profiles cannot cleanly co-exist, and thus that we should not have both a Basic1 and a Basic4.
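The capacity arithmetic in points 2) and 3) above can be sketched numerically. The host sizes are the hypothetical ones used in the argument, and `max_small_vms` is an illustrative helper, not CNTT or OpenStack terminology:

```python
def max_small_vms(host_threads, host_ram_gb, cpu_ratio,
                  vcpus_per_vm=1, ram_per_vm_gb=2):
    """Number of 1 vCPU / 2 GB 'small' VMs a host can take, limited by
    the scarcer of (over-committed) vCPUs and (non-over-committed) RAM."""
    by_cpu = host_threads * cpu_ratio // vcpus_per_vm
    by_ram = host_ram_gb // ram_per_vm_gb
    return int(min(by_cpu, by_ram))

# Basic1 (1:1) on a 96-thread / 192 GB host: capped at 96 by both CPU and RAM.
print(max_small_vms(96, 192, 1))   # 96
# Basic4 (4:1) on the same host: CPU would allow 384, but RAM still caps at 96.
print(max_small_vms(96, 192, 4))   # 96
# A Basic4 host sized to exploit the ratio needs 4x the RAM (768 GB).
print(max_small_vms(96, 768, 4))   # 384
```

In other words, without memory overcommit the 4:1 CPU ratio buys nothing on a 1:2 vCPU:GB host; only a host with 4x the RAM realizes it, which is the stranded-resource problem described above.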
I don't recall the discussion in Prague so apologies if I'm repeating what was already brought up. My concerns with this PR and associated issue are:
Tom raised a good point, and I now think that the CPU allocation ratio should not be part of the RM. When we started writing the RM, we had OpenStack in mind, and that's why the parameter was documented in the RM. It may not be relevant for containers on bare metal. Properly configuring the CPU allocation ratio for an OpenStack infra is important in order to change the default ratio of 16:1, and it must remain in RA1.
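For context, in OpenStack (Nova) the ratio referred to above is configured per compute node. A minimal `nova.conf` sketch for a Basic host pinned to 1:1 might look like the following (the 1.0 values are the choice under discussion here, not Nova's defaults):

```ini
[DEFAULT]
# Nova's historical default is 16.0 (i.e. 16:1); a 1:1 Basic host sets 1.0
cpu_allocation_ratio = 1.0
# Memory is a separate knob (OpenStack's suggested value is 1.5);
# 1.0 disables memory overcommit as well
ram_allocation_ratio = 1.0
```

Because these options apply to the whole compute node, they are also the reason Basic1 and Basic4 VMs cannot share a host, as discussed elsewhere in this thread.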
@kedmison
I would like to ensure we have agreement on going forward with moving basic to 1:1
Kelvin, This was discussed ad nauseam between Prague and the RM mtgs, and then again discussed and approved by the TSC. It appears you created an issue to correct a defect (PR #1064 didn't update the 1:1 ratio), and you're attempting to use it to make a functional change (i.e. add an over-subscription capability). If we want to reintroduce over-subscription, that's fine, but given the amount of effort we just went through to remove it, be sure to open an issue for that purpose, document the rationale, and add it to the agenda. My 2 cents...
@iangardner22
Can we not define both. Basic1 (1:1) and Basic4 (4:1)? I don't recall the rationale for removing
Hi Ian, We can, but... The rationale was along the lines of simplification and focus for CNTT. An Operator could certainly add such a profile if they wanted to, but given the target workloads for CNTT, it didn't sound like the effort to design and, more importantly, validate the extra profile could be justified at this time. Later, once the primary workloads are humming, I see no issue with adding over-subscription. Note, Compute Intensive was only parked, not deleted.
@tomkivlin Hi Tom, To cherry-pick some of your questions: WRT #1, Depends what you mean by "this"... While this Issue (1424) should definitely not delve into performance testing, there is an infra characterization component to CNTT, and it's a very long, ongoing topic. The gist of it is not that CNTT sets criteria for "testing" of performance, but that CNTT provides environments optimized for the performance of different types of workloads (e.g., Network Intensive, not Network Intensive (i.e. Basic), etc.), as well as permutations of RAM and other resources. Unfortunately, it was pointed out that the models were insufficiently granular to provide deterministic performance for workloads. Again, assuming the NI environment is intended to provide enhanced performance for network-intensive workloads, the model had difficulty capturing how "enhanced" the environment actually was. This led to trying to specify HW SKUs, CPU clock speed, and other minutiae that CNTT cannot spec. Hence, the Reference VNF^H^H^HWorkload, aka Golden VNF^H^H^HWorkload, and other tests were conceived to help establish a baseline to normalize a unit of CNTT "work". It's still not carved in stone, nor guaranteed to succeed, but it's the current starting point to close the loop and provide deterministic performance from CNTT models on an Operator's specific hardware. It's a deep and frequent topic in the RM mtg. Feel free to join in.
WRT #3, parking over-subscription was a means to focus CNTT on infra for the primary use cases, so our limited resources aren't spread too thin. Note, the intent and language was always to "park" Compute Intensive, not to delete it. Once we have finished everything specific to target workloads, the scope could be widened again, and in the interim, there's no reason a given Operator couldn't add an oversubscribed profile for their infra, but to your question, the value is in staying focused and not trying to boil the ocean.
WRT #4, in a word, no, not from now on. See above.
WRT #5, totally agree. We'd love to hear any suggestions on how the Models' attributes could be modified or augmented to better address the needs of Containerized workloads. We're always updating the language to be more generic, but that's just generic "paint" to avoid precluding certain use cases. What would be really valuable, would be enhancements to normalize existing attributes across types of workload, as well as additional attributes, that may or may not be specific to Containers. Again, we're happy to discuss/brainstorm any time.
@kedmison Kelvin, Any comments or clarifications?
Thanks, -Mark
Thanks Mark, I understand the background, and sorry for coming in at the 11th hour. I totally get why Compute Intensive was parked and fully support the simplification points. What I'm still not getting is why subscription ratios are part of the software profile, as that would appear to me to mean that all implementations and deployments would need to use this ratio. Perhaps we should add something to say this is optional for an operator deployment?
For example, I understand why we are defining standard abstracted capabilities and interfaces in the RM - my understanding of the goal of CNTT (and then OVP 2.0) is to enable software and infra vendors to know what each other are developing against and to enable interoperability benefits for both parties and the operators.
I need to read around more of the performance benchmarking aspects of CNTT and OVP but to me it either feels like a step too far, or that we have got quite a big gap in our documentation and capability today around that part.
@tomkivlin Hi Tom,
why subscription ratios are part of the software profile
I'm not sure why they were originally, but my impression of the intent in Issue 973 was to remove over-subscription ratios; I'm looking at it as "we removed over-subscription, full stop", as opposed to "we're specifying a 1:1 over-sub ratio". In other words, either you specify a 1:n over-sub ratio, or you turn the over-sub feature off (1:1). Granted, it's a fine line...
As for the appropriateness of the attribute being intrinsic to an IT, as opposed to being part of an extension or a "knob", I'll defer to Kelvin @kedmison to comment on, as I suspect the best answer/solution will be rooted largely in the OpenStack API, with the exception of the bare-metal use case.
feels like a step too far, or that we have got quite a big gap in our documentation
Probably a combination of the two. :-) As described above, the characterization is a compromise solution. As for the doc gap, if you're referring to documenting the characterization methodology or instrumentation, the RM team is about to write, or hopefully has already written, the first draft of the guidelines. @kedmison how is that coming along? :-)
Thanks, -Mark
The intent was not to 'remove oversubscription' from the RM, (thus leaving it undefined and open to interpretation/customization) but to in effect ensure that oversubscription was not taking place. From a Kubernetes/RA2 perspective, compliance should not be an issue here. From a RA1/Openstack perspective, I think there wind up being bin-packing problems that I elaborated earlier that wind up reducing or eliminating altogether the benefits of over-subscription.
Thanks @kedmison. My concern is that as things are worded/presented, it seems to me that an operator who chooses to over-subscribe, which is a valid design choice, would be non-conformant with CNTT? Is that the intent?
If this is solely being stated with a view to allow the Reference Implementation to be used as a baseline for later performance testing, then that's fine, but I'd rather see a note stating that Vendor Implementations can allow over-subscription and individual operators can choose to use over-subscription if they choose to do so.
@tomkivlin The primary argument is that some workloads will require an expectation of performance from Basic, and cannot tolerate overbooking. So, from an RM perspective, yes, I think that operators must make available a non-over-subscribed Basic; if they want to have an additional, overbooked variant of Basic, I believe it should be outside of CNTT, but an operator is not prevented from doing so.
@pgoyal01 @tomkivlin I see in RA1 that Table 4.2 has already defined a B1 with no overbooking and a B4 with overbooking. Can we align on the RM specifying no overbooking (to be consistent across RA1 and RA2) and then the RA1 allowing the use of overbooking by specifying an additional Basic4 flavour there that is not specified at the RM level?
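If RA1 does keep an additional B4/Basic4 flavour alongside B1, one common OpenStack pattern for separating the two ratios is host aggregates, since the allocation ratio applies per host. This is only an illustrative sketch (aggregate and flavour names are hypothetical, and it assumes the scheduler's aggregate-based filters are enabled):

```shell
# Create one aggregate per ratio; the cpu_allocation_ratio metadata key is
# read by the aggregate-based core filter, and the 'ratio' key is used for
# flavour-to-aggregate matching below
openstack aggregate create basic1-hosts \
  --property cpu_allocation_ratio=1.0 --property ratio=basic1
openstack aggregate create basic4-hosts \
  --property cpu_allocation_ratio=4.0 --property ratio=basic4

# Pin each flavour to its aggregate via matching extra specs
# (AggregateInstanceExtraSpecsFilter)
openstack flavor set B1.small \
  --property aggregate_instance_extra_specs:ratio=basic1
openstack flavor set B4.small \
  --property aggregate_instance_extra_specs:ratio=basic4
```

This keeps the 1:1 hosts and the 4:1 hosts disjoint, which, as noted earlier in the thread, is unavoidable because the overcommit is a per-host setting.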
@kedmison - I agree there needs to be consistency. For me, I would be happy with your proposed approach if we could add a note saying the purpose of this is for performance benchmarking purposes, meaning operators can choose different ratios if they accept the risk of an impact on performance. I'll suggest a change in the PR. Once that PR is merged, I'll do one to update RA2.
@kedmison The RM specified a 4:1 allocation ratio (and it is also in RI-1), and when you stated that there should be a no-over-allocation Basic, we decided to include both while a decision was made on which one it is to be.
In RA-1 we can add a note about performance implications -- BTW, not necessarily the case if the application really doesn't need the capacity. Any additions will now be in the next release.
Issue #973 documented the intent to both park compute intensive flavour and to remove oversubscription from the Basic flavour.
The parking of Compute Intensive was done, but the removal of over-subscription was not. This issue addresses the removal of over-subscription by setting the CPU allocation ratio to 1:1 for the Basic flavour.