Closed xavier-grall closed 5 years ago
@xavier-grall Some minor suggestions and typo corrections:
@xavier-grall refining proposal (#555 (comment)): "VNFCs that perform compute intensive operations (but do not have neither high network throughput nor low network latency requirements)" Suggest that you delete "do not".
B/ For Basic profile @kedmison proposal (#555 (comment)): "VNFCs that perform basic compute operations and can tolerate oversubscription the variable compute latency" and @xavier-grall suggestion: "VNFCs that perform basic compute operations without any specific requirement" Suggestion: "VNFCs that perform compute operations that can tolerate resource over-subscription and variable latency."
@xavier-grall Thanks for your effort. I like your summarization for CI, but I will go with my definition :)
@xavier-grall Yes, I think CI should have a low latency requirement. The tables we are generating in chapter 5 indicate that it will have SRIOV. In fact, there is little to no feature differentiation in the networking space between network-optimized and compute-optimized.
We could create some differentiation by specifying that the network-optimized flavours have more tenant bandwidth per VM. Profile descriptions taking this into account might look like this:
- Basic: VNFCs that can tolerate resource over-subscription and variable latency.
- Network Intensive: VNFCs that require high network throughput and low network latency.
- Compute Intensive: VNFCs that require low network latency.
I'd further argue that this definition of Compute intensive is in fact a 'Balanced' configuration, and that IT or ML workloads that we plan to address in future would need a separate 'compute intensive' profile that has faster clockspeeds (i.e. faster single-thread speeds) and/or more cores per VM. So, it may be worthwhile looking at the naming of this particular profile now, to ensure we don't create nomenclature challenges for ourselves when the time comes to address these common IT and ML needs.
@kedmison @pgoyal01 @ASawwaf Thank you for your clear opinions and proposals. The essential remaining difference in our opinions is about the network latency for CI. As I said in the PR #555 discussion, I am not so sure of my own opinion on that point, so I agree to add the low network latency requirement. For the language, I think the description should be as accurate as possible, especially for future readers, so I would propose to keep the level of compute-related operations:
@kedmison Regarding the solution for addressing the CI latency requirement, I would rather think about OVS-DPDK with only a few cores for PMD threads (e.g. 1 per NUMA node), since SR-IOV requires the VNF to have implemented a NIC-specific driver, which may be very (too) constraining. But, if necessary, it will have to be discussed in another issue related to another chapter ;-). Regarding IT/ML/AI, I fully agree with considering these workloads in a future release, and also that they may require a new profile (possibly including specific hardware offloading).
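To make the "1 PMD core per NUMA node" idea above concrete, here is a minimal sketch of how such a core selection could be turned into a CPU bitmask of the kind OVS-DPDK accepts through its `other_config:pmd-cpu-mask` setting. The function name `pmd_cpu_mask` and the host topology are hypothetical, chosen only for illustration; nothing here is part of the proposal itself.

```python
def pmd_cpu_mask(numa_cores, per_node=1):
    """Build a hex CPU bitmask with `per_node` cores taken from each NUMA node.

    `numa_cores` maps a NUMA node id to the core ids that may host PMD threads.
    """
    mask = 0
    for node in sorted(numa_cores):
        # Take the lowest-numbered candidate cores on this node.
        for core in sorted(numa_cores[node])[:per_node]:
            mask |= 1 << core
    return hex(mask)


# Hypothetical two-socket host; core 0 and core 4 are assumed to be kept
# for housekeeping, so they are not listed as PMD candidates.
topology = {0: [1, 2, 3], 1: [5, 6, 7]}
print(pmd_cpu_mask(topology))  # cores 1 and 5 selected
```

With this assumed topology only two cores are dedicated to packet processing, which is the point of the suggestion: the remaining cores stay available to the VNFCs.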
I was trying to leave some room in the nomenclature to migrate to something like the following:
- Basic: VNFCs that can tolerate resource over-subscription and variable latency.
- Network Intensive: VNFCs that require high network throughput and low network latency.
- Balanced (our current definition of Compute-intensive): VNFCs that require low network latency.
- Future: Compute Intensive: VNFCs that require high core counts and high single-threaded performance.
- Future: Storage Intensive: VNFCs that require large amounts of locally attached storage and/or high storage IOPS.
- Future: Graphics-accelerated: VNFCs that require GPU acceleration.
Right now, our 'compute intensive' is essentially only different from 'basic' in that it's not overbooked. Otherwise, it's not that special except in the networking terms.
If we adopt this sort of framework above, then we have
It is in thinking about this sort of nomenclature that I have concerns about our current definitions of network-intensive and compute-intensive and calling them both appropriate for compute-intensive operations.
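As a rough illustration of the concern above, the three current profiles can be written down as sets of requirements and compared. This is only a sketch: the `Profile` class, its field names, and the assumption that Network Intensive is not over-subscribed are mine, not agreed definitions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Profile:
    # Field names are illustrative, not taken from the reference model.
    tolerates_oversubscription: bool
    low_network_latency: bool
    high_network_throughput: bool


PROFILES = {
    "Basic": Profile(True, False, False),
    # Assumption: NI is not over-subscribed, like CI.
    "Network Intensive": Profile(False, True, True),
    "Compute Intensive": Profile(False, True, False),
}


def differing_requirements(a, b):
    """Return the names of the requirements on which two profiles differ."""
    return [f.name for f in fields(Profile) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_requirements(PROFILES["Compute Intensive"], PROFILES["Network Intensive"]))
print(differing_requirements(PROFILES["Basic"], PROFILES["Compute Intensive"]))
```

Under these assumptions, Compute Intensive differs from Network Intensive only in throughput, and from Basic only in over-subscription and network latency, with no compute-specific requirement separating any of them.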
OK, I missed your previous point about a possible future naming challenge... I think it would be great to be able to keep the current naming, and thus to try to find another name for a possible future ML/AI profile: why not just Enhanced CI (CI+)? And we could also have Enhanced NI. Regarding GPU, it could be a feature of CI+, as crypto acceleration could be for NI+. Concerning compute intensive operations, I think they refer to a requirement for predictable/deterministic compute performance (or dedicated compute resources). So, trying to keep high-level requirements (and to avoid technical solution options), I propose:
Possible future profiles:
@xavier-grall I like it; I think that's a very good compromise between keeping the existing naming and creating space for the future workloads, and a good observation about 'predictable' compute.
(The only quibble I have is that I'd disagree on the GPU as part of CI+ as CI is about general-purpose compute resource (CPUs), not GPUs... but that's for future RM discussions and 'issues'.)
OK, great. Do you think the figures should also be reviewed accordingly? Currently, they are: https://github.com/cntt-n/CNTT/blob/xavier-grall-patch-ch02/doc/ref_model/figures/ch02_infra_profiles.PNG
@pgoyal01 @ASawwaf @ulikleber What do you think about the proposed naming and descriptions?
Yes, I think so. The bold text in the diagram is not quite aligned with the proposed profile descriptions.
@xavier-grall, I am OK with the workload profile examples. My only comment is the same as @kedmison's: the bold description does not match. Thanks
I think it looks good. I don't think we will find many VNFs in Basic in the end. But let's go for it.
@ulikleber Right, we will not find many entire VNFs in Basic in the end, but we should find many VNFCs, especially those related to VNF management planes.
@kedmison @ASawwaf For the figures, I suggest replacing the bold text of the NI and CI profiles like this:
PR #585 created, including figures update proposal
@xavier-grall Ready to close this one? If so, I'd like to get it off tomorrow's gov report. :-)
Of course, PR is merged, so issue can be closed (done right now)
From https://github.com/cntt-n/CNTT/pull/555#issuecomment-550338405: Kelvin and Xavier to update language for profiles.
There are 2 pending issues related to the language for profiles, one for Compute Intensive and another for Basic:
A/ For Compute Intensive profile
Pankaj proposal (https://github.com/cntt-n/CNTT/issues/536#issuecomment-549826699): VNFCs that perform compute intensive operations but do not have high network throughput and low latency requirements
Xavier refining proposal (https://github.com/cntt-n/CNTT/pull/555#discussion_r342936889): VNFCs that perform compute intensive operations (but do not have neither high network throughput nor low network latency requirements)
Ahmed comment (https://github.com/cntt-n/CNTT/pull/555#discussion_r342938213): this definition is valid for basic or general-purpose profiles, but control plane VNFs (such as IMS), I think, require low latency and should be onboarded on the Compute Intensive profile
Xavier opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r342966228): low latency requirements of some control plane functions (such as IMS CSCF, MME...) are addressed by providing these functions with sufficient, i.e. dedicated, NFVI computation resources, but not with specific NFVI networking resources. And these dedicated computation resources are reflected by the "fast computing" requirement of the CI profile (e.g. provided by CPU pinning)
Pankaj opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r343141773): the control plane latency requirements will be addressed by CPU and memory.
Ulrich opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r343049994): many VNFs (better: VNFCs) only roughly fall into one of the 3 categories and have specific needs to reach their optimum effectiveness.
Concluded issue: should the CI profile have a low network latency requirement?
@kedmison what do you think?
B/ For Basic profile
What are your opinions?