[RM ch02] profiles language update needed

xavier-grall commented 5 years ago

From https://github.com/cntt-n/CNTT/pull/555#issuecomment-550338405: Kelvin and Xavier to update language for profiles.

There are 2 pending issues related to the language for profiles, one for Compute Intensive and another for Basic:

A/ For Compute Intensive profile

Pankaj proposal (https://github.com/cntt-n/CNTT/issues/536#issuecomment-549826699): VNFCs that perform compute intensive operations but do not have high network throughput and low latency requirements
Xavier refining proposal (https://github.com/cntt-n/CNTT/pull/555#discussion_r342936889): VNFCs that perform compute intensive operations (but do not have neither high network throughput nor low network latency requirements)
Ahmed comment (https://github.com/cntt-n/CNTT/pull/555#discussion_r342938213): this definition is valid for basic or general-purpose profiles, but for control plane VNFs (such as IMS) I think low latency is required, so they should be onboarded on the Compute Intensive profile
Xavier opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r342966228): low latency requirements of some control plane functions (such as IMS CSCF, MME...) are addressed by providing these functions with sufficient, i.e. dedicated, NFVI computation resources, not with specific NFVI networking resources. These dedicated computation resources are reflected by the "fast computing" requirement of the CI profile (e.g. provided by CPU pinning)
Pankaj opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r343141773): the control plane latency requirements will be addressed by CPU and memory.
Ulrich opinion (https://github.com/cntt-n/CNTT/pull/555#discussion_r343049994): many VNFs (or better, VNFCs) only roughly fall into one of the 3 categories and have specific needs to reach their optimum effectiveness.

Resulting question: should the CI profile have a low network latency requirement?

If yes (Ahmed's position), the CI definition could be: VNFCs that perform compute intensive operations with low network latency requirement (but no high throughput requirement)
If no (Xavier's & Pankaj's position), it would rather be: VNFCs that perform compute intensive operations (but do not have neither high network throughput nor low network latency requirements)

@kedmison what do you think?

B/ For Basic profile

Kelvin proposal (https://github.com/cntt-n/CNTT/pull/555#discussion_r343129341): VNFCs that perform basic compute operations and can tolerate oversubscription the variable compute latency
Xavier comment: I am not sure I understand the addition, which seems to describe a technical solution (i.e. a possible oversubscription ratio) rather than a requirement. If needed, I would suggest only adding: VNFCs that perform basic compute operations without any specific requirement

What are your opinions?

pgoyal01 commented 5 years ago

@xavier-grall Some minor suggestions and typo corrections:

@xavier-grall refining proposal (#555 (comment)): "VNFCs that perform compute intensive operations (but do not have neither high network throughput nor low network latency requirements)". Suggest that you delete "do not".

B/ For Basic profile

@kedmison proposal (#555 (comment)): "VNFCs that perform basic compute operations and can tolerate oversubscription the variable compute latency"
@xavier-grall suggestion: "VNFCs that perform basic compute operations without any specific requirement"
Suggestion: "VNFCs that perform compute operations that can tolerate resource over-subscription and variable latency."

ASawwaf commented 5 years ago

@xavier-grall thanks for your effort, I like your summary for CI. I will go with my definition :)

kedmison commented 5 years ago

@xavier-grall Yes, I think CI should have a low latency requirement. The tables we are generating in chapter 5 indicate that it will have SRIOV. In fact, there is little to no feature differentiation in the networking space between network-optimized and compute-optimized.

We could create some differentiation by specifying that the network-optimized flavours have more tenant bandwidth per VM. Profile descriptions taking this into account might look like this:

Basic: VNFCs that can tolerate resource over-subscription and variable latency.

Network Intensive: VNFCs that require high network throughput and low network latency.

Compute Intensive: VNFCs that require low network latency.

I'd further argue that this definition of Compute Intensive is in fact a 'Balanced' configuration, and that IT or ML workloads that we plan to address in future would need a separate 'compute intensive' profile that has faster clock speeds (i.e. faster single-thread speeds) and/or more cores per VM. So, it may be worthwhile looking at the naming of this particular profile now, to ensure we don't create nomenclature challenges for ourselves when the time comes to address these common IT and ML needs.

xavier-grall commented 5 years ago

@kedmison @pgoyal01 @ASawwaf Thank you for your clear opinions and proposals. The essential remaining difference in our opinions is about the network latency for CI. As I said in the PR #555 discussion, I am not so sure of my own opinion on that point, so I agree to add the low network latency requirement. For the language, I think the description should be as accurate as possible, especially for future readers, so I would propose keeping the level of compute-related operations in each description:

Basic: VNFCs that perform basic compute operations and can tolerate resource over-subscription and variable latency.
Network Intensive: VNFCs that perform compute intensive operations and require high network throughput and low network latency.
Compute Intensive: VNFCs that perform compute intensive operations and require low network latency.

@kedmison Regarding the solution for addressing the CI latency requirement, I would rather think about OVS-DPDK with only a few cores for PMD threads (e.g. 1 per NUMA node), since SR-IOV requires the VNF to have implemented a NIC-specific driver, which may be too constraining. But, if necessary, it will have to be discussed in another issue related to another chapter ;-). Regarding IT/ML/AI, I fully agree with considering these workloads in a future release, and also that they may require a new profile (possibly including specific hardware offloading).
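
As a rough illustration of the "one PMD core per NUMA node" idea above: OVS-DPDK takes the set of PMD cores as a hexadecimal bitmask through the `other_config:pmd-cpu-mask` setting. The sketch below only computes that mask and prints the corresponding command. The core IDs and the NUMA layout are hypothetical and would have to come from the actual host topology.

```python
def pmd_cpu_mask(core_ids):
    """Return a hex bitmask string with one bit set per given core ID."""
    mask = 0
    for core in core_ids:
        if core < 0:
            raise ValueError(f"invalid core id: {core}")
        mask |= 1 << core
    return hex(mask)


# Hypothetical layout: one PMD core reserved on each of two NUMA nodes.
pmd_cores_per_numa = {0: [2], 1: [22]}

all_pmd_cores = [c for cores in pmd_cores_per_numa.values() for c in cores]
mask = pmd_cpu_mask(all_pmd_cores)

# The command is only printed here, not executed.
print(f"ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask={mask}")
```

With cores 2 and 22, the mask has bits 2 and 22 set, i.e. `0x400004`.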

kedmison commented 5 years ago

> Regarding IT/ML/AI, I fully approve to consider these workloads in a future release, and also that they may require a new profile (possibly including specific hardware offloading).

I was trying to leave some room in the nomenclature to migrate to something like the following:

Basic: VNFCs that can tolerate resource over-subscription and variable latency.
Network Intensive: VNFCs that require high network throughput and low network latency.
Balanced (our current definition of Compute-intensive): VNFCs that require low network latency.
Future: Compute Intensive: VNFCs that require high core counts and high single-threaded performance
Future: Storage Intensive: VNFCs that require large amounts of locally attached storage and/or high storage IOPS
Future: Graphics-accelerated: VNFCs that require GPU acceleration

Right now, our 'compute intensive' is essentially only different from 'basic' in that it's not overbooked. Otherwise, it's not that special except in the networking terms.

If we adopt the sort of framework above, then we have:

Balanced as our baseline configuration of a balanced allocation of compute, RAM, storage, and network
Basic as an overbooked version of Balanced (reduced compute capabilities)
Network-Intensive: extending 'balanced' by improving the network resource dimension
Compute-intensive: extending 'balanced' by improving the compute resource dimension
Storage-intensive: extending 'balanced' by improving the storage resource dimension
Graphics-accelerated: extending 'balanced' by adding GPUs

It is in thinking about this sort of nomenclature that I have concerns about our current definitions of network-intensive and compute-intensive and calling them both appropriate for compute-intensive operations.
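
The "everything extends Balanced" framework above can be sketched as a small data model. This is only an illustration: the field names and the "standard"/"enhanced" levels are assumptions made for the example, not agreed CNTT attributes.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Profile:
    # All field names and values are hypothetical, for illustration only.
    name: str
    cpu_overbooked: bool = False
    network: str = "standard"
    compute: str = "standard"
    storage: str = "standard"
    gpu: bool = False


# Balanced is the baseline; every other profile changes one dimension.
BALANCED = Profile("Balanced")
BASIC = replace(BALANCED, name="Basic", cpu_overbooked=True)
NETWORK_INTENSIVE = replace(BALANCED, name="Network-Intensive", network="enhanced")
COMPUTE_INTENSIVE = replace(BALANCED, name="Compute-intensive", compute="enhanced")
STORAGE_INTENSIVE = replace(BALANCED, name="Storage-intensive", storage="enhanced")
GRAPHICS_ACCELERATED = replace(BALANCED, name="Graphics-accelerated", gpu=True)


def differences(profile, baseline=BALANCED):
    """Return the attributes (other than name) that differ from the baseline."""
    return {
        f.name: getattr(profile, f.name)
        for f in fields(profile)
        if f.name != "name" and getattr(profile, f.name) != getattr(baseline, f.name)
    }


for p in (BASIC, NETWORK_INTENSIVE, COMPUTE_INTENSIVE,
          STORAGE_INTENSIVE, GRAPHICS_ACCELERATED):
    print(p.name, differences(p))
```

Each derived profile differs from Balanced in exactly one attribute, which is the property the naming scheme is meant to make visible.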

xavier-grall commented 5 years ago

OK, I missed your previous point about a possible future naming challenge... I think it would be great to be able to keep the current naming, and thus to try to find another name for a possible future ML/AI profile: why not just Enhanced CI (CI+)? And we could also have Enhanced NI. Regarding GPU, it could be a feature of CI+, as crypto acceleration could be for NI+. Concerning compute intensive operations, I think they refer to a requirement for predictable/deterministic compute performance (or dedicated compute resources). So, trying to keep high level requirements (and to avoid technical solution options), I propose:

Basic: for VNFCs that can tolerate resource over-subscription and variable latency.
Network Intensive: for VNFCs that require predictable compute performance, high network throughput and low network latency.
Compute Intensive: for VNFCs that require predictable compute performance and low network latency.

Possible future profiles:

Storage Intensive: for VNFCs that require low storage latency and/or high storage IOPS.
Enhanced Compute Intensive: for compute intensive VNFCs that require higher compute performance and/or specific compute resource (e.g., GPU).
Enhanced Network Intensive: for network intensive VNFCs that require higher network performance and/or specific network resource (e.g., crypto acceleration).
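
To make the three current descriptions above concrete, here is a minimal sketch that maps a VNFC's requirements to the least demanding profile covering them. The requirement keys and the "pick the first profile that covers everything" rule are assumptions for illustration, not part of the proposal itself.

```python
# Profiles ordered from least to most demanding. Each one lists the
# requirements it is described as meeting (keys are hypothetical names).
PROFILES = [
    ("Basic", frozenset()),
    ("Compute Intensive",
     frozenset({"predictable_compute", "low_network_latency"})),
    ("Network Intensive",
     frozenset({"predictable_compute", "low_network_latency",
                "high_network_throughput"})),
]


def select_profile(requirements):
    """Return the first profile whose capabilities cover all requirements."""
    needed = frozenset(requirements)
    for name, provided in PROFILES:
        if needed <= provided:
            return name
    raise ValueError(f"no current profile covers: {sorted(needed)}")


print(select_profile([]))
print(select_profile(["predictable_compute"]))
print(select_profile(["high_network_throughput", "low_network_latency"]))
```

A requirement outside the three listed ones (for example high storage IOPS) raises an error here, which corresponds to the workloads left to the possible future profiles.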

kedmison commented 5 years ago

@xavier-grall I like it; I think that's a very good compromise between keeping the existing naming and creating space for the future workloads, and a good observation about 'predictable' compute.
(The only quibble I have is that I'd disagree on the GPU as part of CI+ as CI is about general-purpose compute resource (CPUs), not GPUs... but that's for future RM discussions and 'issues'.)

xavier-grall commented 5 years ago

OK, great. Do you think the figures should also be revised accordingly? Currently, they are: https://github.com/cntt-n/CNTT/blob/xavier-grall-patch-ch02/doc/ref_model/figures/ch02_infra_profiles.PNG

@pgoyal01 @ASawwaf @ulikleber What do you think about the proposed naming and descriptions?

kedmison commented 5 years ago

Yes, I think so. The bold text in the diagram is not quite aligned with the proposed profile descriptions.

ASawwaf commented 5 years ago

@xavier-grall, I am OK with the workload profile examples. My only comment, like @kedmison's, is that the bold description does not match. Thanks

ulikleber commented 5 years ago

I think it looks good. I don't think we will find many VNFs in Basic in the end. But let's go for it.

xavier-grall commented 5 years ago

@ulikleber Right, we will not find many entire VNFs in Basic in the end, but we should find many VNFCs, especially those related to VNF management planes.

@kedmison @ASawwaf For the figures, I suggest replacing the bold text of the NI & CI profiles like this:

NI
- Current: Fast computing; High throughput & low latency networking
- Proposal: Predictable computing; High throughput & low latency networking
CI
- Current: Fast computing
- Proposal: Predictable computing; Low latency networking
xavier-grall commented 5 years ago

PR #585 created, including figures update proposal

markshostak commented 5 years ago

@xavier-grall Ready to close this one? If so, I'd like to get it off tomorrow's gov report. :-)

xavier-grall commented 5 years ago

Of course, the PR is merged, so the issue can be closed (done just now)

anuket-project / anuket-specifications

[RM ch02] profiles language update needed #571