anuket-project / anuket-specifications

Anuket specifications
https://docs.anuket.io

[RM] Glossary - EPA definition #592

Closed ASawwaf closed 4 years ago

CsatariGergely commented 5 years ago

I agree that we should define the specific technologies and not the umbrella term EPA.

hdamker commented 5 years ago

Required features need to be carefully selected: we need to adhere to decisions already taken in the RM and RA1, and not make RA2 more convenient for a lift-and-shift of PNFs directly into CNFs.

ASawwaf commented 4 years ago

@petorre, if I define it, I will pull the definition from the Intel site, so it would be great if you could provide an extensive definition of EPA. Can you support?

I will start with: Enhanced Platform Awareness (EPA) represents a methodology and a related suite of changes across multiple layers of the orchestration stack targeting intelligent platform capability, configuration & capacity consumption. EPA features include Huge Pages support, NUMA topology awareness, CPU pinning, integration with OVS-DPDK, support for I/O Pass-through via SR-IOV, and many others.

CsatariGergely commented 4 years ago

> @petorre, if I define it, I will pull the definition from the Intel site, so it would be great if you could provide an extensive definition of EPA. Can you support?
>
> I will start with: Enhanced Platform Awareness (EPA) represents a methodology and a related suite of changes across multiple layers of the orchestration stack targeting intelligent platform capability, configuration & capacity consumption. EPA features include Huge Pages support, NUMA topology awareness, CPU pinning, integration with OVS-DPDK, support for I/O Pass-through via SR-IOV, and many others.

This "many others" is what we should pin down in this definition.

ASawwaf commented 4 years ago

I can list what I know :)

- Platform Quality of Service (PQOS)
- Advanced Encryption Standard New Instructions (AES-NI)

@CsatariGergely @tomkivlin is that enough? :)

@petorre @trevgc your comments and contributions are appreciated here.

hdamker commented 4 years ago

Hi,

I just want to remind everyone why we are doing this whole CNTT exercise: to reduce the complexity of onboarding VNFs/CNFs (here CNFs) on a common NFVI.

In particular, we want to avoid the pain point that VNF vendors (driven by the PNF history of their software) impose complex, arbitrary combinations of capabilities on the operator's NFVI, fragmenting the operator's infrastructure into silos.

We have defined profiles within the Reference Model (basic, network intensive, compute intensive) which are describing combinations of the above mentioned features. The availability of a lot of the above parameters is therefore already defined per profile. CNFs have to declare the needed profile per container.

That said, scheduling a VNF should not rely on EPA, but only on the requested profile, e.g. by tagging resources.

In general, very specific semiconductor features shouldn't bubble up through several layers of abstraction. Otherwise we lock ourselves completely into a specific processor architecture (as much as we like x86 ... there are alternatives).

Herbert
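As an illustration of the profile-based scheduling described in the comment above, here is a minimal sketch in Python. It is not taken from the RM or from any scheduler: the `Node` and `Workload` types, the `candidate_nodes` function, the node names and the spelling of the profile identifiers are all assumptions made for this example. The only idea it borrows from the thread is that a workload declares one of the three profiles and placement looks at the profile tag alone, never at individual hardware features.

```python
from __future__ import annotations

from dataclasses import dataclass

# The three profiles named in the thread; the identifier spelling is illustrative.
KNOWN_PROFILES = ("basic", "network-intensive", "compute-intensive")


@dataclass(frozen=True)
class Node:
    name: str
    profile: str  # tag applied by the operator; hardware details stay hidden


@dataclass(frozen=True)
class Workload:
    name: str
    profile: str  # the only placement input the workload declares


def candidate_nodes(workload: Workload, nodes: list[Node]) -> list[str]:
    """Return the names of nodes whose profile tag matches the requested profile."""
    if workload.profile not in KNOWN_PROFILES:
        raise ValueError(f"unknown profile: {workload.profile!r}")
    return [node.name for node in nodes if node.profile == workload.profile]


# Hypothetical inventory and workload, for demonstration only.
nodes = [
    Node("node-a", "basic"),
    Node("node-b", "network-intensive"),
    Node("node-c", "network-intensive"),
]
dataplane = Workload("example-dataplane-cnf", "network-intensive")
print(candidate_nodes(dataplane, nodes))
```

Because the workload never names CPU pinning, huge pages or SR-IOV directly, the operator could in principle change how a profile is realised on a node without touching the workload descriptor.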

hdamker commented 4 years ago

@ASawwaf: As a consequence of my last comment I propose to rename the issue into a discussion about the requirement "| req.kcm.03 | General | The Architecture must support scheduling of workloads based on Enhanced Platform Awareness (EPA) features such as CPU Pinning, huge-pages and SR-IOV. |" not about a definition of EPA.

My position is that we should not have this requirement but instead derive specific requirements which can be traced back to the Reference Model.

Exactly as @CsatariGergely has written in his comment https://github.com/cntt-n/CNTT/pull/452#issuecomment-553760917:

> We should indicate which NFVI Profiles/Instances we target in this RA (I guess all three, but still) and we should derive the requirements for the RA from the RM requirements.

tomkivlin commented 4 years ago

> My position is that we should not have this requirement but instead derive specific requirements which can be tracked back to the Reference Model.

@hdamker see #609 - that is exactly what that issue should result in.

tomkivlin commented 4 years ago

Personally, I don't know enough about EPA to say whether your definition is complete, @ASawwaf. I would suggest, however, that we need an authoritative reference point: for all the other terminology we've added to the glossary, we referenced the Kubernetes glossary. Is there a similarly authoritative reference we can use for EPA? If not, then I worry about including it at all, especially if we're not going to reference it within the requirements or the other chapters of the RA.

hdamker commented 4 years ago

> > My position is that we should not have this requirement but instead derive specific requirements which can be tracked back to the Reference Model.
>
> @hdamker see #609 - that is exactly what that issue should result in.

Oh, I hadn't seen that. For this issue, a description would have helped to avoid the misunderstanding: I thought it was about the requirement. Also, the labeling should be changed, as it is not about RA 2 but about the Glossary (part of the RM, I suppose).

tomkivlin commented 4 years ago

> Also labeling should be changed as it is not about RA 2, but about Glossary (part of RM I suppose).

Excellent point - I will update!

hdamker commented 4 years ago

OK, then on the right topic: the proposal in https://github.com/cntt-n/CNTT/issues/592#issuecomment-555689269 reads to me like marketing text, not a definition.

ASawwaf commented 4 years ago

> Personally, I don't know enough about EPA to say whether your definition is complete @ASawwaf. I would suggest however, that we need an authoritative reference point - i.e. for all the other terminology we've added to the glossary, we referenced the Kubernetes glossary. Is there a similarly authoritative reference we can use for EPA? If not, then I worry about including it all, especially if we're not going to reference it within the requirements or the other chapters of the RA.

@tomkivlin, all of them come from Intel, and we have used some of them in our deployment. That is why I tagged @petorre and @trevgc to confirm (and I can share the references).

petorre commented 4 years ago

For sure we should not refer to one "EPA" but rather be very specific about which functionality we mean. The term "EPA" was never supposed to be a strict definition, but rather an umbrella term introduced years back to describe various features needed for deterministic placement of dataplane workloads (mostly incremental ones, though not all, as some existed before virtualization came to x86).

This is applicable beyond RA2, so it could fit better somewhere in the RM: "Enhanced Platform Awareness" (EPA) represents a methodology for making workload placement aware of the underlying platform, in order to deliver improved and deterministic application performance and IO throughput. It enables fine-grained matching of workload requirements to platform capabilities prior to launching a VM or container.

Something more appropriate for RA2: Some "EPA" features include multiple interfaces, Device Plugins (incl. SR-IOV), Huge Pages, NUMA topology, CPU pinning and Node Feature Discovery. Reference Implementations (RIs) must encapsulate such features as the platform's own intelligence. Where required by current technologies, RIs can expose only a minimal, clean API on the "what". RIs must not expose a verbose HW API on the "how".

To further simplify this and reduce integration/onboarding/LCM effort, another concept within CNTT we could try to agree on is to group EPA features together. That way, the Network Intensive profile doesn't expose every individual feature but rather an aggregate of those needed for dataplane and latency-sensitive apps.

Please comment on the logic above and I can try to rewrite text to something better.
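The grouping idea in the comment above could look roughly like the following Python sketch. Everything in it is hypothetical: the `PROFILE_FEATURES` table, the feature strings, the node inventory and the `eligible_nodes` helper are invented for illustration, and the feature set shown for the network-intensive profile is not the mapping from the RM. The point is only that the workload-facing side asks for a profile (the "what"), while the expansion into individual features (the "how") stays inside the platform.

```python
# Hypothetical grouping of features under profiles. The real mapping belongs
# in the Reference Model; these sets are placeholders for illustration.
PROFILE_FEATURES = {
    "basic": frozenset(),
    "network-intensive": frozenset(
        {"huge-pages", "cpu-pinning", "numa-topology", "sr-iov"}
    ),
}


def eligible_nodes(profile, inventory):
    """Return sorted names of nodes advertising every feature the profile groups."""
    required = PROFILE_FEATURES[profile]
    return sorted(
        name for name, features in inventory.items() if required <= set(features)
    )


# Hypothetical per-node feature inventory, of the kind a feature discovery
# component might report to the platform.
inventory = {
    "node-a": {"huge-pages"},
    "node-b": {"huge-pages", "cpu-pinning", "numa-topology", "sr-iov"},
    "node-c": {"huge-pages", "cpu-pinning", "numa-topology", "sr-iov", "aes-ni"},
}
print(eligible_nodes("network-intensive", inventory))
```

A subset test is used so that a node offering extra features still qualifies; whether that is the desired policy is a design choice this sketch does not settle.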

ASawwaf commented 4 years ago

@petorre, I agree with you on the above. We have two points:

1. RM EPA definition: I am OK with the above definition (@petorre, if it is the final one).
2. For a specific RA/RI: let us group some of the EPA features under profiles.

@rabi-abdel @hdamker @tomkivlin Are you OK with this approach to move forward?

hdamker commented 4 years ago

@ASawwaf I'm ok with the approach.

But please note that Chapter 4 of the RM already defines "performance optimization capabilities" (see Table 4-2: Exposed Performance Optimisation Capabilities of NFVI) and maps them to the profiles (see Table 4-24: Mapping of NFVI Capabilities to Instance Types). I don't think the mapping there is complete and perfect yet, but we should not start over in the RA/RI and define a completely new mapping. Instead we should change the RM accordingly if necessary and keep the requirements in the RA traceable to the RM (which is currently not the case).

ASawwaf commented 4 years ago

@hdamker, totally agree. Let's work to close this issue this week,

and afterwards let's open a new issue covering RM Chapter 4 to reflect it, @markshostak.

@rabi-abdel @hdamker @tomkivlin @petorre we need to close this issue this week

rabi-abdel commented 4 years ago

That is ok for me @ASawwaf.

hdamker commented 4 years ago

@ASawwaf: Is there already a pull request for this issue which we can review?

petorre commented 4 years ago

> 1- RM EPA definition, and I am ok with the above definition ( @petorre if it the final )

Yes, this can be the final text for this issue; if needed, we can discuss it further and change it in the PR.

trevgc commented 4 years ago

For the glossary we may want to include Hardware Platform Awareness (HPA), defined by ONAP in conjunction with ETSI. There have been questions about this in CNTT, and more will likely come up, so we need a good explanation.

https://wiki.onap.org/display/DW/Hardware+Platform+Enablement+In+ONAP

https://nfvwiki.etsi.org/index.php?title=Hardware_Platform_Capability_Registry

"... enablement of hardware platform feature awareness (HPA) inside the ONAP management platform, or means by which knowledge about underlying compute hardware platform capabilities is exposed to VNFs running on top of the platform in order to optimize, accelerate and/or otherwise augment their execution"

trevgc commented 4 years ago

This is a good Intel reference source for EPA in Kubernetes and OpenStack: https://networkbuilders.intel.com/network-technologies/enhancedplatformawareness

ASawwaf commented 4 years ago

> @ASawwaf: Is there already a pull request for this issue which we can review?

@hdamker, I will create it and put the agreed definition there.

ASawwaf commented 4 years ago

> @ASawwaf: Is there already a pull request for this issue which we can review?

@hdamker, I created it: "Enhanced Platform Awareness" (EPA) definition, #699

hdamker commented 4 years ago

Reopening, as #699 is not yet merged and there is even a chance that it will not get merged at all. In that case we have to close this issue with the correct reason.

See also my comment in PR #699 https://github.com/cntt-n/CNTT/pull/699#issuecomment-559425622

ASawwaf commented 4 years ago

@hdamker, thanks for reopening; I tried but failed :)

CsatariGergely commented 4 years ago

I'm a bit lost here.

In #699 we agreed not to have EPA in the glossary and instead to use requirements for the individual technologies in the RM if we need them. In the RM we have the following EPA-related requirements listed:

Do we need anything else?

tomkivlin commented 4 years ago

@CsatariGergely I don't think we do. I think this issue should be closed.

hdamker commented 4 years ago

@rabiabdel If you agree with the comments above from @CsatariGergely and @tomkivlin, you can close this undead issue.

kedmison commented 4 years ago

Discussed in RM call 2020-03-11 and agreed to close.