anuket-project / anuket-specifications

Anuket specifications
https://docs.anuket.io

[RM/RA/RI/RC] Document Hardware Objectives, Guidelines and Approach #590

Closed · markshostak closed this issue 4 years ago

markshostak commented 4 years ago

Issue: As discussed in multiple forums, CNTT currently specifies hardware in multiple documents (e.g., RM, RA and RI), making alignment challenging. Further, when RI/CIRV (or CSPs) are procuring hardware, there needs to be an objective mechanism to ensure the candidate configuration(s) will support the CNTT designs.

Objectives/Drivers: Drivers are the key to this process, and will be considered carefully. Please add your proposed objectives.

Approach: The following has been discussed independently in various forums. The intent here is to bridge all of the forums to achieve a CNTT-level consensus. This is just a starting point to drive the discussion; it is expected to be modified and enhanced (i.e. it is not an edict, it's a catalyst).

As previously discussed, the current thinking is to:

  1. Identify CNTT's objectives/drivers for specifying h/w in the first place (i.e. what are we trying to achieve by detailing the h/w)
  2. Derive a set of clear, concise and usable Guidelines from the Objectives
  3. Identify where the Guidelines should live and PR them to those locations
  4. Gain consensus on a series of related questions (see below)

Related Questions: These questions have come up in RI/CIRV activities. They may be fully CSP-discretionary, a hard CNTT requirement or somewhere in between. Irrespective, a supporting reason, justification or rationale should be identified for each conclusion. In no particular order:

A. What node types/functions can be combined in production (and why)?
B. How to best accommodate heterogeneous and homogeneous server farms?
C. Should CNTT recommend interface speed guidelines? (If so, what are they?)
D. Should CNTT recommend server farm dimensioning guidelines? (If so, what are they?)
E. RC test bed requirements
F. Insert additional questions here

This is a large Issue, and it is expected to spawn actionable supporting issues that can then be assigned/adopted by individuals.

rabi-abdel commented 4 years ago

@markshostak My suggestion is the following:

  1. Document the answers in RM Chapter 1.
    • Between sections 1.4 and 1.5 (we can call it "Approach").
    • One issue/PR is needed for this.
    • Let us create the PR and start the review process.

Quick Answers to some questions from me (to be discussed and agreed on while reviewing the PR)

I hope that gives initial direction. Let us discuss more in the TSC and, more importantly, when we create the PR.

markshostak commented 4 years ago

UNH has proposed the following configuration for the UNH IOL LaaS lab for 2019 procurement. This is an agenda item for the RM call, as CNTT needs to be aligned in time for UNH to take delivery of this h/w in 2019. Each server shall meet the following minimum specifications:

CPU:

Memory:

Storage:

Network Interfaces (Note 1):

Note 1: At least 1 network interface must be capable of performing PXE boot and that network must be available to both the Jump / Test Host and each bare-metal server.
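
For illustration only, here is a minimal Python sketch of how such a minimum specification could be captured and checked against a candidate server. The field names and numeric values are placeholders of my own, since the concrete CPU/memory/storage minimums are not reproduced above; only the PXE-capable NIC requirement comes from Note 1.

```python
# Hypothetical sketch: field names and numbers are placeholders, not agreed values.
MINIMUM_SPEC = {
    "cpu_sockets": 2,         # placeholder
    "memory_gb": 192,         # placeholder
    "storage_tb": 3.2,        # placeholder
    "nics": 2,                # placeholder
    "pxe_capable_nics": 1,    # from Note 1: at least one PXE-boot-capable interface
}

def meets_minimum(candidate: dict, minimum: dict = MINIMUM_SPEC) -> bool:
    """Return True if every attribute of the candidate meets or exceeds the minimum."""
    return all(candidate.get(key, 0) >= value for key, value in minimum.items())

if __name__ == "__main__":
    candidate = {"cpu_sockets": 2, "memory_gb": 384, "storage_tb": 6.4,
                 "nics": 4, "pxe_capable_nics": 2}
    print(meets_minimum(candidate))  # True for this example candidate
```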

TFredberg commented 4 years ago

With regard to the RM part of the conversation in #590, I have a couple of comments on earlier posts:

@rabi-abdel

Maybe these comments do not belong in #590, but they are at least examples of what from #590 I would consider important to have in the RM (in an abstracted form):

As a comment on today's RM meeting (20 Nov) about support for and focus on heterogeneous computer systems: I think heterogeneous systems must be supported properly, from both a functionality and a dimensioning point of view, to allow gradual and sliding HW upgrade schemes.

kedmison commented 4 years ago

@TFredberg wrote:

Speed in some form of abstracted way

  • e.g. expressed as equivalence to suitable set of benchmarks (like SpecINT)
    • Preferably not GHz that is very ISA, generation and implementation specific

The challenge with this is that benchmarks like SPECint and SPECfp are generally collections of tests, each with its own result, and a single number is often used to try to simplify the view of these different performance results. This chart, https://images.anandtech.com/doci/15009/Tremont%20-%20Stephen%20Robinson%20-%20Linley%20-%20Final-page-012.jpg, shows the relative performance of the new Tremont architecture versus Goldmont Plus. While these Atom-class chips may not be completely relevant, the process of comparing them is: the chart shows a line drawn at about a 32% uplift from Goldmont Plus to Tremont, but several individual tests fall well below or significantly exceed that 32% number. Depending on which of the individual tests a VNF workload most resembles, it may experience a performance boost that differs widely from the 32% average uplift.

This is relevant to CNTT because an attempt to use a generic measure like SPEC or another performance number may run into the same situation: being wildly inaccurate for some types of workloads.
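
As a toy illustration of that point (all numbers below are invented, not taken from the chart), an aggregate uplift figure can hide a large per-test spread:

```python
from statistics import geometric_mean

# Invented per-benchmark uplift factors for a hypothetical newer CPU generation
# relative to its predecessor.
per_test_uplift = {
    "test_a": 1.05,
    "test_b": 1.18,
    "test_c": 1.32,
    "test_d": 1.55,
    "test_e": 1.70,
}

aggregate = geometric_mean(per_test_uplift.values())
print(f"aggregate uplift: {aggregate:.2f}x")

for name, uplift in per_test_uplift.items():
    deviation = (uplift / aggregate - 1) * 100
    print(f"{name}: {uplift:.2f}x ({deviation:+.0f}% vs aggregate)")
# A workload that resembles test_a sees far less benefit than the aggregate suggests.
```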

In essence, this is my rationale for wanting to add

to the compute portions of the profile specification.

TFredberg commented 4 years ago

@kedmison I agree that SpecINT and other benchmarks are not 100% accurate, and I welcome it if someone knows of more representative abstracted measures for a set of applications (VNFs).

The core speed is not very accurate either, especially if the core architecture (generation) is not given. For cloud deployment, where HW/SW decoupling is important, it is counterproductive to have to match an application to a specific core architecture and frequency. This would require applications to be tested with all possible core architectures, and possibly at different core frequencies, and to deliver performance graphs from that. There are also other potentially disturbing elements in cloud deployments where the socket is shared, e.g. noisy neighbors that might disturb the caches, IO and host scheduling.

I also agree that one value does not give enough information, but that is just the same problem as giving the frequency of the cores, especially now that modern sockets have features like Intel Speed Select (from Cascade Lake) and the rather old Turbo functionality. Speed Select enables the user (who is in control of these registers) to select how many cores should be used and what base frequency each individual core should be enabled with. These are all great features, but with a single speed figure (in whatever format) no orchestrator can make use of them.
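
As a hypothetical sketch (the operating points below are invented, not real Speed Select tables), a descriptor would have to carry the whole set of selectable (core count, base frequency) points for an orchestrator to use it; a single figure cannot express this at all:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OperatingPoint:
    """One selectable (active cores, base frequency) configuration of a socket."""
    active_cores: int
    base_ghz: float

# Invented operating points for a hypothetical Speed-Select-capable socket.
SOCKET_PROFILES = [
    OperatingPoint(active_cores=24, base_ghz=2.4),
    OperatingPoint(active_cores=20, base_ghz=2.8),
    OperatingPoint(active_cores=16, base_ghz=3.1),
]

def pick_profile(required_cores: int, required_ghz: float) -> Optional[OperatingPoint]:
    """Return the first operating point that satisfies a workload's placement request."""
    for point in SOCKET_PROFILES:
        if point.active_cores >= required_cores and point.base_ghz >= required_ghz:
            return point
    return None

# Selects the 20-core / 2.8 GHz point; a single "socket GHz" number could not.
print(pick_profile(required_cores=18, required_ghz=2.6))
```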

I guess we agree that one value is not enough.

Does anyone have good suggestions for sets of representative benchmarks valid for our industry?

trevgc commented 4 years ago

Not only one value, but also one workload or method, may not be enough. We might consider various "performance profiles" using different benchmarks. For example, a candidate for network-intensive workloads is ETSI GS NFV-TST 009, "Specification of Networking Benchmarks and Measurement Methods for NFVI": https://www.etsi.org/deliver/etsi_gs/NFV-TST/001_099/009/03.02.01_60/gs_nfv-tst009v030201p.pdf. We are working on a PoC to explore this approach in the Intel OPNFV Community Lab, i.e. deploying an example reference implementation on a variety of Xeon-SP and Xeon-D platforms and measuring performance using open source tools deployed in VMs. While starting with network-intensive, we would also like to explore compute and storage next.
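
For what it is worth, here is a rough Python sketch of how such "performance profiles" could be recorded and checked per platform; the metric names loosely follow TST 009-style zero-loss throughput and latency measurements, and all platform names and numbers are invented:

```python
from dataclasses import dataclass

@dataclass
class NetworkIntensiveResult:
    """Invented record of a TST 009-style network benchmark run on one platform."""
    platform: str
    throughput_mpps: float     # zero-loss throughput in millions of packets per second
    mean_latency_us: float
    max_latency_us: float

# Invented results for two hypothetical platforms under the same benchmark.
results = [
    NetworkIntensiveResult("platform-a", throughput_mpps=11.9,
                           mean_latency_us=18.0, max_latency_us=45.0),
    NetworkIntensiveResult("platform-b", throughput_mpps=14.2,
                           mean_latency_us=15.0, max_latency_us=60.0),
]

def meets_profile(result: NetworkIntensiveResult,
                  min_mpps: float, max_mean_latency_us: float) -> bool:
    """Check one platform's result against a 'network intensive' performance profile."""
    return (result.throughput_mpps >= min_mpps
            and result.mean_latency_us <= max_mean_latency_us)

for r in results:
    print(r.platform, meets_profile(r, min_mpps=12.0, max_mean_latency_us=20.0))
```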

TFredberg commented 4 years ago

@trevgc This form of benchmark test seems to be on the right path for compute and communication HW services, but in these specifications they are expressed as complete NFVI benchmarks, with the VM or container hosts included in the DUT/SUT.

What we are after here would rather be a set of different "benchmark applications" that also includes the behavior and needs of the SW virtualization layer, but measures only the goodness of the pure HW infrastructure, in the same types of characteristics, i.e. throughput, latency, delay variation and loss, maybe with some additional characteristics added.

The only other characteristic I can envisage today, with regard to compute and communication, that I cannot see modelled in the above characteristics would possibly be the benchmark application's internal "built-up knowledge" that enables it to perform its task with some improved characteristic or quality measure. This would likely grow in importance with an increased level of included ML/AI functionality.

We would also have to add storage-access-oriented benchmark characteristics, e.g. IOPS@R/W(ratio), read latency, write latency, etc., at some application level such as blocks, files and objects, possibly also amended with characteristics for systems like "resiliently stored" where that is required.
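
As a non-authoritative sketch, the characteristics mentioned above could be grouped into one record per benchmark application, roughly along these lines (the field names are my own shorthand, not agreed terminology):

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class TrafficCharacteristics:
    """Throughput, latency, delay variation and loss for one benchmark application."""
    throughput: float          # unit depends on the benchmark (e.g. Mpps or Gbit/s)
    latency_ms: float
    delay_variation_ms: float
    loss_ratio: float

@dataclass
class StorageCharacteristics:
    """Storage-access characteristics at a given read/write ratio."""
    read_write_ratio: float    # e.g. 0.7 means 70% reads / 30% writes
    iops: int
    read_latency_ms: float
    write_latency_ms: float

@dataclass
class BenchmarkResult:
    """Result of one 'benchmark application' run against the pure HW infrastructure."""
    application: str
    traffic: TrafficCharacteristics
    # Keyed by access level, e.g. "block", "file", "object".
    storage: Dict[str, StorageCharacteristics] = field(default_factory=dict)

# Example with invented numbers:
result = BenchmarkResult(
    application="vswitch-like",
    traffic=TrafficCharacteristics(throughput=10.0, latency_ms=0.2,
                                   delay_variation_ms=0.05, loss_ratio=0.0),
    storage={"block": StorageCharacteristics(read_write_ratio=0.7, iops=50_000,
                                             read_latency_ms=1.2, write_latency_ms=2.5)},
)
print(result.application, result.traffic.throughput)
```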

For all the above usages we should probably, over time, also add the characteristic Energy Consumption per Throughput (and per built-up knowledge and storage).

With a number of well-selected "benchmark applications", likely dependent on the RA in which they are deployed, and a number of different "loading points" or "traffic mixes", I would think it would be possible to also cover included HW accelerators, since after all they are only there to accelerate some of these characteristics.
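
To illustrate the idea (all application names, loading points and figures below are hypothetical), a HW configuration, with or without accelerators, would then be characterised by a small matrix of measurements and evaluated against per-RA requirements:

```python
# Hypothetical: each (benchmark application, loading point) pair maps to a measured
# figure; an accelerator simply shows up as better figures in this matrix.
measurements = {
    ("crypto-gateway", "50%-load"): 8.0,     # Gbit/s, invented
    ("crypto-gateway", "peak"): 12.0,
    ("packet-forwarder", "64B-mix"): 10.0,   # Mpps, invented
    ("packet-forwarder", "imix"): 14.0,
}

requirements = {
    ("crypto-gateway", "peak"): 10.0,
    ("packet-forwarder", "imix"): 12.0,
}

def hw_meets_requirements(measured: dict, required: dict) -> bool:
    """True if every required (application, loading point) figure is met or exceeded."""
    return all(measured.get(key, 0.0) >= value for key, value in required.items())

print(hw_meets_requirements(measurements, requirements))  # True for these invented numbers
```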

I realize we are not there today and these benchmarks might not exist in the wanted form, but we need to start breaking up the prescriptive SW definition of the HW as a very detailed prescription of what components it has to have and how they should be interconnected. Without this we will never get efficient enough HW with relevant acceleration.

I also fully agree with @kedmison that a single-figure comparison will leave the floor open for simplistic evaluations of HW purchase price without proper consideration of other values that also need to be evaluated.