anuket-project / anuket-specifications

Anuket specifications
https://docs.anuket.io

[RM] Networking Strategy #960

Closed rabi-abdel closed 4 years ago

rabi-abdel commented 4 years ago
pgoyal01 commented 4 years ago

The underlay networking is common to both RA-1 and RA-2 but not sure about the overlay.

markshostak commented 4 years ago

@rabi-abdel Hi Rabi, the high-level perspective, as discussed in Prague, is that we need to be able to interoperate with a variety of fabric solutions, because a fabric involves many Operator-specific details. Some Ops may have needs demanding the advantages of a complex Layer-3 fabric (and have the staff to run it), some Ops may only need a simple Layer-2 fabric, and other Ops' needs fall in between. Hence, the intent is not to have a dedicated RA-x specific to fabric/networking, but rather to have a solution that will complement any RA.

are we going to have networking details spread out between different RAs?

In keeping with the methodology employed in CNTT for VNF Abstraction and other areas, the thinking is to describe what portions need to be incorporated in the RM and what portions need to be in the RAs, as well as to provide a working fabric in the RI, without dictating the exact composition of a production fabric. This way an Op can design or procure a compatible fabric that meets their needs. Note that fabric in this context is inclusive of the fabric underlay and the networking overlay. At the end of the day, tenants need to see isolated L3 networks (with some noted exceptions), and they need to be manageable by CNTT Infra.

Based on conversations in Prague and some sidebars we had with you, Tomas ( @TFredberg ) and I are planning to create a "Generic Fabric Model", currently RM-Ch04 Section 4.3. The intent is to capture the generic requirements, and divide them across the RM, RA and RI domains. As always, things that are common will be in the RM, while things that are RA-specific will be in the relevant RA. The RI is presently viewed as needing to provide a compatible fabric solution sufficient to support RC, but not currently intended to be used in production.

Another point of note: Tomas has some ideas (and industry contacts) on how CNTT may be able to work w/ SmartNIC vendors to align on a cloud-native, abstraction-friendly, de facto industry-standard ABI, to resolve the issue of hardware-specific drivers in the workload, which may also be transferable to SR-IOV to some degree (stretch goal). Milestones in this area could be captured in a CNTT Non-Conforming Technology policy.

Ahmed (@ASawwaf) has also been involved in fabric discussions prior to Prague, so Ahmed and Tomas are key CNTT players, and can hopefully drive significant contributions in this area. Please attend the weekly RM meetings (Wednesdays at 14:00 UTC) for working discussions.

do we need a new networking work-stream?

The plan from Prague is to start this work this coming week (week of 27/1) in the RM WS domain. Whether another, dedicated WS is required can be evaluated in that forum and forked from there.

Tomas and Ahmed, please comment if I missed anything, and Rabi, feel free to ask questions.

If anyone else has input, feel free to chime in and/or attend the RM calls.

Thanks, -Mark

oyayan commented 4 years ago

@markshostak Hi Mark, doesn't this mean duplicating almost the same work between the current documents if we go with fabric/PR fabric? As mentioned in the above comments, the physical underlay for the fabric is standard for both RA1-2.

CsatariGergely commented 4 years ago

I agree that this should be started from the RM, and not as a new WS. Implementation of new orchestration frameworks for networks is out of scope for CNTT.

rabi-abdel commented 4 years ago

The discussion will take place in the RM meeting. Please join if you would like to get involved. No new WS is proposed to be created at this time.

rabi-abdel commented 4 years ago

Assigned to RM to take the first step at this.

ASawwaf commented 4 years ago

@markshostak, thanks for leading today's discussion, as usual. We agreed to have more discussion on the programmable fabric, and this should be started from the RM.

For the MoM, see https://wiki.lfnetworking.org/display/LN/2020-01-29+-+%5BCNTT+-+RM+Workstream+Master%5D+Agenda+and+Meeting+Minutes

karinesevilla commented 4 years ago

For me, it's better to address networking in the RM, to cover the networking aspects common to RA1 and RA2, rather than creating a new workstream.

pgoyal01 commented 4 years ago

Isn't CLOS the general network fabric model? If so, do we really need meetings for that or just the development of content?

ASawwaf commented 4 years ago

CLOS is more or less an IP fabric. The point raised by Mark was about programmable fabric, which was closed as per @rabi-abdel in the gov call, but frankly, I don't know the result.

@rabi-abdel can you advise?

sukhdevkapur commented 4 years ago

Because of the meeting timing issues, I probably will not be able to join the calls to discuss, but I want to make a point regarding programmable networking fabric. Please ensure that we do not include (or make reference to) specific implementations. Please address it at an architectural level, such as APIs based upon P4 interfaces (i.e. what is generic and what is specific), splitting of the management and control/forwarding planes, etc. This should be done in a way that keeps the architecture generic and gives vendors the flexibility to select specific implementations.

ASawwaf commented 4 years ago

As per the RM call (5th of Feb), we need to identify which chapter should include the networking requirements, as a short-term requirement to be covered in the Baldy release. So the actions are: which chapter in the RM, and what is the content/requirements?

Can you help, @sukhdevkapur @oyayan @walterkozlowski @karinesevilla?

walterkozlowski commented 4 years ago

@ASawwaf, @oyayan, @karinesevilla In my view, for Baldy we should use RM 3.2.4 Network (within Chapter 3 Modelling). The contents should define the scope of networking within the RM and in CNTT in general. I still believe that to do this we need a high-level model defining the major entities (like Fabric and SDN) and the major relations between these entities. In that way, we will be able to clearly articulate what will be in the CNTT scope and what will not. Btw: I am happy to contribute to this content.

ASawwaf commented 4 years ago

@walterkozlowski, thanks, Walter. Agreed that RM 3.2.4 Network (within Chapter 3 Modelling) should be a good place.

It would be great if you could start on some content.

thx

kedmison commented 4 years ago

Decisions as per RM meeting 2020-02-26: