ietf-wg-idr / draft-ietf-idr-bgp-car

Review by Nats - on Adoption call issues (Part 1) #17

Open suehares opened 10 months ago

suehares commented 10 months ago

Link for email discussion https://mailarchive.ietf.org/arch/msg/idr/202uy9M61gIN3tJdMZNLmKEQSYI/

The following review is for

with respect to the intent-driven service mapping problem and how it addresses provider network deployment scenarios that are seen today, while mainly focusing on the issues raised in IDR GitHub by the CT team.

Having spent a considerable amount of time on this review, I hope it helps operators/vendors to,

Verdict: Has Issues

suehares commented 10 months ago

part 2 - of discussion of https://mailarchive.ietf.org/arch/msg/idr/202uy9M61gIN3tJdMZNLmKEQSYI/

Setting up Key Aspects:

Quoting from RFC 1925: Twelve Networking Truths:

(4) Some things in life can never be fully appreciated nor understood unless experienced firsthand. Some things in networking can never be fully understood by someone who neither builds commercial networking equipment nor runs an operational network.

Before we begin: intent-driven service mapping using BGP is a substantial effort. It applies to "all" BGP Service Families whose nexthop is reachable via transport tunnels. These tunnels belong to various forwarding architectures such as MPLS (RSVP-TE, SRTE), IPv6 (SRv6), IPinIP or MPLSoverUDP. It introduces new constructs that allow an operator to specify the service intent. It is therefore a substantial upgrade for existing provider network deployment scenarios such as MPLS-VPN Options, CsC, Multi-Homed CE and Anycast scenarios that are prevalent today, and this review will evaluate BGP-CAR based on certain key aspects and operational criteria.

Intent-Driven Service Mapping Key Aspects:

To better understand the nature of issues raised, it is important to consider the basic intent-driven service mapping aspects.

A) Classification & Grouping: Construct to group underlay tunnels with sufficiently similar TE characteristics

DR: Color is the construct for an intent, already well-defined as per RFC 9256 - described in the introduction and main sections.

Hares: RFC 9256 references the Color-EC, which is defined in RFC 9012. The color usage is defined in RFC 9256, so color is well-defined for SR (SPRING) usage. The definitions in draft-ietf-idr-bgp-car expand this to the CAR routing solution. The statement that color is "well-defined" needs to be clarified to "well-defined for the SR routing solution".

B) Resolution & Steering: Construct for overlay routes with a Mapping Community, e.g. Color-EC, to resolve next-hop reachability to the underlay path with the same "effective color" derived from the following

Hares: Sections 2.5, 2.10, and Appendix B deal with the different types of color. As noted in the shepherd's report, enhancements for Sections 2.5 and 2.10 were suggested for -06. Further specific suggestions for improvement for these sections can be made.

C) BGP Extensions: (AFI/SAFI Encoding/Decoding) A BGP family to extend the above constructs to adjacent domains (AS/IGP)

Hares: Sections 2, 7, 8, 9, and Appendix B deal with the different types of color. As noted in the shepherd's report, enhancements for Sections 2, 7, 8, and 9 were suggested for -06. Further specific suggestions for improvement for these sections can be made based on -06. This simply seems to be a list of changes defined by the solution.

D) Path Availability & Selection: This is the final and most important aspect which has two important steps for selecting transport routes for nexthop reachability:

i) Avoid path hiding

CAR route Type-1:
     Unique "IP: Color" key in NLRI

CAR route Type-2:
     Unique "Color IP Prefix" per color-locator

ii) Path selection and Resolution should use the same "effective color"

CAR route Type-1:
     Path Selection key Prefix: IP and Type-1-NLRI-Color
     Resolution key Prefix: IP and "Effective Color" (Type-1-NLRI-Color or Color-EC on CAR route or LCM-EC on CAR route)

CAR route Type-2:
     Path Selection key Prefix: IP color prefix
     Resolution key Prefix: Not needed as it is routed
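
As a rough illustration of the two keys listed above (a sketch only, not text from the draft): the `CarRoute` class and both helper functions are hypothetical names, and the effective color is passed in as a plain argument rather than derived here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CarRoute:
    """Hypothetical, simplified CAR route (not the wire encoding)."""
    route_type: int                    # 1 = (IP, Color) NLRI, 2 = color-aware IP prefix
    prefix: str                        # endpoint IP (Type-1) or IP color prefix (Type-2)
    nlri_color: Optional[int] = None   # only present for Type-1


def path_selection_key(route: CarRoute) -> Tuple:
    """Key under which paths compete during best-path selection."""
    if route.route_type == 1:
        return (route.prefix, route.nlri_color)
    return (route.prefix,)


def resolution_key(route: CarRoute, effective_color: Optional[int]) -> Optional[Tuple]:
    """Key used for resolution; None for Type-2, which is simply routed."""
    if route.route_type == 2:
        return None
    return (route.prefix, effective_color)


anycast = CarRoute(route_type=1, prefix="IP1", nlri_color=101)
print(path_selection_key(anycast))    # ('IP1', 101)
print(resolution_key(anycast, 100))   # ('IP1', 100)
```

When the effective color differs from the NLRI color, the two keys differ, which is the situation the review refers to when it says path selection and resolution should use the same "effective color".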

suehares commented 10 months ago

Part 4 of discussion of https://mailarchive.ietf.org/arch/msg/idr/202uy9M61gIN3tJdMZNLmKEQSYI/

F3-CAR-Q1: Status = Unresolved

BGP-CAR Appendix A.7 – Anycast EP Scenario: https://mailarchive.ietf.org/arch/msg/idr/nAj25sX0x_lp09VEqUDSCxmDR_w/

Note: This is not a solution but just a workaround. While bug fixes in code are common, it is painful to see bug fixes and work-arounds in new and evolutionary technology drafts.

F3-CAR-Issue-1.1: To work around Issue-1.2 and Issue-1.3, each anycast endpoint is configured with a different CAR-NLRI-Type-1 'Color' to avoid path hiding, and the same Color-EC is additionally attached to all EPs.

DR: Color in the NLRI represents the route intent. The use of Color-EC provides an automated mechanism to resolve the next-hop and takes precedence over route color as the text you’re quoting clearly states.

Hares: The -06 text should be examined to determine whether the changes provide clarity in the procedure description on: a) next-hop resolution with Color-EC, and b) precedence in resolution (Color-EC, LCM-EC, and color in NLRI (if Type-1)).

F3-CAR-Issue-1.2: Considering the second quote in procedure III. and an Administrative domain with ColorSRTE based transport tunnels, a CAR route can resolve over a Color SRTE route with TEA. This means that for N anycast EPs, there are N colors assigned and hence N color SRTE paths per AS domain for each Color-EC. This multiplies the Color SRTE forwarding state by a factor of N. This needs to be called out in the draft.

DR: All the N anycast routes can resolve via the same SR-TE color, specified in the color-EC. This illustrates the benefits of the decoupling and indirection between route and next-hop colors.

Hares: Please cite the section that describes the procedures for resolution in -06.
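
The two positions in this exchange can be put side by side with a small counting sketch. It assumes N anycast routes, each with its own NLRI color and one shared Color-EC; the color values and the `resolution_colors` helper are made up for illustration. Which of the two behaviours the draft's procedures actually produce is the open question here and is not settled by the sketch.

```python
from typing import Dict, List, Set


def resolution_colors(routes: List[Dict[str, int]], resolve_on: str) -> Set[int]:
    """Distinct colors that next-hop resolution would request in one domain."""
    return {route[resolve_on] for route in routes}


N = 3
# One route per anycast EP: distinct NLRI color, same Color-EC (values are hypothetical).
anycast_routes = [{"nlri_color": 101 + i, "color_ec": 100} for i in range(N)]

print(len(resolution_colors(anycast_routes, "nlri_color")))  # 3
print(len(resolution_colors(anycast_routes, "color_ec")))    # 1
```

If resolution follows the per-EP NLRI color, the number of distinct colors grows with N; if it follows the shared Color-EC, it stays at one.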

F3-CAR-Issue-1.3:
Quoting from section A.7 "Both E2 (in egress domain 2) and E3 (in egress domain 3) advertise Anycast (shared) IP (IP1, C1) with same label L1"

DR: Not sure what is specific to CAR SAFI here. Static MPLS labels / SR prefix-SIDs / anycast SIDs are all existing well-known features.

Hares: Do you state a requirement for static labels in -06 of the draft? I find static labels discussed in section 5.5.2. I find the word "label" in sections 2.9.2.1 (Label TLV), 2.9.2.2 (Label index), 5.3, 8, A.3.2, C.1 and C.2 (no labels). Would you let me know which section covers this issue?

Hares: Issue needs to be resolved prior to WG LC.

F3-CAR-Issue-1.4: How does ECMP and protection work here?

DR: Section 2.7 describes the key is (E,C), so multipathing is automatic and does not require additional “import” as in CT.

Hares: This is described for the BGP CAR SAFI Type-1 route (E, C) in section 2.7 of version -06. The multipathing in section 2.7 refers to advertisement within a single AS between two Area Border Routers (ABRs). Is the intent of this section a single AS? Or do you need to apply this to multiple ASes in a path between the egress AS and the ingress AS?

Hares: Should be discussed prior to WG LC

DR: Appendix A.7 provides examples of two choices. Not sure what is ambiguous. [The] choice of color and control for specific intent use-cases is in the hands of the operator.

Hares: Question for Nats: Is section A.7 clear? If not, what is unclear? If A.7 is clear, do you have a procedure in the main text that explains A.7?

F3-CAR-Issue-1.5: Section B.2 - N:M distribution

Figure 12 (N:M illustration) attaches at least two Color-ECs on the CAR route while overloading the CAR-NLRI-Type-1 'color' as both distinguisher and intent for unicast EPs. There are no procedures stating which Color-EC should take precedence in which AS/ABR domain. This needs to be clarified. The draft should also note that one additional Color-EC is needed on the CAR route for each M (M1, M2, ...).

Hares: I want to verify that this example follows the description from the 1/29/2024 presentation: a) Color-EC is used to resolve the next hop and b) NLRI color (E, C) is used to indicate the intent of the traffic. There is no LCM-EC in this example.

Hares: Example in B.2 needs to be discussed.

DR: The example clearly illustrates how the entire e2e resolution can be automated by leveraging the power of hierarchy and the standard and widely implemented/deployed Color-EC mechanism from RFC 9256.

Hares: A discussion of how your comments on this example match your 1/29/2024 presentation would help the shepherd.

F3-CAR-Issue-1.6: Section B.2 - N:M distribution and Anycast

In Figure 12 (N:M illustration), now with Anycast in play and as per F3-CAR-Issue-1.1, there will be at least "three" Color-ECs: Color-EC-Anycast, which represents Anycast, plus the "two" Color-ECs for the N:M distribution. As per A.7, CAR-NLRI-Type-1 'color' now becomes the distinguisher and Color-EC-Anycast becomes the intent. This is a "critical" issue because there is no clear precedence in picking the Color-EC assigned for Anycast. This needs to be clarified as part of the draft.

Hares: Version -06 does add text to clarify the precedence between colors (LCM-EC, Color-EC, and (E,C)). However, -06 version only has 2 colors (Color-EC and (E,C)) in diagram B.2. A review of the

DR: The Color-EC assignments need not change if the route is for an Anycast or Egress Domain Prefix. So there is no issue above.

Hares: Are you stating that, in the example in Appendix B.2, the routes (E, C) remain the same? This means (E2,100), (E2,200), (E2,300), and (E2,400) remain the same across the domain? The next hops are changing for this route (E2 -> 231 -> 121)? Does the additional text on precedence in -06 help this example?

F3-CAR-Q2: Status = Unresolved

BGP-CAR – Consensus on the need for resolution-schemes: https://mailarchive.ietf.org/arch/msg/idr/g6ZCJYzWwgRsilWlZY74MTv9Fqk/

F3-CAR-Issue-2.1: Based on the above issue list, especially Issue-1.1, Issue-1.4, F3-CAR-Issue-1.1, 1.4, 1.5 and 1.6, it is indeed important for CAR to have a clear scheme for resolution that handles all of these issues.

DR: None of the above comments raised any specific issue that was not discussed previously or has not been addressed in the draft. Hope my responses have provided clarity. If some point is not clear, please let me know.

Hares: Which sections deal with route resolution in the draft? These sections can be reviewed specifically in version -06.

F3-CAR-Q3: Status = Unresolved

CAR-Q3 - Handling [of] LCM and Color Extended Communities: https://mailarchive.ietf.org/arch/msg/idr/w5ROKVQPtVcI_BTBXfnKpKB4h4k/

F3-CAR-Issue-3.1: Considering quotes from section 2.10 in Procedure (III.), F3-CAR-Issue-1.1 to 1.6 apply. Additionally, LCM-EC needs to be factored into F3-CAR-Issue-2.1.

F3-CAR-Issue-3.2: Considering the following quotes from sections 2.5, 2.9.4, and 2.10 in Procedure (III.) respectively,

These normative texts contradict each other. If Color-EC takes precedence over CAR-NLRI-Type-1 'color', then Color-EC becomes the "intent" and CAR-NLRI-Type-1 'color' becomes purely the distinguisher and is not overloaded with intent. In such cases, it is not clear whether the Color-EC value should be the effective intent instead of the NLRI color, nor which one should be copied into LCM-EC, since CAR-NLRI-Type-1 'color' is just a distinguisher.

This needs to be clarified as part of the draft text.

DR: The text clearly states the above order of precedence is for CAR route resolution. Color-EC, when present, represents the intent/color used for resolving the BGP CAR next-hop, following semantics already established in RFC 9256. If not present, resolution uses the route color (whether it’s in the NLRI or LCM). Let’s try to avoid cherry-picking text out of context and mixing them.

Hares: Section 2.5 in version 06 states: "Local policy takes precedence over color-based automated resolution. For a CAR route, Color-EC color takes precedence over route NLRI color. When LCM-EC is present, Color-EC color takes precedence over LCM-EC color." Does this clear up the precedences?

Hares: Query needs to be made to Nats.
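
The precedence quoted from Section 2.5 of -06 can be written out as a short sketch. The function name and arguments are hypothetical, and modelling "local policy" as an optional color override is an assumption made for illustration; the draft does not describe local policy in those terms.

```python
from typing import Optional


def effective_color(
    nlri_color: int,
    color_ec: Optional[int] = None,
    lcm_ec: Optional[int] = None,
    local_policy_color: Optional[int] = None,  # assumption: policy modelled as an override
) -> int:
    """Pick the color used for resolution, following the quoted order."""
    if local_policy_color is not None:   # local policy takes precedence over automation
        return local_policy_color
    if color_ec is not None:             # Color-EC beats LCM-EC and NLRI color
        return color_ec
    if lcm_ec is not None:               # LCM-EC replaces the NLRI color when present
        return lcm_ec
    return nlri_color


print(effective_color(nlri_color=101))                            # 101
print(effective_color(nlri_color=101, lcm_ec=201))                # 201
print(effective_color(nlri_color=101, lcm_ec=201, color_ec=100))  # 100
```

Under this reading, the NLRI color only drives resolution when neither EC is attached and no local policy applies.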

F3-CAR-Q4: Status = Unresolved

BGP-CAR – Mis-Routing in Non-agreeing color-domains for Anycast EPs: https://mailarchive.ietf.org/arch/msg/idr/OOZOBSyjdAYBar8NxvOqo6-5fAc/

F3-CAR-Issue-4.1: Quoting from section 2.9.4 (version -03)

  "If two BGP paths for a route have different LCM values,
   it is considered an error and the route is not
   considered for bestpath selection."

The above text still does not address the misrouting problems that might arise in Anycast with Non-Agreeing color domains as indicated in the above issue link.

However, this seems to have been "worked around" by using Color-EC as the intent, with CAR-NLRI-Type-1 'color' now acting solely as a distinguisher; hence its derived LCM-EC also acts as a distinguisher in the receiving non-agreeing domain. The local policy operations for this scenario need to be clearly specified in the draft to avoid ambiguity. It is also important to note that a portion of the colors from the "intent namespace" is being used purely as distinguishers while others are overloaded with intent as well. The draft needs to clarify how the intent namespace(s) are managed so that local policy schemes can derive how this namespace is used.
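
For reference, the rule quoted from section 2.9.4 (version -03) amounts to a consistency check across the paths of one route. The sketch below is illustrative only: the function name is hypothetical, and ignoring paths that carry no LCM-EC is an assumption, since the quoted text does not say how such paths are treated.

```python
from typing import List, Optional


def eligible_for_bestpath(lcm_values: List[Optional[int]]) -> bool:
    """Return False when paths for the same route carry different LCM values."""
    # Assumption: a path without LCM-EC (None) does not count as a conflicting value.
    present = {value for value in lcm_values if value is not None}
    return len(present) <= 1


print(eligible_for_bestpath([200, 200]))  # True
print(eligible_for_bestpath([200, 300]))  # False
```

The check only detects disagreement between paths; it does not by itself say anything about whether the agreed value means the same intent in the receiving domain.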

DR: I’ve already addressed this above and previously.

" When color-aware routes propagate across a color domain boundary, there is typically no need for coordinating color assignments, since the IP prefix is unique in the transport network, and hence makes the color scope also unique and non-conflicting. The color only needs to be re-mapped into a local color assigned for the same intent (which is carried in the LCM-EC)."

It would be useful to point to an example with LCM-EC changes for Type-1 and Type-2. Also, it would be good to have the example of Color-EC (Appendix B.2) mentioned as an example of Color-EC resolution.

F3-CAR-Q5: Status [is] Resolved

Update Packing Observations are considered resolved. No additional text is needed.

DR: Thank you.

suehares commented 10 months ago

Part 3 of discussion of https://mailarchive.ietf.org/arch/msg/idr/202uy9M61gIN3tJdMZNLmKEQSYI/

Review [By Nats] (follows comment 2)

Switching back to the open issues and scenarios, this review will focus on evaluating CAR on the above set of aspects and criteria.

MAJOR: "Where is my color?"

DR: The draft clearly specifies where color is derived from and for what purposes. Specific comments are addressed below.

Hares: As pointed out by my Shepherd's report on 1/26/2024, the text regarding the usage of LCM-EC and Color-EC needed upgrading. A discussion of this point after -06 to see if the improvements have addressed the issue is warranted.

These are derived from the following procedures:

PROCEDURE I: CAR Route validation

Quoting from section 2.4: BGP CAR Route Validation _[version-03]_ 
  "A BGP CAR path (E, C) from N with encapsulation T is valid if color-
   aware path (N, C) exists with encapsulation T available in dataplane."

  " A local policy may customize the validation process:
   *  the color constraint in the first check may be relaxed: instead N
      is reachable via alternate color(s) or in the default routing
      table"
      ...
  "A path that is not valid MUST NOT be considered for BGP best path
   selection."

Hares: Feedback is needed from Nats on specific improvements to -06 text for 2.4.
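
To make the quoted validation rule from section 2.4 concrete, here is a minimal sketch. All names are hypothetical. Two simplifications are assumptions, not statements from the draft: the dataplane is a set of (N, C, T) tuples, and the default routing table is a set of reachable next hops.

```python
from typing import FrozenSet, Iterable, Set, Tuple

ColorAwarePath = Tuple[str, int, str]  # (next hop N, color C, encapsulation T)


def is_valid_car_path(
    next_hop: str,
    color: int,
    encap: str,
    dataplane: Set[ColorAwarePath],
    alternate_colors: Iterable[int] = (),          # local policy: relaxed color check
    default_table: FrozenSet[str] = frozenset(),   # local policy: fall back to default table
) -> bool:
    """Valid if color-aware path (N, C) with encapsulation T is available."""
    if (next_hop, color, encap) in dataplane:
        return True
    # Relaxations below apply only when local policy supplies them.
    if any((next_hop, alt, encap) in dataplane for alt in alternate_colors):
        return True
    return next_hop in default_table


dataplane = {("N1", 100, "SR-MPLS")}
print(is_valid_car_path("N1", 100, "SR-MPLS", dataplane))                          # True
print(is_valid_car_path("N1", 200, "SR-MPLS", dataplane))                          # False
print(is_valid_car_path("N1", 200, "SR-MPLS", dataplane, alternate_colors=[100]))  # True
```

A path for which this returns False would, per the quoted text, be excluded from BGP best path selection.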

PROCEDURE II: Path Selection and Availability

Quoting from section 2.7: Path Availability

  "The (E, C) route inherently provides availability of redundant paths
   at every hop, identical to BGP-LU or BGP IP."

Hares: Feedback is needed from Nats on what is unclear.

PROCEDURE III: CAR Route resolution

Quoting from section 2.5: BGP CAR Route Resolution

  "For a CAR route, Color-EC color takes precedence over route NLRI color."

  "A BGP CAR route may recursively resolve over a BGP route carrying
   Tunnel Encapsulation attribute. Procedures of section 8 of [RFC9012]
   apply in presence of BGP Color EC in the CAR route. They are
   extended to use LCM EC and Color in CAR route NLRI as per above and
   Section 2.9.4 in absence of BGP Color EC."

Quoting from section 2.9.4: LCM-EC

  "When a CAR route crosses the originator's color domain boundary, LCM-EC is added."

  "An implementation SHOULD NOT send more than one instance of the LCM-
   EC.  However, if more than one instance is received, an
   implementation MUST disregard all instances other than the one with
   the numerically highest value"

  "If present, LCM-EC is the effective intent of a BGP CAR route."

  "LCM-EC Color is used instead of the Color in CAR route NLRI for
   procedures described in earlier sections such as route validation,
   resolution, AIGP calculation and steering."
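
The section 2.9.4 quotes above describe two small steps: pick one LCM-EC when several are received, then use it in place of the NLRI color. A sketch under those quotes follows; the function names are hypothetical, and an LCM-EC is reduced to its integer color value.

```python
from typing import List, Optional


def select_lcm_ec(instances: List[int]) -> Optional[int]:
    """Keep only the instance with the numerically highest value, if any."""
    return max(instances) if instances else None


def route_color(nlri_color: int, lcm_instances: List[int]) -> int:
    """LCM-EC color, when present, is used instead of the NLRI color."""
    lcm = select_lcm_ec(lcm_instances)
    return lcm if lcm is not None else nlri_color


print(route_color(100, []))          # 100
print(route_color(100, [200]))       # 200
print(route_color(100, [200, 300]))  # 300
```

This covers only the route color; any Color-EC on the route is outside the scope of this sketch.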

Quoting from section 2.10: LCM and BGP Color Extended Community usage

  "Default order of processing for resolution in presence of LCM-EC is local policy, then BGP Color-EC color, and finally LCM-EC color"

This CAR route will have one or more of the following depending on the NLRI type and the Scenario being addressed.

Item | CAR color variable | Scenario
=====================================================
A. | Color-EC color | Anycast, Single Domain N:M color
B. | LCM-EC color | Multiple color domains
C. | CAR-NLRI-Type-1 'color' | Default, used for path availability and selection additionally
D. | CAR-NLRI-Type-2 'IP Prefix' | Defined in Section 2.9.3, 8, 9 and 10.2
=====================================================

Hares: As noted in the shepherd's review, clarifications on the processing and handling of LCM-EC and Color-EC are necessary for the -05 text. A walk-through of -06 should be made to determine if any further work should be done.

Nats: Quoting from RFC 1925: Twelve Networking Truths:

(3) With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead.

Issue-1.1: As mentioned in key aspects, the path selection key and the resolution key can be different depending on the scenario. This will cause misrouting issues such as the ones mentioned in F3-CAR-Q4 below. [Some] specific text needs to be added to address [this] scenario.

DR: There is no practical misrouting issue as has been extensively discussed during the WG adoption call. See the summary of a substantive discussion between Bruno and Jeff. https://mailarchive.ietf.org/arch/msg/idr/ZKcFdSSn02hFTqe6p4pNOgNjDYY/

DR: As discussed there and recommended by Jeff, we’ve added relevant text in the Operational / Manageability considerations [in] section 11.

Issue-1.2: CAR-NLRI-Type-1 'color' is overloaded for path selection, path availability as well as intent. While this solves path hiding for Unicast EPs, there is path hiding at ABRs/ASBRs in anycast and multihomed EP scenarios because both Prefix and Color are [the] same in all paths. As a result, not all paths are visible to the ingress domain and thus end-to-end. Therefore, overloading CAR-NLRI-Type-1 'color' as both the distinguisher (RD) as well as "Intent" needs to be explicitly called out for each scenario.

DR reply
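
The path-hiding concern in Issue-1.2 can be illustrated with a toy border router that re-advertises one best path per NLRI key. This is a sketch only: the `advertise_best_paths` helper is hypothetical, and the single numeric `metric` stands in for the whole BGP decision process.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]          # (prefix, color) as the NLRI key
Path = Tuple[Key, str, int]    # (key, advertising egress, metric)


def advertise_best_paths(paths: List[Path]) -> Dict[Key, Tuple[str, int]]:
    """Keep one best path per key, as a border router would re-advertise."""
    best: Dict[Key, Tuple[str, int]] = {}
    for key, egress, metric in paths:
        if key not in best or metric < best[key][1]:
            best[key] = (egress, metric)
    return best


# Anycast: E2 and E3 advertise the same (prefix, color), so one path is hidden.
same_key = [(("IP1", 100), "E2", 10), (("IP1", 100), "E3", 20)]
print(advertise_best_paths(same_key))   # {('IP1', 100): ('E2', 10)}

# Distinct colors per EP keep both paths visible upstream.
distinct = [(("IP1", 101), "E2", 10), (("IP1", 102), "E3", 20)]
print(len(advertise_best_paths(distinct)))  # 2
```

The second case is the situation described in F3-CAR-Issue-1.1, where the NLRI color acts as a distinguisher.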

Issue-1.3: Migration from unicast to anycast is not seamless for E2E path availability use-cases.

DR reply

Issue-1.4: Colorless Transport Tunnels with varied intents like RSVP-TE, MPLSoUDP, and IPinIP are quite prevalent in existing brownfield [networks] and customers do foresee these deployments continuing into the next decade as well.

DR: Color is indeed the construct for resolution, augmented by local policy to map as needed to traditional, color-unaware mechanisms. Please see Section 2.5.

suehares commented 9 months ago

Shepherd has yet to review this for editorial issues.
Technical issues are considered closed.