aceelindem / ietf-bgp-strict-mode

Git repository devoted to the development of draft-merciaz-idr-bgp-bfd-strict-mode

Add text dealing with BFD hold-down #3

Closed jhaas-pfrc closed 1 year ago

jhaas-pfrc commented 1 year ago

Attempt to cover bfd holddown points.

albertf1 commented 1 year ago

Hi Jeff,

Sorry, for some reason I was not able to open the GitHub link at the Bloomberg office today, so I am responding from home.

I have one question about the changes, on line 243: "It is RECOMMENDED that the BFD hold-down intervals used with BFD strict-mode, when configured, use similar values."

What does this sentence mean?

Also, Section 4 of our current draft mentions this statement: "The proposed default BGP BFD Hold time value is 30 seconds. The BGP BFD Hold time value is configurable."

Why do we have to have a default BGP BFD Hold time value? If BFD is enabled, it will by definition have a BFD Hold Time value??

Also, can you please include my address in the draft:

Albert Fu Bloomberg LP 731 Lexington Avenue New York NY 10022

Thanks Albert

jhaas-pfrc commented 1 year ago

Albert,

On Jul 10, 2023, at 11:16 PM, Albert Fu @.***> wrote:

I have one question about the changes, on line 243: "It is RECOMMENDED that the BFD hold-down intervals used with BFD strict-mode, when configured, use similar values."

What does this sentence mean?

If you think about the headaches of holddown, even in existing pre-strict implementations, part of it is when BGP goes to Established on each side. However, whether that matters depends on whose implementation of holddown we're talking about.

This draft has tried very hard to not be normative about holddown because it does vary per vendor.

BFD strict says "don't let us transition out of OpenConfirm until BFD is Up".

holddown says "BFD is up only after it's up and the holddown interval has concluded"

When we transition to the OpenConfirm state, we start the BGP hold timer.

Thus, BGP hold timer > effective BFD holddown interval.

That part is hopefully obvious. Where it's messy is "when does BFD start?". Juniper's implementation will start BFD before BGP gets kicked off, rather than starting it when BGP moves out of the Idle state into the Connect state. This is because our BFD is an asynchronous mechanism and has programming latency which may depend on system scale. So, if you considered your configured holddown time to be as if it was instantaneous, or within two seconds or so of when we transition to Connect, you'd want that holddown number to be smaller than the hold timer.

The less obvious part is that holddown configuration is NOT negotiated. The hold timer is either negotiated, or defaults to 30 seconds for comparison purposes in our draft. So, if side A configures a hold-down interval << side B's, side A can transition into the Established state because BFD was Up and the hold-down time for A was satisfied, but B is still waiting. B keeps waiting without transitioning, and A eventually expires its hold timer.

The holddown timers have to be symmetric enough or within tolerances of the bgp holdtimer to do the job.
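The timing argument above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not text from the draft or any vendor's behaviour: both sides see BFD come Up at the same moment, each side's hold-down starts at that moment, and the hold timer starts on entering OpenConfirm. The names `Peer`, `ready_at` and `slower_peer_in_time` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    holddown_s: float  # locally configured; NOT negotiated with the other side


def ready_at(peer: Peer, bfd_up_at_s: float) -> float:
    """Seconds after entering OpenConfirm at which this peer treats BFD as Up.

    Assumption: hold-down simply delays the "BFD is Up" signal by holddown_s.
    """
    return bfd_up_at_s + peer.holddown_s


def slower_peer_in_time(a: Peer, b: Peer, hold_time_s: float,
                        bfd_up_at_s: float = 0.0) -> bool:
    """True if the slower side is ready before the hold timer would expire."""
    slowest = max(ready_at(a, bfd_up_at_s), ready_at(b, bfd_up_at_s))
    return slowest < hold_time_s


# Side A's hold-down is much shorter than side B's, with a 90 s hold time:
# A is ready after 5 s, B only after 120 s, so the check fails.
mismatched = slower_peer_in_time(Peer("A", 5.0), Peer("B", 120.0), 90.0)

# Hold-down on one side only (40 s versus none) still fits inside 90 s.
one_sided = slower_peer_in_time(Peer("A", 40.0), Peer("B", 0.0), 90.0)
```

In this simplified model the check depends only on the larger hold-down relative to the hold timer, which is one reading of "within tolerances of the bgp holdtimer".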

Did you have better text to recommend to cover this point?

Also, Section 4 of our current draft mentions this statement: "The proposed default BGP BFD Hold time value is 30 seconds. The BGP BFD Hold time value is configurable."

Why do we have to have a default BGP BFD Hold time value? If BFD is enabled, it will by definition have a BFD Hold Time value??

You skipped the prior sentence: "If the BFD session does not transition to the Up state, and the HoldTimer has been negotiated to a non-zero value, the BGP FSM will close the session appropriately."

This is a BGP hold time of last resort when BGP itself has a zero hold time. If we don't have this, we can linger in limbo forever.
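A minimal sketch of that fallback, assuming the rule is simply "use the negotiated hold time if it is non-zero, otherwise the configurable BGP BFD Hold time". The function name and constant are hypothetical; only the 30-second default comes from the draft text quoted above.

```python
DEFAULT_BGP_BFD_HOLD_TIME_S = 30  # proposed default quoted from the draft; configurable


def effective_hold_time_s(negotiated_hold_time_s: int,
                          bgp_bfd_hold_time_s: int = DEFAULT_BGP_BFD_HOLD_TIME_S) -> int:
    """Upper bound on how long to wait for BFD to come Up.

    A negotiated hold time of zero means BGP itself never times out, so the
    BGP BFD Hold time acts as the timer of last resort.
    """
    if negotiated_hold_time_s > 0:
        return negotiated_hold_time_s
    return bgp_bfd_hold_time_s
```

With a negotiated hold time of 90 seconds the function returns 90; with a negotiated hold time of zero it returns the 30-second fallback, so the session cannot wait for BFD indefinitely.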

jhaas-pfrc commented 1 year ago

I pushed this into master on github.

On Jul 10, 2023, at 11:16 PM, Albert Fu @.***> wrote:

Also, can you please include my address in the draft:

Albert Fu Bloomberg LP 731 Lexington Avenue New York NY 10022

albertf1 commented 1 year ago

Hi Jeff,

Sorry, it's still a bit unclear to me what "similar values" in the sentence below refers to (i.e. similar to what??): "It is RECOMMENDED that the BFD hold-down intervals used with BFD strict-mode, when configured, use similar values."

I understand the need for the BFD Holddown-Interval to be less than the BGP Hold time, or the BGP session would time out due to the BFD Holddown-Interval being too long.

I am going to use some real values of how we currently deploy BGP BFD strict, albeit on the Juniper proprietary implementation. For this application, we have a lot of redundant circuits, so when the circuit flaps, we want BFD to be stable for 40s before we start putting traffic to the circuit.

These are our current parameters:

BFD TX Interval = 150ms (i.e. BFD periodic packets)
BFD multiplier = 3
BFD Detect time = 150 × 3 = 450ms
BGP Hold time = 90s (i.e. Juniper default, 3 × 30)
BFD Holddown-Interval = 40s (i.e. we want BFD to be up for 40s before transitioning BGP to the Established state)

Which similar value does that sentence refer to? In our case, that 40s BFD Holddown-Interval is not similar to any values :-).
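For reference, the arithmetic behind those numbers can be written out as a small sketch. The function names are hypothetical, and the detect-time formula is the simplified "interval times multiplier" used in the message above, not the full negotiated calculation.

```python
def detect_time_ms(tx_interval_ms: int, multiplier: int) -> int:
    """Simplified BFD detection time: transmit interval times multiplier."""
    return tx_interval_ms * multiplier


def holddown_margin_s(bgp_hold_time_s: int, bfd_holddown_s: int) -> int:
    """Seconds left on the BGP hold time after the BFD hold-down has elapsed."""
    return bgp_hold_time_s - bfd_holddown_s


detect = detect_time_ms(150, 3)      # 150 ms x 3
margin = holddown_margin_s(90, 40)   # 90 s hold time minus 40 s hold-down
```

With these values the detect time is 450 ms and the hold-down leaves a 50-second margin before the 90-second BGP hold time would expire.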

Thanks Albert PS: Apology if I am a bit slow :-).

albertf1 commented 6 months ago

Quick comment.

I think "similar" in this context means the "hold-down" value on both sides should be similar, right?

In our case, we are thinking of configuring "hold-down" on one side only (the PE router side), and having no hold-down on the CE side (some vendors, e.g. Nokia, do not support hold-down).

I think as long as the "hold-down" value is << BGP Hold-time (default 90s), it should not be an issue??