scionassociation / scion-cp_I-D

Specification of the SCION control plane.
https://scionassociation.github.io/scion-cp_I-D/

Scalability of path discovery #42

Closed matzf closed 1 month ago

matzf commented 2 months ago

Scalability of path discovery.

Fixes #8

matzf commented 2 months ago

Thanks for fixing my typos, @nicorusti.

tzaeschke commented 2 months ago

General Note:

Some points I found could be useful to add to the doc:

Introduction

Avoiding Circular Dependencies and Partitioning

Partition and Healing

Path Exploration or Beaconing {#beaconing}

Introduction and Overview

Extending a PCB

Path-Segment Construction Beacons

PCB Validity

Propagation of PCBs {#path-prop}

Selection of PCBs to Propagate {#selection}

Storing and Selecting Candidate PCBs

Effects of Clock Inaccuracy

Path Discovery Time and Scalability {#scalability}

Intra-ISD Beaconing

Inter-ISD Beaconing

nicorusti commented 1 month ago

Thank you for your feedback @tzaeschke! I respond here regarding the scalability and clock inaccuracy sections. For other sections, and for points that we don't have time to address in this revision, I opened separate issues:

Regarding Effects of Clock Inaccuracy

I think the word "typically" is misleading here, it can be understood as "PCBs [...] are validated typically after N minutes", whereas it actually means that the maximum is typically N minutes. Rephrase to "(maximally N minutes)"

Done, maximally N minutes sounds good.

Also, AFAIK, the current configured real-world interval is more like 10-15 minutes...?

@matzf is it 1 min as in the draft, or 10-15? 10-15 feels a bit high to me.

Rephrase "The norm is 6 hours." to "... SHOULD be 6 hours"? What does 'norm' mean?

I am a bit reluctant to use RFC2119 language (uppercase SHOULD) for exactly 6 hours. This is a value that overall depends on the maximum AS path expected in the network, and it might as well be a different value. I therefore rephrased like this: "For this reason, it is inadvisable to create hops with a short expiration time; the expiration time should be around 6 hours."

"In comparison to these time scales, clock offsets in the order of minutes are immaterial." This relates only to the previous paragraph about certificates; I guess it should be attached to the previous paragraph?

Done.

Regarding Path Discovery Time and Scalability {#scalability}

The notes contain several references to "immediate cold-start PCB forwarding"

@matzf clarified here that this is not the case, so I removed that note. This should also resolve many of the consistency issues you reported.

Also, I think the calculation needs to take into account the "beacon origination interval".

To be handled in https://github.com/scionassociation/scion-cp_I-D/issues/45

"As will become apparent, the inter-ISD beaconing results in excessive overhead with very large numbers of participating core ASes." Does this need to be in the IETF spec?

Good point, I rephrased this section to: "To achieve scalability in its routing process, SCION uses a divide-and-conquer approach, partitioning ASes into ISDs. In order to benefit from this, an ideal SCION topology should keep the inter-ISD core network to a moderate size. For more specific observations, we distinguish between intra- and inter-ISD beaconing."

What is done to ensure this? What happens if the size is not moderate? What is "moderate"?

We give some numbers in the Inter-ISD Beaconing section, using an example with 1000 core ASes, which gives a rough figure. The bandwidth and computation overhead figures there should also give a rough hint of what happens if the network grows too much: the overhead becomes considerable. What is done to ensure this depends, IMHO, on how the network is deployed. I think this topic would be a better fit for the new deployment Internet-Draft, so I opened an issue there: https://github.com/scionassociation/scion-deployment_I-D/issues/1

Typo: duplicated "from the link";

Fixed

matzf commented 1 month ago

Commented on the related issues for the other sections.

Also, AFAIK, the current configured real-world interval is more like 10-15 minutes...?

@matzf is it 1 min as in the draft, or 10-15? 10-15 feels a bit high to me.

The 1 minute value seems realistic. SCIONLab uses 5 seconds for non-core beaconing and 1 minute for core beaconing. Anapaya's infrastructure reportedly runs with 30s.
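To put the intervals discussed in this thread side by side, here is a rough back-of-the-envelope sketch. It rests on a simplifying assumption that is not taken from the draft: each AS holds a PCB for at most one full propagation interval before forwarding it, so the worst-case time to cross a path is the hop count times the interval. The function name and the 10-hop path length are hypothetical and chosen only for illustration.

```python
def worst_case_discovery_seconds(hops: int, interval_s: float) -> float:
    """Upper bound on the time for a PCB to cross `hops` inter-AS links.

    Assumption (not from the draft): every AS forwards a PCB no later than
    one full propagation interval after receiving it.
    """
    if hops < 0 or interval_s < 0:
        raise ValueError("hops and interval_s must be non-negative")
    return hops * interval_s


# Intervals mentioned in this thread, in seconds.
INTERVALS_S = {
    "SCIONLab non-core (5 s)": 5,
    "Anapaya, as reported (30 s)": 30,
    "draft / SCIONLab core (1 min)": 60,
    "10 min": 600,
    "15 min": 900,
}

HOPS = 10  # hypothetical path length, for illustration only

if __name__ == "__main__":
    for label, interval in INTERVALS_S.items():
        minutes = worst_case_discovery_seconds(HOPS, interval) / 60
        print(f"{label}: up to {minutes:.1f} min over {HOPS} hops")
```

Under this assumption, a 1 minute interval keeps the worst case for a 10-hop path at about 10 minutes, while a 10-15 minute interval would consume a noticeable share of a hop lifetime of around 6 hours.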