scionproto / scion

SCION Internet Architecture
https://scion.org
Apache License 2.0

Proposal: Cross-Segment Hop Fields #4439

Closed matzf closed 4 months ago

matzf commented 12 months ago

Background

SCION path segments are sequences of hop fields, each of which authorizes transit through an AS from one "ingress" interface to a specific "egress" interface. The hop fields are validated with a MAC based on the secret key of each AS. There is a special case for transitioning from one segment to the next: a router processes two hop fields in this case, the last hop field of one segment and the first hop field of the next segment. Neither of these two hop fields corresponds to the "hop" effectively taken through the AS. These cross-segment hops are always implicitly authorized; the SCION model assumes that hops between interfaces of the right types (child-child, child-core, core-child) are always allowed.

Problems

Proposal

Introduce an explicit "cross-segment hop field" that is announced in beacons. This cross-segment hop field authorizes the hop between two interfaces in the AS at the cross-over point between two segments. This applies both to "shortcut" cross-overs (between an up and a down segment) and to core cross-overs (up to core segment, core to down segment).

In the path-segment-combined data plane path, this hop field can (only) be used as the last(/first) hop field of a segment. It is the only hop field needed for this AS, the next(/previous) hop field is already the hop through the subsequent AS. Consequently, the end-to-end path is shorter by one hop field per segment cross-over compared to the current approach.

Illustration: "shortcut" segment cross-over (up- to down segment shortcut).

path-auth-xover-comparison-v3

Details

AS Entries

We extend the AS Entry with a repeated CrossEntry cross_entries, i.e. a list of cross-segment hop entries. Each of these cross-hops contains the interface IDs, expiration time and MAC to authorize the cross-segment hop from the beacon's egress interface to a specific range (see below) of other interfaces.

 message ASEntrySignedBody {
     // The required regular hop entry.
     HopEntry hop_entry = 3;
     // Optional peer entries.
     repeated PeerEntry peer_entries = 4;
+    // Optional cross-segment hop entries.
+    // Each entry refers to the hop between `cross_entry.interface` and `hop_entry.egress`.
+    repeated CrossEntry cross_entries = 7;
     // ...
}
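message CrossEntry {
    // Different representation options to minimize size overhead (needs to be checked and further tweaked).
    oneof interface {
        uint32 id = 1;
        InterfaceRange range = 2;
        // A oneof cannot contain repeated fields, hence the wrapper message.
        InterfaceRanges ranges = 3;
    }
    uint32 exp_time = 4;
    // MAC used in the dataplane to verify the hop field.
    bytes mac = 5;
}

// Wrapper for a list of interface ranges.
message InterfaceRanges {
    repeated InterfaceRange ranges = 1;
}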

// Range of interfaces
message InterfaceRange {
    uint32 first = 1; // inclusive
    uint32 last = 2;  // inclusive
}

The cross-segment hop fields are announced in the beacons of the intra-ISD beaconing, but not in the core beaconing. It's sufficient to include the cross-segment hop information in the up/down segments, as any type of segment cross-over in SCION involves an up/down segment (there are no core-core segment cross-overs).

Interface ranges

Announcing cross-segment hops for every allowed combination of interface pairs in very large ASes (with many thousands of child/core interfaces) could lead to hugely inflated path-construction beacons. For the maximum number of interfaces in an AS (16-bit interface IDs), this size overhead could be on the order of 1 MB per AS entry if done naively (summing to ~64 GB overhead across interfaces).
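As a rough back-of-the-envelope check of these numbers, the following sketch assumes about 16 bytes per encoded cross entry (interface ID, expiration, MAC and encoding overhead) and one beacon per egress interface. The per-entry size is an assumption for illustration, not a figure from the proposal.

```python
# Back-of-the-envelope estimate of the naive per-pair announcement overhead.
# ASSUMPTION: ~16 bytes per cross entry; the proposal does not fix this size.
MAX_INTERFACES = 2**16       # 16-bit interface IDs
BYTES_PER_CROSS_ENTRY = 16   # hypothetical encoded size of one CrossEntry


def naive_overhead_bytes(num_interfaces, entry_size):
    """Return (bytes per AS entry, bytes summed over all egress interfaces)."""
    per_as_entry = num_interfaces * entry_size
    total = per_as_entry * num_interfaces
    return per_as_entry, total


per_entry, total = naive_overhead_bytes(MAX_INTERFACES, BYTES_PER_CROSS_ENTRY)
print(per_entry / 2**20, "MiB per AS entry")       # 1.0
print(total / 2**30, "GiB across all interfaces")  # 64.0
```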

To address this scaling concern, we can reuse the same hop field for ranges of interface IDs. The MAC is computed for one specific value identifying the range, e.g. the first interface ID in the range. In the hop field carried in an individual packet, we still only include the specific interface that the packet should traverse. During the verification of the hop field MAC, the router maps this interface back to the interface range and uses the corresponding input in the MAC computation. Typically, this will be nothing more than applying a bitmask; the details of the MAC computation are an AS-local choice.
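A minimal sketch of this idea, under loudly stated assumptions: HMAC-SHA256 truncated to 6 bytes stands in for whatever MAC the AS actually uses, the MAC input layout is made up, and `to_range_id` is a hypothetical AS-local mapping from a concrete interface ID to the value identifying its range.

```python
import hashlib
import hmac

# ASSUMPTIONS (illustration only): HMAC-SHA256 truncated to 6 bytes stands in
# for the AS's real hop-field MAC, and the input layout below is invented.
MAC_LEN = 6


def cross_hop_mac(key, exp_time, range_id, egress):
    """MAC over the value identifying the interface range, not one interface."""
    msg = (exp_time.to_bytes(4, "big")
           + range_id.to_bytes(2, "big")
           + egress.to_bytes(2, "big"))
    return hmac.new(key, msg, hashlib.sha256).digest()[:MAC_LEN]


def verify_cross_hop(key, exp_time, ingress, egress, mac, to_range_id):
    """Router side: map the packet's concrete interface to its range, then check."""
    expected = cross_hop_mac(key, exp_time, to_range_id(ingress), egress)
    return hmac.compare_digest(expected, mac)


key = b"as-local-secret"


def to_range_id(ifid):
    # Hypothetical AS-local choice: blocks of 16 interfaces.
    return ifid & ~0xF


# The control service announces one MAC for the whole range 32..47.
mac = cross_hop_mac(key, exp_time=63, range_id=32, egress=5)
print(verify_cross_hop(key, 63, 40, 5, mac, to_range_id))  # 40 is in 32..47
print(verify_cross_hop(key, 63, 50, 5, mac, to_range_id))  # 50 is in 48..63
```

The same announced MAC verifies for every interface in the range, while the packet still names the one interface it actually traverses.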

Path segment combination

In the path-segment combination, currently all segment crossings in one AS are considered allowed. The cross hops change this; only segment combinations which can be connected with a cross hop are considered allowed. For this, the segment combinator needs to take into account the interface ranges for which a cross hop is applicable.
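The check in the combinator could look roughly like the sketch below. The `CrossEntry` class here is a hypothetical in-memory form (every representation option normalized to a list of inclusive ranges), not the protobuf message itself.

```python
from dataclasses import dataclass


@dataclass
class CrossEntry:
    """Hypothetical in-memory form; ranges are inclusive (first, last) pairs."""
    ranges: list
    exp_time: int
    mac: bytes = b""


def covers(entry, ifid):
    """True if the cross hop is applicable to interface ifid."""
    return any(first <= ifid <= last for first, last in entry.ranges)


def can_cross(cross_entries, other_segment_ifid):
    """A crossing is allowed only if some cross hop covers the interface
    that the other segment uses in this AS."""
    return any(covers(e, other_segment_ifid) for e in cross_entries)


entries = [
    CrossEntry(ranges=[(1, 1)], exp_time=63),
    CrossEntry(ranges=[(16, 31), (64, 79)], exp_time=63),
]
print(can_cross(entries, 70))  # covered by the second entry
print(can_cross(entries, 40))  # not covered by any entry
```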

Magic trick: the vanishing core segment

There is one special case to consider: the first and the last hop of a core segment may be replaced with a cross-hop, crossing over to an up/down segment. As an emerging feature of this proposal, core segments consisting of only a single inter-AS link (two hop fields) may be elided entirely!

The availability of the core segment is still crucial for the segment combination, as it is the information linking the two cross-hops. Note that the expiration timestamp now comes (solely) from the cross-hops. This could potentially allow using an expired core segment. This should be taken into account during the beaconing, ensuring that the cross-hops don't "live" longer than the intended lifetime of the corresponding core segment.

path-auth-xover-core-v1

MAC chaining

Background: Generally, the hop field MACs are chained, by including the previous (in construction direction) hop fields in the MAC input. Specifically, this is based on a 16-bit XOR of the preceding hop field MAC values. In the data plane, this XOR is not explicitly computed over all the MACs at every hop, but instead maintained as a mutable field in the packet header (currently called "SegID", see https://docs.scion.org/en/latest/protocols/scion-header.html#info-field). The purpose of the MAC chaining mechanism is to prevent inserting, removing, repeating or otherwise tampering with the order of hop fields in the path. This is particularly relevant for the core segments (as there is no inherent directionality in the core links that would help to detect or prevent loops).

The cross hop field is not part of this MAC chain. (It cannot be; otherwise the subsequent hop fields would need to be chained to different hop fields (the main hop field and all the different cross hop fields), which would result in a multiplicative explosion of hop field MACs to announce.) Instead, we use the same "trick" that is used for the peering hops. Each cross hop field MAC is chained to the regular hop_entry MAC of its AS entry. Thus, it is chained to the same MAC as the regular hop entry in the next AS Entry (in construction direction). While processing in the router, it suffices to not update the SegID accumulator when processing the cross hop.
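The accumulator handling can be sketched as follows. This assumes, as I understand the current data plane, that the 16-bit XOR uses the first two bytes of each hop field MAC; the function name is hypothetical.

```python
def next_seg_id(seg_id, hop_mac, is_cross_hop):
    """SegID accumulator after processing one hop field in construction direction.

    Regular hops fold the first two MAC bytes into the 16-bit accumulator
    (assumption about which MAC bytes are used); cross hops, like peering
    hops, leave the accumulator untouched.
    """
    if is_cross_hop:
        return seg_id
    return seg_id ^ int.from_bytes(hop_mac[:2], "big")


seg_id = 0x1234
seg_id = next_seg_id(seg_id, bytes.fromhex("abcd00000000"), is_cross_hop=False)
print(hex(seg_id))  # 0xb9f9
seg_id = next_seg_id(seg_id, bytes.fromhex("ffff00000000"), is_cross_hop=True)
print(hex(seg_id))  # 0xb9f9, unchanged by the cross hop
```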

Claim: this approach results in effectively the same tampering protection properties as the current segment cross-over approach. A formal proof of this would be ideal.

Processing in the router

The processing of cross hops in the router is virtually identical to processing peering hops. We can reuse the same bit in the InfoField:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-      |r r r r r r P C|      RSV      |             SegID             |
+      |r r r r r r X C|      RSV      |             SegID             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           Timestamp                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   r
       Unused and reserved for future use.
-  P
+  X
+      Cross-hop flag. If this is set, the top-most hop field of the path segment
+      is a cross-hop field. Top-most is the first in construction direction (C flag) / last otherwise.
+      A cross-hop field has special handling for MAC chaining.
   C
       Construction direction flag. If set to true then the hop fields are arranged
       in the direction they have been constructed during beaconing.

   ...

The special MAC chaining for cross hops (X flag) is explained above. To repeat, the router just skips updating the SegID accumulator for the cross-hop fields.

In addition to the special MAC chaining for cross hops, the router may need to take into account interface ranges in the MAC computation. The router needs to map the current hop field ingress interface to the corresponding interface range and use the appropriate input for the MAC computation. As described above, this mapping is an AS-local choice. If the ranges are suitably organized, this can be done with a simple bitmask operation.

Compatibility, Transition

The support in the router and control service can be minimal; as a first step, they can support only "catch-all" cross-hops that allow connecting any child-child or child-core interface pair. Later, they could e.g. allow expressing policies but without support for the interface ranges (as long as ASes are not very big). The endpoints (i.e. the path-segment combinator), however, should support the full range of the feature from the start, to allow rolling out the extended policy features in individual ASes without further transition period.

Discussion

shitz commented 12 months ago

Hi @matzf! Overall, I do like the proposal a lot, good work!

Some immediate questions about the interface ranges:

With your proposal I don't think it's possible to have overlapping ranges, is that correct? Or if there are overlaps, then the router would need to try potentially all of them during MAC verification, right?

Also, how does the processing work if there are multiple interface ranges in the CrossEntry? There would need to be something that identifies the list of ranges. How can a router efficiently go from interface id to the identifier for the list of interface ranges to calculate the MAC? Would that also mean that an interface can only be in a single list of interface ranges?

matzf commented 12 months ago

Thanks!

Good question, I didn't think about overlapping interface ranges. From the endpoint's perspective, this probably does not complicate anything. The path-segment combinator can arbitrarily pick one of the applicable cross hops if the ranges overlap; very likely it would make sense to prefer the one with latest expiration. If we want to support overlapping ranges in the router, checking all candidate MACs, as you suggest, would work. As a (not entirely convincing) alternative, an AS could decide to encode a small index for the applicable range in the MAC part of the hopfield; cutting off a few bits of that MAC will probably not hurt.
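A small sketch of that selection in the path-segment combinator, with cross hops represented as plain dicts (a hypothetical representation, inclusive ranges):

```python
def pick_cross_hop(cross_hops, ifid):
    """Among cross hops whose inclusive range covers ifid, prefer the one
    with the latest expiration; return None if none applies."""
    applicable = [h for h in cross_hops if h["first"] <= ifid <= h["last"]]
    return max(applicable, key=lambda h: h["exp_time"], default=None)


hops = [
    {"first": 0, "last": 15, "exp_time": 100},
    {"first": 8, "last": 23, "exp_time": 200},  # overlaps the first on 8..15
]
print(pick_cross_hop(hops, 10))  # both apply, the second expires later
print(pick_cross_hop(hops, 3))   # only the first applies
print(pick_cross_hop(hops, 50))  # None
```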

Generally, my thought was that we should encode the information in the CrossEntry as generically as possible, and support this fully in the path-segment combinator, without making any assumptions. In the router, however, we only need to support what the control service of the local AS will announce in the beacons. Or, the other way around; the control service should know, hard-coded or configurable, which types of cross-hop encoding are supported by (which of) the AS's routers and should create the CrossEntry(s) accordingly. The endpoint, i.e. the path-segment combinator, should be generic because it's far away, not typically under the control of the AS where the router will apply the cross-hop logic. By keeping this generic, we avoid having to touch the endpoints when control service / router of a transit AS are extended.

In practice, I thought that we might start by supporting interface ranges of a single, configurable, power-of-two size n = 2^k, so that we'd have the ranges [0, (2^k)-1], ..., [i * (2^k), (i+1) * (2^k) - 1], ... If we then use the value i * 2^k in the MAC input for range i, the router only needs to clear the lowest k bits in the (ingress-) interface ID for the MAC input.
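For illustration, the bitmask mapping is a one-liner (the function name is made up):

```python
def range_mac_input(ifid, k):
    """Clear the lowest k bits: maps ifid to i * 2**k, the start of its range."""
    return ifid & ~((1 << k) - 1)


# With k = 4, ranges are [0, 15], [16, 31], [32, 47], ...
for ifid in (0, 15, 16, 37):
    print(ifid, "->", range_mac_input(ifid, 4))
# 0 -> 0, 15 -> 0, 16 -> 16, 37 -> 32
```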

Supporting more flexible ranges in the router is conceptually easy; it's just a lookup to find the matching interface range. Each interface range corresponds to a MAC input. If there are multiple interface ranges in the CrossEntry, there are just multiple ranges that correspond to the same MAC input. This table is a part of the router's configuration. Whether this can be done efficiently depends a lot on the underlying platform (and the performance expectations). I guess that anywhere we should be able to support stupid linear search up to a table size of some small number. Doing more than this will need some tricks, e.g. the power-of-two bitmask from above, or some other number sequence, or some platforms may have special components that are suitable to accelerate this. Either way it will likely be an implementation choice primarily of the router implementation, in coordination with the (local) control service.
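The "stupid linear search" variant might look like this; the table contents and layout are hypothetical and only show how several ranges can share one MAC input:

```python
# Hypothetical router configuration: (first, last, mac_input), ranges inclusive.
RANGE_TABLE = [
    (1, 9, 1),
    (10, 99, 10),
    (200, 299, 10),  # second range sharing the MAC input of 10..99
]


def lookup_mac_input(table, ifid):
    """Linear search over the range table; None if no range covers ifid."""
    for first, last, mac_input in table:
        if first <= ifid <= last:
            return mac_input
    return None


print(lookup_mac_input(RANGE_TABLE, 5))    # 1
print(lookup_mac_input(RANGE_TABLE, 250))  # 10
print(lookup_mac_input(RANGE_TABLE, 150))  # None
```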

mlegner commented 11 months ago

Really cool proposal, @matzf! 💯 Just a few thoughts/ideas from my side.

I'm not too happy about the interface ranges as they somewhat violate the statelessness property that SCION generally adheres to. However, I agree that the naive approach would not scale.

I may have a third suggestion of how to authenticate crossover hops, somewhat related to the idea of "encod[ing] a small index for the applicable range in the MAC part of the hopfield":

An AS can define (potentially overlapping) groups of interfaces identified by a short GroupID. Semantically, this would allow forwarding between any pair of interfaces that have at least one group in common.

An AS entry can then contain a set of "group authenticators" including a MAC over the SegID, GroupID, and child interface; such a group authenticator certifies that the child interface in the context of this beacon belongs to the corresponding group. When constructing a dataplane path, a cross-hop can be constructed from two such authenticators with the same GroupID. The crossover hop field would then contain the XOR of the two MACs.

An AS entry would then have at most as many additional authenticators as there are group IDs, although in the default case probably just one. As a result, this suggestion solves most of the stated problems without introducing a scalability issue (like the naive strawman approach) or statefulness on routers (as the interface ranges).
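To make the idea concrete, here is a rough sketch. It assumes HMAC-SHA256 truncated to 6 bytes as a stand-in MAC, 2-byte encodings for SegID, GroupID and interface ID, and that the router somehow learns the GroupID and both SegIDs; none of these details are fixed by the suggestion above.

```python
import hashlib
import hmac

MAC_LEN = 6  # assumption: stand-in for the real hop-field MAC length


def group_authenticator(key, seg_id, group_id, child_ifid):
    """MAC over SegID, GroupID and child interface (input layout is invented)."""
    msg = (seg_id.to_bytes(2, "big")
           + group_id.to_bytes(2, "big")
           + child_ifid.to_bytes(2, "big"))
    return hmac.new(key, msg, hashlib.sha256).digest()[:MAC_LEN]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def router_verify(key, group_id, seg_id_a, ifid_a, seg_id_b, ifid_b, mac):
    """Router side: recompute both authenticators and compare their XOR."""
    expected = xor_bytes(
        group_authenticator(key, seg_id_a, group_id, ifid_a),
        group_authenticator(key, seg_id_b, group_id, ifid_b),
    )
    return hmac.compare_digest(expected, mac)


key = b"as-local-secret"
# Endpoint: combine two authenticators with the same GroupID from two beacons.
auth_up = group_authenticator(key, 0x1111, group_id=1, child_ifid=3)
auth_down = group_authenticator(key, 0x2222, group_id=1, child_ifid=7)
cross_mac = xor_bytes(auth_up, auth_down)
print(router_verify(key, 1, 0x1111, 3, 0x2222, 7, cross_mac))  # same group
print(router_verify(key, 2, 0x1111, 3, 0x2222, 7, cross_mac))  # other group
```

Note the two MAC computations per cross hop on the router side.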

There are also a few downsides:

Please let me know what you think.

Should a path segment with only a cross-hop be allowed? This is ok for peering paths, where this occurs for paths that end in one of the peers. For the cross-hops it seems to make little sense. I don't see how this could be abused. If we'd need to prohibit this though, we could use interface types to distinguish cross hops from peering hops in the router.

We could mandate that for up->* crossovers the crossover field always is in the up-segment, and for core->down crossovers it is always in the down-segment. This is consistent with your "vanishing core-segment" example and also fits the peering hop fields.

In that case, I agree that there shouldn't arise the need to have segments that only contain a cross-hop. However, I also agree that this probably cannot be abused.

AS policies for which child-child/child-core interface transits should be allowed enable new topologies.

These are really interesting ideas. It's nice that we get additional flexibility while reducing communication and computation overhead (in the data plane) at the same time. 😁

matzf commented 8 months ago

This is a very elegant approach, @mlegner, nice!

You're right, the interface ranges in the proposal would require that the individual routers are configured with these ranges. These ranges would be "state" that the routers need to keep, in order to understand and enforce the AS's current transit policies. This is an important downside that I had not sufficiently considered. Your alternative proposal nicely fixes this issue, with the disadvantage (as you mention) of requiring special case processing with two MAC computations in the dataplane.

To me, neither approach seems like a strong enough improvement over the status quo. I'd suggest shelving this idea, at least until someone has an epiphany on how to combine the advantages.


A clarification question regarding your suggestion; when you mention how to compute the MAC for the "group authenticator", you don't list the expiration timestamp. Is this intentional? Are you suggesting that we fix the expiration to a fixed value to avoid encoding it? If no: the two separate MACs can be announced with two different ExpTime values. We can't encode both of them, and I don't know how to fix this.

matzf commented 4 months ago

No epiphanies so far, closing this.