spolack opened 1 month ago
RFC 0.1

Metric | >= Speed | Remarks |
---|---|---|
64 - 255 | 5 Gbit/s | For future use |
256 - 2047 | 1 Gbit/s | 60 GHz |
2048 - 4095 | 100 Mbit/s | 5 GHz |
4096 - 8191 | 50 Mbit/s | 5 GHz PtMP |
8192 - 16384 | 10 Mbit/s | 2 GHz / Default |
Metric: the first value of each range should be tried first. In case of suboptimal routing, the rest of the range can be used as wiggle room for individual traffic engineering.
By applying this scheme we would prefer faster link technologies: 60 GHz over 5 GHz over 2 GHz, and 5 GHz PtP over 5 GHz PtMP.
An open question is where to put VPN; I'd prefer somewhere between 4096 and 8191. It should also be taken into account that the max metric across the whole path is a uint16 (65535).
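To illustrate the uint16 constraint, here is a quick back-of-the-envelope sketch (just illustrative arithmetic, not existing tooling) of how many hops fit into the 65535 path budget at the lowest metric of each tier:

```python
# Hops that fit into the uint16 path-metric budget (65535)
# at the lowest metric of each tier from the table above.
for metric in (64, 256, 2048, 4096, 8192):
    print(f"metric {metric:>4}: {65535 // metric} hops before overflow")
# metric 64: 1023 hops, 256: 255, 2048: 31, 4096: 15, 8192: 7
```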
I think we should be a bit more granular, especially in the range between 1 Gbit/s and 100 Mbit/s, so we can set values that match the speed of the connections more closely, e.g. by measuring with iperf3 or looking at the capacities reported by UISP. In my opinion this would help in areas where there are multiple 5 GHz paths with a lot of variance in bandwidth. I know this is what was meant by "wiggle room", but I would prefer a more detailed table, as it leaves less room for interpretation. In addition, I think it is a good idea to have a 2.5 Gbit/s tier, as we already have that speed with the airFiber 60 Xtreme-Range (ak36<->teufelsberg), even though the cable connection/core router is currently the limit there.
My proposal would be as follows:
Metric | Max Speed | Min Speed | Remarks |
---|---|---|---|
64 - 127 | 10 Gbit/s | 5 Gbit/s | Fiber |
128 - 255 | 5 Gbit/s | 2.5 Gbit/s | Fiber |
256 - 511 | 2.5 Gbit/s | 1 Gbit/s | 60 GHz wireless/ethernet |
512 - 1023 | 1 Gbit/s | 500 Mbit/s | 60 GHz wireless/ethernet |
1024 - 2047 | 500 Mbit/s | 250 Mbit/s | 5 GHz wireless |
2048 - 3071 | 250 Mbit/s | 100 Mbit/s | 5 GHz wireless |
3072 - 4095 | 100 Mbit/s | 50 Mbit/s | 5 GHz wireless |
4096 - 6143 | 50 Mbit/s | 25 Mbit/s | 5 GHz / 2 GHz wireless |
6144 - 8191 | 25 Mbit/s | 10 Mbit/s | 5 GHz / 2 GHz wireless |
8192 - 12287 | 10 Mbit/s | 5 Mbit/s | 5 GHz / 2 GHz wireless |
12288 - 16383 | 5 Mbit/s | 1 Mbit/s | 5 GHz / 2 GHz wireless |
16384 - 32767 | 1 Mbit/s | 0 Mbit/s | Fallback, management link |
A connection like emma<->rhnk, which has a capacity of 273 Mbit/s, would fall into the metric range 1024 - 2047, with a default metric of 1024, or with an adjusted metric set to something closer to 2047.
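As a sketch of how such an adjusted metric could be derived, one option is to interpolate linearly within the row; the helper name and rounding here are just an assumption for illustration:

```python
def interpolate_metric(bw, max_speed, min_speed, metric_lo, metric_hi):
    """Linearly map a bandwidth (Mbit/s) onto a metric row:
    max_speed maps to metric_lo, min_speed maps to metric_hi."""
    frac = (max_speed - bw) / (max_speed - min_speed)
    return round(metric_lo + frac * (metric_hi - metric_lo))

# emma<->rhnk: 273 Mbit/s in the 500-250 Mbit/s row (metrics 1024-2047)
print(interpolate_metric(273, 500, 250, 1024, 2047))  # -> 1953
```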
My metrics might need further adjustments, but I think you get the idea of the more complete table.
Regarding VPN / backup uplinks, I suggest a variant where we take the connection bandwidth into account. My proposal is to take the downlink bandwidth of the connection to determine the row in the table above and then go two or maybe even three rows down. With two rows, a 250 Mbit/s private uplink would get 4096, a 100 Mbit/s private uplink would get 6144, and so on.
I'm also thinking about whether it would be nice to find a way to set metrics per direction, as 5 GHz throughput usually depends strongly on the direction, and about how to model the data for PtMP APs per individual station. It would also be beneficial to find a way to artificially double the cost for traffic entering and leaving a location on the same interface (for instance wilgu10-sama -> sama-sued-60ghz -> w38b). In that case the cost of the routes from w38b should be added twice at sama-sued-60ghz, because the available bandwidth is only half.
And a third thought: we could have a helper function, either written as a Jinja2 macro or attached to our playbooks as a Python lookup plugin, where we pass the estimated bandwidth instead of manually calculating the cost. Example:
mesh_metric: "{{ METRIC_FROM_BW(2000) }}"
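As a sketch of what that helper could look like as an Ansible filter plugin (everything here is illustrative: the metric_from_bw name, the table values taken from my proposal above, and the rows_down parameter for the VPN/backup-uplink shift are assumptions, not existing bbb-configs code):

```python
# filter_plugins/metric_from_bw.py -- illustrative sketch, not part of bbb-configs

# (metric_lo, metric_hi, max_speed, min_speed) per row, speeds in Mbit/s,
# taken from the proposed table above.
ROWS = [
    (64, 127, 10000, 5000),
    (128, 255, 5000, 2500),
    (256, 511, 2500, 1000),
    (512, 1023, 1000, 500),
    (1024, 2047, 500, 250),
    (2048, 3071, 250, 100),
    (3072, 4095, 100, 50),
    (4096, 6143, 50, 25),
    (6144, 8191, 25, 10),
    (8192, 12287, 10, 5),
    (12288, 16383, 5, 1),
    (16384, 32767, 1, 0),
]

def metric_from_bw(bw_mbit, rows_down=0):
    """Map an estimated bandwidth (Mbit/s) to a mesh metric.

    rows_down shifts the result towards worse rows, e.g. for
    VPN/backup uplinks as proposed above."""
    if bw_mbit > ROWS[0][2]:  # faster than the top row: best metric
        return ROWS[0][0]
    for i, (lo, hi, mx, mn) in enumerate(ROWS):
        if mn < bw_mbit <= mx:
            return ROWS[min(i + rows_down, len(ROWS) - 1)][0]
    return ROWS[-1][0]  # fallback / management link

class FilterModule(object):
    def filters(self):
        return {"metric_from_bw": metric_from_bw}
```

With something like that in place, `mesh_metric: "{{ 2000 | metric_from_bw }}"` would yield 256, and a 250 Mbit/s private uplink could be set via `{{ 250 | metric_from_bw(rows_down=2) }}`, yielding 4096 as in the two-rows-down example above.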
I think having a function that translates bandwidth into a mesh_metric is a really good idea. Since most of the traffic is RX we can base everything on measured or estimated RX bandwidth between the node and the neighbor and also apply this to tunspace uplinks.
I can follow your argument about traffic entering and leaving the same location on the same interface and would love to see a solution, even though I think we are also fine without one. In our current network topology (https://github.com/freifunk-berlin/bbb-configs/pull/1010 contains a map which shows gateway selection) this kind of traffic is either totally valid (traffic within our network) or a result of rerouting during outages, which we can't really avoid.
We want to start routing IPv4 via Babel soonish. Let's make sure that the metrics are in the desired state to have a deterministic routing experience :)