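GCP's default L3 network setup hands out a /32 subnet mask (`ffffffff`) over DHCP and relies on a classless static route (DHCP option 121) to reach the gateway. Since the gateway is not on-link without that route, adding the default route fails with "network is unreachable":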
2021-12-04 02:42:36.903426 I | got new lease: &{10.156.0.18 2021-12-05 02:42:36.902724599 +0000 UTC m=+86411.030064302 Subnet Mask: ffffffff
Router: 10.156.0.1
Domain Name Server: 169.254.169.254
Host Name: metropolis-1638585607.c.monogon-dev.internal
Domain Name: c.monogon-dev.internal
Interface MTU: [5 180]
NTP Servers: 169.254.169.254
IP Addresses Lease Time: 24h0m0s
DHCP Message Type: ACK
Server Identifier: 169.254.169.254
DNS Domain Search List: [c.monogon-dev.internal google.internal]
Classless Static Route: route to 10.156.0.1/32 via 0.0.0.0; route to 0.0.0.0/0 via 10.156.0.1
TFTP Server Address: [10 156 0 1]
} (options: Subnet Mask: ffffffff
Router: 10.156.0.1
Domain Name Server: 169.254.169.254
Host Name: metropolis-1638585607.c.monogon-dev.internal
Domain Name: c.monogon-dev.internal
Interface MTU: [5 180]
NTP Servers: 169.254.169.254
IP Addresses Lease Time: 24h0m0s
DHCP Message Type: ACK
Server Identifier: 169.254.169.254
DNS Domain Search List: [c.monogon-dev.internal google.internal]
Classless Static Route: route to 10.156.0.1/32 via 0.0.0.0; route to 0.0.0.0/0 via 10.156.0.1
TFTP Server Address: [10 156 0 1]
)
supervisor E1204 02:42:36.903905 supervisor_processor.go:225] Runnable root.network.interfaces.dhcp died: returned error when NODE_STATE_NEW: lease callback failed: failed to add default route via 10.156.0.1: network is unreachable
supervisor I1204 02:42:36.904079 supervisor_processor.go:431] rescheduling supervised node root.network.interfaces.dhcp with backoff 5.886853573s
supervisor E1204 02:42:37.107049 supervisor_processor.go:225] Runnable root.enrolment died: returned error when NODE_STATE_NEW: cannot restart cluster manager
supervisor I1204 02:42:37.108390 supervisor_processor.go:431] rescheduling supervised node root.enrolment with backoff 6.465232928s
(Diff that adds the lease logging shown above:)
```diff
diff --git a/metropolis/node/core/network/dhcp4c/callback/callback.go b/metropolis/node/core/network/dhcp4c/callback/callback.go
index 52276ab..a0cebd8 100644
--- a/metropolis/node/core/network/dhcp4c/callback/callback.go
+++ b/metropolis/node/core/network/dhcp4c/callback/callback.go
@@ -29,6 +29,7 @@ package callback
 import (
 	"fmt"
+	"log"
 	"math"
 	"net"
 	"os"
@@ -71,6 +72,8 @@ func isIPNetEqual(a, b *net.IPNet) bool {
 // clients on a single interface.
 func ManageIP(iface netlink.Link) dhcp4c.LeaseCallback {
 	return func(old, new *dhcp4c.Lease) error {
+		log.Printf("got new lease: %v (options: %v)", new, new.Options.Summary(nil))
+
 		newNet := new.IPNet()
 		addrs, err := netlink.AddrList(iface, netlink.FAMILY_V4)
```
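
To illustrate why the default route is rejected for the first lease, here is a minimal standalone sketch (standard library only, not the actual `dhcp4c` code) that checks whether the router falls inside the subnet formed by the leased address and the Subnet Mask option. The function name `gatewayOnLink` and the hex-string mask input are assumptions made for this example; the addresses and masks are the ones from the leases in this report.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
)

// gatewayOnLink reports whether gw lies inside the subnet formed by the
// leased address and the hex-encoded Subnet Mask option, as printed in the
// lease log (for example "ffffffff"). Hypothetical helper for illustration.
func gatewayOnLink(addr, hexMask, gw string) (bool, error) {
	ip := net.ParseIP(addr).To4()
	gwIP := net.ParseIP(gw).To4()
	if ip == nil || gwIP == nil {
		return false, fmt.Errorf("invalid IPv4 address: %q / %q", addr, gw)
	}
	raw, err := hex.DecodeString(hexMask)
	if err != nil || len(raw) != net.IPv4len {
		return false, fmt.Errorf("invalid subnet mask %q", hexMask)
	}
	mask := net.IPMask(raw)
	subnet := net.IPNet{IP: ip.Mask(mask), Mask: mask}
	return subnet.Contains(gwIP), nil
}

func main() {
	// Address, mask and router as seen in the two leases of this report.
	leases := []struct{ addr, mask, gw string }{
		{"10.156.0.18", "ffffffff", "10.156.0.1"}, // default GCP setup
		{"10.156.0.19", "fffff000", "10.156.0.1"}, // with MULTI_IP_SUBNET
	}
	for _, l := range leases {
		ok, err := gatewayOnLink(l.addr, l.mask, l.gw)
		fmt.Printf("%s mask=%s gateway on-link: %v (err: %v)\n", l.addr, l.mask, ok, err)
	}
}
```

With the `ffffffff` mask the gateway is outside the (single-address) subnet, so the kernel has no on-link path to it unless the `10.156.0.1/32` route is installed first. With `fffff000` the gateway is inside the subnet and the default route can be added directly.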
Hetzner and other cloud providers are known for similar shenanigans. In the case of GCP, setting the barely documented `--guest-os-features=MULTI_IP_SUBNET` flag on the image seems to solve it. The lease then carries a /20 mask (`fffff000`) that contains the gateway:
2021-12-04 02:47:16.113171 I | got new lease: &{10.156.0.19 2021-12-05 02:47:16.112203034 +0000 UTC m=+86400.225447665 Subnet Mask: fffff000
Router: 10.156.0.1
Domain Name Server: 169.254.169.254
Host Name: metropolis-1638585907.c.monogon-dev.internal
Domain Name: c.monogon-dev.internal
Interface MTU: [5 180]
NTP Servers: 169.254.169.254
IP Addresses Lease Time: 24h0m0s
DHCP Message Type: ACK
Server Identifier: 169.254.169.254
DNS Domain Search List: [c.monogon-dev.internal google.internal]
Classless Static Route: route to 10.156.0.1/32 via 0.0.0.0; route to 0.0.0.0/0 via 10.156.0.1
TFTP Server Address: [10 156 0 1]
} (options: Subnet Mask: fffff000
Router: 10.156.0.1
Domain Name Server: 169.254.169.254
Host Name: metropolis-1638585907.c.monogon-dev.internal
Domain Name: c.monogon-dev.internal
Interface MTU: [5 180]
NTP Servers: 169.254.169.254
IP Addresses Lease Time: 24h0m0s
DHCP Message Type: ACK
Server Identifier: 169.254.169.254
DNS Domain Search List: [c.monogon-dev.internal google.internal]
Classless Static Route: route to 10.156.0.1/32 via 0.0.0.0; route to 0.0.0.0/0 via 10.156.0.1
TFTP Server Address: [10 156 0 1]
)
However, it's unclear whether ignoring the static route is always safe, even with `MULTI_IP_SUBNET`: there may be no guarantee that the gateway is in the same subnet as the leased address. It's also unclear whether the flag has any unexpected side effects.
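
One possible alternative to relying on the flag would be to honor the Classless Static Route option instead of ignoring it. The sketch below is only an illustration under stated assumptions: `planRoutes` and its string-based input are hypothetical and not part of `dhcp4c`. It orders the entries so that on-link routes (router `0.0.0.0`, which RFC 3442 defines as "destination is on the local link") come before routes that go through a gateway, so the host route to `10.156.0.1` exists by the time the default route is added.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// staticRoute is a simplified, hypothetical view of one Classless Static
// Route entry. It is not the type used by the dhcp4c package.
type staticRoute struct {
	dst *net.IPNet
	via net.IP
}

// onLink is true when the router is 0.0.0.0, i.e. the destination is
// reachable directly on the interface.
func (r staticRoute) onLink() bool { return r.via.IsUnspecified() }

// planRoutes parses {destination CIDR, router} pairs and returns the
// install steps, with on-link routes ordered first.
func planRoutes(entries [][2]string) ([]string, error) {
	routes := make([]staticRoute, 0, len(entries))
	for _, e := range entries {
		_, dst, err := net.ParseCIDR(e[0])
		if err != nil {
			return nil, fmt.Errorf("bad destination %q: %w", e[0], err)
		}
		via := net.ParseIP(e[1])
		if via == nil {
			return nil, fmt.Errorf("bad router %q", e[1])
		}
		routes = append(routes, staticRoute{dst: dst, via: via})
	}
	// Stable sort: on-link routes first, original order otherwise preserved.
	sort.SliceStable(routes, func(i, j int) bool {
		return routes[i].onLink() && !routes[j].onLink()
	})
	steps := make([]string, 0, len(routes))
	for _, r := range routes {
		if r.onLink() {
			steps = append(steps, fmt.Sprintf("%s on-link", r.dst))
		} else {
			steps = append(steps, fmt.Sprintf("%s via %s", r.dst, r.via))
		}
	}
	return steps, nil
}

func main() {
	// The two routes from the lease, deliberately given in the "wrong" order.
	steps, err := planRoutes([][2]string{
		{"0.0.0.0/0", "10.156.0.1"},
		{"10.156.0.1/32", "0.0.0.0"},
	})
	if err != nil {
		panic(err)
	}
	for _, s := range steps {
		fmt.Println(s)
	}
}
```

In a real implementation the on-link entries would presumably be installed as link-scoped routes on the DHCP interface. Note also that RFC 3442 requires a client to ignore the Router option when a Classless Static Route option is present, so the default route would come from the option itself.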