usvi / networking


Make scripts tolerant to failed DHCP renews #1

Open usvi opened 3 years ago

usvi commented 3 years ago

This morning I lost connectivity on my shell. Contrary to my usual panic routine of restarting everything, I waited. All other virtual machines and virtual connections were fine; only the one machine with a dedicated connection had failed.

Finally, the logs showed this:

Oct 20 07:01:31 gw dhcpcd[1447]: virtual1: failed to renew DHCP, rebinding
...
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: DHCP lease expired
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: deleting host route to 81.197.101.94 via 127.0.0.1
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: deleting route to 81.197.64.0/18
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: deleting default route via 81.197.64.1
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: soliciting an IPv6 router
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: soliciting a DHCP lease
Oct 20 07:46:31 gw dhcpcd[1447]: virtual1: offered 81.197.77.242 from 193.229.4.196
Oct 20 07:46:33 gw dnsmasq[32578]: reading /var/run/dnsmasq/resolv.conf
Oct 20 07:46:33 gw dnsmasq[32578]: using nameserver 193.229.0.40#53
Oct 20 07:46:33 gw dnsmasq[32578]: using nameserver 193.229.0.42#53
Oct 20 07:46:36 gw dhcpcd[1447]: virtual1: leased 81.197.77.242 for 3600 seconds
Oct 20 07:46:36 gw dhcpcd[1447]: virtual1: adding host route to 81.197.77.242 via 127.0.0.1
Oct 20 07:46:36 gw dhcpcd[1447]: virtual1: adding route to 81.197.64.0/19
Oct 20 07:46:36 gw dhcpcd[1447]: virtual1: adding default route via 81.197.64.1
Oct 20 07:46:36 gw logger: virtual1 (BOUND): IP: -> 81.197.77.242; GW: -> 81.197.64.1
Oct 20 07:46:38 gw dnsmasq[32578]: reading /var/run/dnsmasq/resolv.conf
Oct 20 07:46:38 gw dnsmasq[32578]: using nameserver 193.229.0.40#53
Oct 20 07:46:38 gw dnsmasq[32578]: using nameserver 193.229.0.42#53

So the renew had failed, and the scripts did not take this into account. I discussed it with the ISP, and they say this almost never happens. Still, I need to fix my scripts to handle it.

This line is especially revealing:

Oct 20 07:46:36 gw logger: virtual1 (BOUND): IP: -> 81.197.77.242; GW: -> 81.197.64.1

So the previously stored IP and gateway addresses were empty strings. No wonder things broke.
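To make the empty-string case easier to spot, here is a minimal sketch in Python (used only for illustration; the actual scripts are not shown in this issue). It assumes the logger line always has the shape `<iface> (<REASON>): IP: <old> -> <new>; GW: <old> -> <new>`, which is inferred from the single line above. The function name `missing_old_values` is hypothetical.

```python
import re

# Assumed line format, inferred from the one logger line quoted in this issue:
#   "<iface> (<REASON>): IP: <old> -> <new>; GW: <old> -> <new>"
TRANSITION_RE = re.compile(
    r"(?P<iface>\S+) \((?P<reason>\w+)\): "
    r"IP: (?P<old_ip>[0-9.]*) ?-> (?P<new_ip>[0-9.]*); "
    r"GW: (?P<old_gw>[0-9.]*) ?-> (?P<new_gw>[0-9.]*)"
)

SAMPLE = (
    "Oct 20 07:46:36 gw logger: virtual1 (BOUND): "
    "IP: -> 81.197.77.242; GW: -> 81.197.64.1"
)


def missing_old_values(line):
    """Return the names of fields whose previous value was empty.

    Raises ValueError if the line does not match the assumed format.
    """
    match = TRANSITION_RE.search(line)
    if match is None:
        raise ValueError("line does not match the assumed logger format")
    missing = []
    if not match.group("old_ip"):
        missing.append("IP")
    if not match.group("old_gw"):
        missing.append("GW")
    return missing


if __name__ == "__main__":
    print(missing_old_values(SAMPLE))
```

A script that stores the old address could run a check like this before using the value, and skip any teardown step that needs a non-empty old IP or gateway.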

usvi commented 3 years ago

Hmm, the network has also changed here: the route went from 81.197.64.0/18 to 81.197.64.0/19. Could that have something to do with the breakage? So it is probably not about empty strings.
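The prefix change can be checked with a small sketch using Python's standard `ipaddress` module. All addresses are taken from the log above. Treating 81.197.101.94 as the old leased address is an assumption based on the deleted host route.

```python
import ipaddress

# Values copied from the dhcpcd log lines in this issue.
old_net = ipaddress.ip_network("81.197.64.0/18")
new_net = ipaddress.ip_network("81.197.64.0/19")
old_ip = ipaddress.ip_address("81.197.101.94")   # assumed old lease (deleted host route)
new_ip = ipaddress.ip_address("81.197.77.242")   # newly leased address
gateway = ipaddress.ip_address("81.197.64.1")    # default route before and after


def report(label, addr):
    """Print whether an address falls inside the old and the new network."""
    print(f"{label} {addr}: in old {addr in old_net}, in new {addr in new_net}")


if __name__ == "__main__":
    # The /19 is the lower half of the /18, so some old addresses fall outside it.
    print("new network is a subnet of old:", new_net.subnet_of(old_net))
    report("old IP", old_ip)
    report("new IP", new_ip)
    report("gateway", gateway)
```

The new /19 is a strict subset of the old /18, and the old address 81.197.101.94 lies outside it. Any script state that still refers to the old address or the old prefix would no longer match the new network.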

usvi commented 3 years ago

The lease finally expired at 07:46:31, a new one was actually obtained 5 seconds later, and then the network broke, if my memory serves me right.

usvi commented 3 years ago

Worth looking at https://discourse.pi-hole.net/t/how-to-configure-subnet-mask-in-dhcp-options/7930 for tests