openinfrastructure / terraform-google-multinic

Connect two VPC networks with an auto-healing, auto-scaling group of IP router instances.
Apache License 2.0

[DRAFT] Improve policy based routing robustness #13

Closed jeffmccune closed 4 years ago

jeffmccune commented 4 years ago

This patch is intended to keep policy based routing working across the following commands on the centos-cloud/centos-8 image:

systemctl restart NetworkManager
systemctl restart google-guest-agent
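
One way to check whether either restart flushes the policy table is to snapshot the table before and after and compare. A rough sketch, assuming the policy routes live in a table named `nic1` (a hypothetical name; substitute whatever table the startup script configures). The helper names `snapshot_table` and `lost_routes` are made up for illustration:

```shell
#!/bin/bash
# Sketch: detect whether a service restart removed routes from a policy table.
# ASSUMPTION: the table name "nic1" used in the usage notes is hypothetical.

# Print the IPv4 routes in the given table, sorted so snapshots compare stably.
snapshot_table() {
  ip -4 route show table "$1" 2>/dev/null | sort
}

# Compare two sorted snapshot files. Prints every route that is present in the
# first file but missing from the second, and returns 1 if any were lost.
lost_routes() {
  local before="$1" after="$2" lost
  lost="$(comm -23 "$before" "$after")"
  if [[ -n "$lost" ]]; then
    printf '%s\n' "$lost"
    return 1
  fi
  return 0
}

# Usage (as root on the instance):
#   snapshot_table nic1 > /tmp/before
#   systemctl restart google-guest-agent
#   snapshot_table nic1 > /tmp/after
#   lost_routes /tmp/before /tmp/after || echo "policy routes were flushed"
```

Nothing runs at top level; the functions only read routing state, so the check is safe to repeat after each restart.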

It's not working yet.

See: #10

jeffmccune commented 4 years ago

Next steps: Dig through the guest-agent source https://github.com/GoogleCloudPlatform/guest-agent

jeffmccune commented 4 years ago

Next step: have the startup script write /etc/dhcp/dhclient-down-hooks so that remove_old_addr, which deletes routes from the policy tables, is never called.

#! /bin/bash
#
# Avoid flushing policy based routing tables by preventing `ip addr del` from
# being called when dhclient is terminated.  ip addr del has the side effect of
# removing all routes in all tables associated with the address being deleted.
#
# This script avoids the call to remove_old_addr() by calling exit_with_hooks
# beforehand, which aborts the teardown behavior in the
# EXPIRE|FAIL|RELEASE|STOP) block.
#
# See: https://github.com/GoogleCloudPlatform/guest-agent/issues/76
# See: https://github.com/openinfrastructure/terraform-google-multinic/issues/10

logmessage "Skipping ip -4 addr del ${old_ip_address:-}/${old_prefix:-} dev ${interface:-} to work around https://github.com/GoogleCloudPlatform/guest-agent/issues/76"
exit_with_hooks 0
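
The startup script side of this could look roughly like the following. This is a sketch only: the function name `install_down_hook` and its optional directory argument are hypothetical and exist so the install step can be tried against a scratch directory. The default directory is the /etc/dhcp path named above.

```shell
#!/bin/bash
# Sketch: write the down hook from the instance startup script.
# ASSUMPTION: install_down_hook and its directory argument are hypothetical.

install_down_hook() {
  local dir="${1:-/etc/dhcp}"
  local hook="${dir}/dhclient-down-hooks"
  install -d -m 0755 "$dir"
  # The heredoc delimiter is quoted so ${old_ip_address:-} and friends are
  # written literally and expanded later by dhclient-script, not here.
  cat > "$hook" <<'EOF'
#! /bin/bash
# Skip remove_old_addr() so policy routing tables are not flushed.
# See: https://github.com/GoogleCloudPlatform/guest-agent/issues/76
logmessage "Skipping ip -4 addr del ${old_ip_address:-}/${old_prefix:-} dev ${interface:-} to work around https://github.com/GoogleCloudPlatform/guest-agent/issues/76"
exit_with_hooks 0
EOF
  # ASSUMPTION: marked executable in case dhclient-script tests the hook
  # with -x before sourcing it.
  chmod 0755 "$hook"
}

# Usage in the startup script (as root):
#   install_down_hook
```

Writing the file on every boot keeps the hook in place even if the image or a package update replaces /etc/dhcp.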

jeffmccune commented 4 years ago

Why does the guest agent call dhclient -x instead of sending a signal?

If the client is killed by a signal (for example at shutdown or reboot), it will not execute the dhclient-script(8) at exit. However, if you shut the client down gracefully with -r or -x it will execute dhclient-script(8) at shutdown with the specific reason for calling the script set in the environment table.