Closed ubuntu-server-builder closed 1 year ago
Launchpad user Ryan Harper(raharper) wrote on 2020-04-08T21:07:29.076611+00:00
Thanks for reporting the bug.
Would you be able to run 'cloud-init collect-logs' and attach the tarball? If not, providing /var/log/cloud-init.log would be useful in debugging the issue.
Thanks!
Launchpad user Ilwoo Park(eru1729) wrote on 2020-04-09T09:18:11.335971+00:00
Thanks for the quick response.
I've attached the cloud-init.log from the affected server. Launchpad attachments: cloud-init.log
Launchpad user Chad Smith(chad.smith) wrote on 2020-04-14T20:34:10.159086+00:00
From the attached logs, it looks to me like either cloud-init is not parsing the returned DHCP lease properly during EphemeralDHCP setup, or the DHCP server on this network is sending out bogus values. It's strange to see cloud-init claim in your logs that it's setting up a subnet with an inaccessible broadcast address.
" Attempting setup of ephemeral network on eth0 with 10.54.62.43/32 brd 10.54.62.127"
From your bug I'm confused why the lease is saying the subnet for the dhcp addr is 255.255.255.255 and the router is at 10.54.62.1. Doesn't that CIDR 10.54.62.43/255.255.255.255 mean that the address has a subnet that is only 1 IP address wide, so it has no visibility to the router?
fixed-address 10.54.62.43; option subnet-mask 255.255.255.255; option routers 10.54.62.1;
Launchpad user Chad Smith(chad.smith) wrote on 2020-04-14T20:35:13.369459+00:00
From the attached logs, it looks to me like cloud-init is setting up an invalid network.
" Attempting setup of ephemeral network on eth0 with 10.54.62.43/32 brd 10.54.62.127"
From your bug I'm confused why the lease is saying the subnet for the dhcp addr is 255.255.255.255 and the router is at 10.54.62.1. Doesn't the CIDR 10.54.62.43/255.255.255.255 mean that the address has a subnet that is only 1 IP address wide, so it has no visibility to the router?
fixed-address 10.54.62.43; option subnet-mask 255.255.255.255; option routers 10.54.62.1;
Launchpad user Chad Smith(chad.smith) wrote on 2020-04-14T20:58:17.541392+00:00
Given that the router lives at 10.54.62.1, it seems likely that the most specific netmask this IP can have is /26 (255.255.255.192) if the router IP is still to be visible on 10.54.62.43's own subnet.
I may be misunderstanding something here though.
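The subnet arithmetic above can be checked with a short sketch using Python's standard `ipaddress` module. The helper names `gateway_on_link` and `longest_common_prefix` are made up for illustration and are not part of cloud-init.

```python
import ipaddress


def gateway_on_link(address: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway falls inside the interface's own subnet."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


def longest_common_prefix(a: str, b: str) -> int:
    """Length in bits of the longest prefix shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


# With the /32 mask from the lease, the router is outside the subnet.
print(gateway_on_link("10.54.62.43", "255.255.255.255", "10.54.62.1"))  # False
# With a /26 mask, the router becomes directly reachable.
print(gateway_on_link("10.54.62.43", "255.255.255.192", "10.54.62.1"))  # True
# The two addresses share their first 26 bits.
print(longest_common_prefix("10.54.62.43", "10.54.62.1"))  # 26
```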
Launchpad user Ilwoo Park(eru1729) wrote on 2020-04-15T06:55:36.684100+00:00
Hi,
You're right about the subnet part. If the router IP is inside the instance's own subnet, the current implementation of cloud-init works fine.
Seems like I should clarify how we set up connectivity for each instance.
We configure each instance to send any packet not addressed to itself to the router IP. The hypervisor captures packets sent to the router IP, then forwards them using a routing protocol.
To achieve this, we assign a /32 prefix to each instance and set up a local scope route to the router IP, which cannot fall inside the instance's own network range.
The current implementation of cloud-init does not set up a local scope routing entry to the router, and this breaks our instances' network configuration.
Let me give an example with instance IP ("10.254.0.2/32") and router IP ("10.254.0.1"), comparing the desired result with the current cloud-init behavior.
[What we expect]
"_bringup_device()"
"_bringup_static_route"
[Cloud-init behavior]
"_bringup_device()"
"_bringup_static_route"
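As a rough illustration of the idea, here is a minimal sketch that builds the kind of `ip` commands such a setup typically needs. It assumes that "local scope route" corresponds to iproute2's `scope link`, and the function name `ephemeral_route_commands` is hypothetical. It is not cloud-init's code, nor the exact commands from our environment.

```python
import ipaddress


def ephemeral_route_commands(interface: str, address_cidr: str, gateway: str):
    """Build (but do not run) the route commands for an interface and gateway.

    Assumption: when the gateway is outside the interface's subnet, a
    link-scope host route to it must exist before any route can go 'via' it.
    """
    iface = ipaddress.ip_interface(address_cidr)
    gw = ipaddress.ip_address(gateway)
    commands = []
    if gw not in iface.network:
        # The gateway is off-subnet (e.g. a /32 address): declare it on-link.
        commands.append(
            ["ip", "-4", "route", "add", f"{gw}/32",
             "dev", interface, "scope", "link"]
        )
    commands.append(
        ["ip", "-4", "route", "add", "default", "via", str(gw),
         "dev", interface]
    )
    return commands


for cmd in ephemeral_route_commands("eth0", "10.254.0.2/32", "10.254.0.1"):
    print(" ".join(cmd))
```

With a conventional mask such as 10.254.0.2/24 the gateway is already on the subnet, so only the default route is produced.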
I hope this comment clarifies our findings.
If you need more information, please let me know.
Regards, Ilwoo
Launchpad user Launchpad Janitor(janitor) wrote on 2020-06-15T04:17:15.129760+00:00
[Expired for cloud-init because there has been no activity for 60 days.]
This bug was originally filed in Launchpad as LP: #1871323
Launchpad details
Launchpad user Ilwoo Park(eru1729) wrote on 2020-04-07T08:20:20.669684+00:00
Cloud Provider: OpenStack (Stein)
Distro: Ubuntu 16.04
Cloud-init version: 19.4-33-gbb4131a2-0ubuntu1~16.04.1
Problem:
Since cloud-init introduced support for classless static routes, it fails to add a route to the gateway in our environment.
Looking through the code, I believe the following code should be patched as follows.
https://github.com/canonical/cloud-init/blob/master/cloudinit/net/__init__.py#L1113
Can someone verify the issue and comment on the suggested fix?
Here's a sample cloud-init log with DEBUG logging enabled.
...
2020-04-07 02:51:55,949 - util.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'add', '10.54.62.43/32', 'broadcast', '10.54.62.127', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2020-04-07 02:51:55,951 - util.py[DEBUG]: Running command ['ip', '-family', 'inet', 'link', 'set', 'dev', 'eth0', 'up'] with allowed return codes [0] (shell=False, capture=True)
2020-04-07 02:51:55,954 - util.py[DEBUG]: Running command ['ip', '-4', 'route', 'add', '10.54.62.1/32', 'via', '0.0.0.0', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2020-04-07 02:51:55,956 - util.py[DEBUG]: Running command ['ip', '-4', 'route', 'add', '169.254.169.254/32', 'via', '10.54.62.1', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2020-04-07 02:51:55,959 - handlers.py[DEBUG]: finish: init-local/search-OpenStackLocal: FAIL: no local data found from DataSourceOpenStackLocal
2020-04-07 02:51:55,959 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceOpenStack.DataSourceOpenStackLocal'> failed
2020-04-07 02:51:55,959 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceOpenStack.DataSourceOpenStackLocal'> failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 760, in find_source
    if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 649, in update_metadata
    result = self.get_data()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 273, in get_data
    return_value = self._get_data()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOpenStack.py", line 130, in _get_data
    with EphemeralDHCPv4(self.fallback_interface):
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
    return self.obtain_lease()
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 109, in obtain_lease
    ephipv4.__enter__()
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 986, in __enter__
    self._bringup_static_routes()
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1040, in _bringup_static_routes
    ['dev', self.interface], capture=True)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2102, in subp
    cmd=args)
cloudinit.util.ProcessExecutionError: Unexpected error while running command.
Command: ['ip', '-4', 'route', 'add', '169.254.169.254/32', 'via', '10.54.62.1', 'dev', 'eth0']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: Network is unreachable
...
Sample lease file and interface address setup are as follows.
cat /var/lib/dhcp/eth0.lease
lease {
  interface "eth0";
  fixed-address 10.54.62.43;
  option subnet-mask 255.255.255.255;
  option routers 10.54.62.1;
  option dhcp-lease-time 4294967295;
  option dhcp-message-type 5;
  option domain-name-servers 10.20.30.40;
  option dhcp-server-identifier 10.54.62.1;
  option interface-mtu 1500;
  option rfc3442-classless-static-routes 32,10,54,62,1,0,0,0,0,32,169,254,169,254,10,54,62,1,0,10,54,62,1;
  option broadcast-address 10.54.62.127;
  option host-name "host-10-54-62-43";
  option domain-name "local";
  renew 0 2088/04/25 06:42:22;
  rebind 0 2139/05/10 15:07:51;
  expire 5 2156/05/14 09:56:29;
}
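To make the `rfc3442-classless-static-routes` option in the lease easier to read, here is a minimal decoder sketch that follows the RFC 3442 encoding (prefix width, then only the significant destination octets, then four router octets). The function name `parse_rfc3442` is made up for illustration and this is not cloud-init's own parser.

```python
def parse_rfc3442(option: str):
    """Decode a comma-separated RFC 3442 option into (destination, router) pairs."""
    octets = [int(tok) for tok in option.split(",")]
    routes = []
    i = 0
    while i < len(octets):
        width = octets[i]  # prefix length in bits
        i += 1
        if not 0 <= width <= 32:
            raise ValueError(f"invalid prefix width: {width}")
        significant = (width + 7) // 8  # destination octets actually encoded
        if i + significant + 4 > len(octets):
            raise ValueError("truncated classless static route option")
        # Pad the destination with zeros up to four octets.
        dest = octets[i:i + significant] + [0] * (4 - significant)
        i += significant
        router = octets[i:i + 4]
        i += 4
        routes.append((
            ".".join(map(str, dest)) + f"/{width}",
            ".".join(map(str, router)),
        ))
    return routes


lease_option = "32,10,54,62,1,0,0,0,0,32,169,254,169,254,10,54,62,1,0,10,54,62,1"
for destination, router in parse_rfc3442(lease_option):
    print(destination, "via", router)
# 10.54.62.1/32 via 0.0.0.0
# 169.254.169.254/32 via 10.54.62.1
# 0.0.0.0/0 via 10.54.62.1
```

The first entry, with router 0.0.0.0, is the host route to the gateway itself; the other two routes go via that gateway, so they can only be added once the gateway is reachable on the link.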
ifconfig eth0
eth0      Link encap:Ethernet  HWaddr ab:cd:ef:a1:50:a8
          inet addr:10.54.62.43  Bcast:10.54.62.127  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:12748 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12123 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:50000
          RX bytes:1757625 (1.7 MB)  TX bytes:1262391 (1.2 MB)
route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.54.62.1      0.0.0.0         UG    0      0        0 eth0
10.54.62.1      0.0.0.0         255.255.255.255 UH    0      0        0 eth0
169.254.169.254 10.54.62.1      255.255.255.255 UGH   0      0        0 eth0