canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

Can't find proper metadata source IP - Interoperability problem with CentOS8/Stream, NetworkManager and Apache CloudStack #3839

Open ubuntu-server-builder opened 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1915216

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2021-02-09T23:35:27.906402+00:00
date_fix_committed = None
date_fix_released = None
id = 1915216
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1915216
milestone = None
owner = jdoe666
owner_name = Peter M.
private = False
status = confirmed
submitter = jdoe666
submitter_name = Peter M.
tags = []
duplicates = []

Launchpad user Peter M.(jdoe666) wrote on 2021-02-09T23:35:27.906402+00:00

System environment: Apache CloudStack 4.11; KVM zone

Both CentOS 8 and CentOS Stream use NetworkManager. The cloud-init version currently packaged there is 20.3-9.el8.

We are talking about the code of the CloudStack datasource.

What we observe is that on our CentOS test systems, cloud-init falls through to the default_gateway() method and returns the VR IP address 192.xxx.xxx.1. This is wrong: that IP does not serve metadata. For comparison, an Ubuntu 20.04 instance deployed on the same network resolves to 192.xxx.xxx.5.
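To make the gateway fallback concrete, here is a minimal sketch of how a default gateway can be read from the kernel routing table. It is not the datasource's actual code; the function name `default_gateway_from_route_table` and the sample address 192.0.2.1 are made up for illustration, and the byte order assumes a little-endian host such as x86.

```python
import socket
import struct


def default_gateway_from_route_table(route_text):
    """Return the default gateway as a dotted quad, or None if absent.

    route_text is the content of /proc/net/route: one header line followed
    by whitespace-separated rows (Iface, Destination, Gateway, Flags, ...).
    """
    for line in route_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway = fields[1], fields[2]
        flags = int(fields[3], 16)
        # Destination 00000000 marks the default route; flag 0x2
        # (RTF_GATEWAY) means the row really points at a gateway.
        if destination == "00000000" and flags & 0x2:
            # Assumption: little-endian host, so the hex value holds the
            # address bytes in reversed order.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


# Illustrative sample only; the address is from a documentation range.
SAMPLE_ROUTE_TABLE = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\n"
    "eth0\t00000000\t010200C0\t0003\t0\t0\t100\t00000000\n"
    "eth0\t000200C0\t00000000\t0001\t0\t0\t100\t00FFFFFF\n"
)

if __name__ == "__main__":
    print(default_gateway_from_route_table(SAMPLE_ROUTE_TABLE))
```

On a network where the metadata service does not run on the gateway, this lookup succeeds but yields an address that cannot answer metadata requests, which matches the behaviour described here.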

This IP can be found under /run/NetworkManager:

./NetworkManager/resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/no-stub-resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/devices/2:next-server=192.xxx.xxx.5
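As an illustration of what reading that value could look like, the sketch below pulls a `key=value` entry such as `next-server` out of the text of one of these files. It assumes the file consists of simple `key=value` lines, which is all the grep output above shows; the helper name `read_key` and the sample content are hypothetical, and nothing like this exists in cloud-init as far as this report knows.

```python
def read_key(text, key):
    """Return the value of the first `key=value` line in text, or None."""
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        # Only accept lines that contain "=" and whose key matches exactly.
        if sep and name.strip() == key:
            return value.strip()
    return None


# Made-up sample content; only the next-server key comes from the report.
SAMPLE_DEVICE_FILE = "some-other-key=1\nnext-server=192.0.2.5\n"

if __name__ == "__main__":
    print(read_key(SAMPLE_DEVICE_FILE, "next-server"))
```

The caller would pass in the content of a file such as /run/NetworkManager/devices/2; the trailing number presumably differs per interface, so a real implementation would have to work out which file belongs to the primary interface.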

While the CloudStack datasource tries several approaches to find the IP, the code does not seem to handle the case where NetworkManager is in use.

What happens instead:

Would you say this is a bug, or maybe a missing feature needed for interoperability with NetworkManager (in the sense that cloud-init does not look under /run/NetworkManager/)?

ubuntu-server-builder commented 1 year ago

Launchpad user Peter M.(jdoe666) wrote on 2021-02-10T00:20:36.209793+00:00

P.S. I also asked at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/658

ubuntu-server-builder commented 1 year ago

Launchpad user James Falcon(falcojr) wrote on 2021-02-10T20:18:10.540098+00:00

Based on the process you've laid out, as well as the documentation (http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.8/virtual_machines/user-data.html), it looks like the metadata service should be at the same IP as a DHCP server, which explains the steps taken. All the steps taken are various ways to determine your DHCP server, while falling back to your current gateway.
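The "try several sources, then fall back to the gateway" behaviour can be sketched roughly as below. This is a simplified illustration, not the datasource's real implementation: the names `first_address` and `resolve_data_server` are hypothetical, and the ordering of the probes (DNS name, then DHCP lease, then gateway) is an assumption.

```python
import socket


def first_address(candidates):
    """Return (name, address) from the first probe that yields an address.

    candidates is an iterable of (name, probe) pairs, where probe is a
    zero-argument callable returning an address string or None.
    """
    for name, probe in candidates:
        try:
            address = probe()
        except OSError:
            # A failing probe is treated the same as "no answer".
            address = None
        if address:
            return name, address
    return None, None


def resolve_data_server(hostname="data-server"):
    """Resolve the metadata host name via DNS; None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


if __name__ == "__main__":
    # The lambdas stand in for real DHCP-lease and gateway lookups.
    source, address = first_address([
        ("dns", resolve_data_server),
        ("dhcp-lease", lambda: None),
        ("default-gateway", lambda: "192.0.2.1"),
    ])
    print(source, address)
```

With this structure, a gateway address is only returned when every earlier probe comes back empty, so the symptom in this report suggests that none of the DHCP-based probes found a lease on the affected system.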

I'm not sure what is unique about your setup that keeps these steps from working. However, checking "resolv.conf" isn't a valid solution. While it's true that a DHCP and DNS server may often reside at the same IP, that isn't guaranteed to be the case, and in most cases checking DNS is "more wrong" than inspecting DHCP leases.

Is the data-server DNS entry not working for you?

ubuntu-server-builder commented 1 year ago

Launchpad user Dave(livegrenier) wrote on 2021-09-03T19:58:38.161908+00:00

Hello,

I am seeing the same problem under CloudStack 4.15 + Xen when using a shared network. Because I am using a shared network, the DHCP server is not the same as the gateway, so cloud-init ends up failing, with the logs showing that it is trying to use the gateway to fetch the metadata.

I see the same behaviour on CentOS 8 and Rocky Linux.

I have also tried adjusting the NETWORKD_LEASES_DIR setting, but without any luck. I am happy to provide more information or try any workarounds if someone can help.

Thanks.

Regards.

ubuntu-server-builder commented 1 year ago

Launchpad user Dave(livegrenier) wrote on 2021-10-13T07:56:45.746799+00:00

Hi,

Please let me know if I can provide any more info to help troubleshoot this problem.

Thanks.