Open ubuntu-server-builder opened 1 year ago
Launchpad user Scott Moser(smoser) wrote on 2017-12-21T18:12:27.601017+00:00
It is true that the hostname is not set before networking comes up. I would like to fix this, but there are a couple of things to consider:
a.) network datasources: currently there are some datasources that run only after networking comes up. As it is right now, it is "too late" to read the hostname from the network metadata service and then update the system hostname before dhcp would run.
b.) systemd-networkd's dhcp client seems to actually be listening for hostname getting set and updating its lease information on that event. we saw this in azure when we were removing the old 'bounce the network' code that served the purpose of publishing the
c.) relying on the guest to populate dns information via dhcp is kind of garbage anyway, at least as a "cloud" solution.
d.) cloud-init allows setting hostname in user-data (in addition to meta-data). the user-data provided by the user could be in a '#include' url, which might not be available until all networking is up. Thus, even if we moved network datasources to pull their information 'pre-network' (the way that the digital ocean md service does) we can't consume all the user-data at that point.
'd' might be a reasonable limitation. The other things are achievable.
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2017-12-21T22:18:28.617626+00:00
For a and d, sure if finding out what the hostname needs to be involves having the network up, there's nothing that can be done to avoid this.
For c, yes, this is kind of garbage. Utah depends on this though :/ Maybe I can get it to edit the libvirt network config to map the MAC address to a particular IP address instead, that would definitely be less fragile...
And finally for b, it would make sense that a hostname change triggers a refresh of the DHCP lease but I see nothing in the code to do this and my experiments don't seem to indicate it happening either.
Launchpad user Birger Schmidt(bs-ubo) wrote on 2018-06-19T17:52:24.008293+00:00
I just stumbled over this bug as well.
Reading all the cases (a,b,c...) I do not see the downside in just setting the hostname in the init-local stage as well.
This can be done as an additional step only if the info is already there (i.e. mounted via iso). To check that would not take long and neither would setting the hostname take long.
Please consider adding this functionality and in case you decide against it please tell us what you think the downside of this would be.
As a side note: A similar request can be solved at the same time. See here https://bugs.launchpad.net/cloud-init/+bug/1643688.
Launchpad user Jesse R(scronkfinkle) wrote on 2022-07-14T14:32:30.747415+00:00
I am also running into this issue. We run DNSMasq and build out our cloud-init images with terraform. We're getting some pretty nasty networking issues because when we roll out any new batches of machines, they all request an IP with the hostname "Ubuntu", and then set their hostname afterwards.
Noticing the age of this ticket, has a better workaround for this kind of behavior been implemented that I missed? It's a pretty big blocker for us, and it seems reasonable to just be able to set the hostname in the local stage
Launchpad user Chad Smith(chad.smith) wrote on 2022-07-18T16:05:29.369848+00:00
@Jesse thanks for the bump and notes on this bug. Since this bug was filed, we have added a related feature which allows init-local based datasources to set the hostname before the network is brought online[1]. From my recollection of the feature, it requires the datasource to provide "local-hostname" in its meta-data (meta-data.local-hostname[2]), not in user-data (user-data.hostname).
If you get a chance, would you be able to run:
sudo cloud-init collect-logs -u
Note that this collect-logs run will include user-data, so please double check to make sure you don't have sensitive information (passwords/credentials) in the user-data/meta-data provided during launch. Thank you, the attached logs will help confirm suspicions on why this feature isn't quite enough for terraform type deployments.
References:
[1] https://github.com/canonical/cloud-init/commit/133ad2cb327ad17b7b81319fac8f9f14577c04df
[2] https://github.com/canonical/cloud-init/blob/main/cloudinit/sources/__init__.py#L754
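To make the meta-data versus user-data distinction above concrete, here is a minimal sketch that writes a NoCloud-style seed directory whose meta-data carries "local-hostname". The helper names (render_meta_data, write_seed) and the instance ID are hypothetical, and the sketch assumes the hostname is a plain string that needs no YAML quoting; it is an illustration, not how any particular provider builds its seed.

```python
from pathlib import Path


def render_meta_data(instance_id: str, hostname: str) -> str:
    """Return NoCloud meta-data text that carries the hostname in meta-data."""
    # `local-hostname` is the meta-data key discussed above; `instance-id`
    # is the other key a NoCloud seed normally provides.
    # Assumption: both values are simple strings that are valid unquoted YAML.
    return f"instance-id: {instance_id}\nlocal-hostname: {hostname}\n"


def write_seed(seed_dir: Path, instance_id: str, hostname: str) -> Path:
    """Write meta-data and a minimal user-data file into seed_dir."""
    seed_dir.mkdir(parents=True, exist_ok=True)
    (seed_dir / "meta-data").write_text(render_meta_data(instance_id, hostname))
    # The hostname is deliberately NOT placed in user-data here.
    (seed_dir / "user-data").write_text("#cloud-config\n")
    return seed_dir
```

For example, write_seed(Path("seed"), "iid-attis", "attis") produces a meta-data file containing "local-hostname: attis", which is the form a local-stage datasource can read before the network is up.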
Launchpad user Jesse R(scronkfinkle) wrote on 2022-07-19T15:23:43.099916+00:00
@Chad thanks for writing back! Attached is the collect-logs output.
For terraform, we're using a provider to hook into our proxmox infrastructure. Under the hood, proxmox is calling QEMU to manage the virtual machines. I installed qemu-guest-agent to the cloud-init image using virt-customize from the libguestfs-tools package.
On first boot, the hostname is successfully set, but it doesn't appear to be fast enough before networking is brought up.
To build an identical image to the one I'm using, download the cloud image:
wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img
Then use virt-customize to install qemu-guest-agent into the downloaded file:
sudo virt-customize -a focal-server-cloudimg-amd64.img --install qemu-guest-agent
From there, we upload it to proxmox and have it clone the image for VMs. I would imagine that if one used regular QEMU or another provider with terraform, the behavior would be the same.
Here's the output of terraform apply
module.greeks["attis"].proxmox_vm_qemu.basic_admin: Refreshing state... [id=aramis5/qemu/111]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.greeks["attis"].proxmox_vm_qemu.basic_admin will be created
+ resource "proxmox_vm_qemu" "basic_admin" {
+ additional_wait = 0
+ agent = 1
+ automatic_reboot = true
+ balloon = 0
+ bios = "seabios"
+ boot = "c"
+ bootdisk = "scsi0"
+ ciuser = "awx"
+ clone = "ubuntu-2004-cloudinit-terraform"
+ clone_wait = 0
+ cores = 4
+ cpu = "host"
+ default_ipv4_address = (known after apply)
+ define_connection_info = true
+ force_create = false
+ full_clone = true
+ guest_agent_ready_timeout = 100
+ hotplug = "network,disk,usb"
+ id = (known after apply)
+ ipconfig0 = "ip=dhcp"
+ kvm = true
+ memory = 8192
+ name = "attis"
+ nameserver = (known after apply)
+ numa = false
+ onboot = false
+ oncreate = true
+ os_type = "cloud-init"
+ preprovision = true
+ reboot_required = (known after apply)
+ scsihw = "virtio-scsi-pci"
+ searchdomain = (known after apply)
+ sockets = 1
+ ssh_host = (known after apply)
+ ssh_port = (known after apply)
+ sshkeys = "<trimmed>"
+ tablet = true
+ target_node = "aramis5"
+ unused_disk = (known after apply)
+ vcpus = 0
+ vlan = -1
+ vmid = (known after apply)
+ disk {
+ backup = 0
+ cache = "none"
+ discard = "on"
+ file = (known after apply)
+ format = (known after apply)
+ iothread = 0
+ mbps = 0
+ mbps_rd = 0
+ mbps_rd_max = 0
+ mbps_wr = 0
+ mbps_wr_max = 0
+ media = (known after apply)
+ replicate = 0
+ size = "32G"
+ slot = 0
+ ssd = 0
+ storage = "ceph-external"
+ storage_type = (known after apply)
+ type = "scsi"
+ volume = (known after apply)
}
+ network {
+ bridge = "vmbr0"
+ firewall = false
+ link_down = false
+ macaddr = (known after apply)
+ model = "virtio"
+ queues = (known after apply)
+ rate = (known after apply)
+ tag = -1
}
}
attis is the desired hostname that we want for this particular machine.
Launchpad attachments: cloud-init-sanitized.tar.gz
Launchpad user Jesse R(scronkfinkle) wrote on 2022-08-23T16:09:15.137518+00:00
I wanted to give an update to this with a fix for anyone else that runs into my particular issue. The first problem was that using virt-customize to install qemu-guest-agent was setting /etc/machine-id in the image. As a result, dnsmasq saw the same CLID (DHCP client ID) from each VM. I assume that means it thought all the VMs were the same machine, requesting an IP on different interfaces. The way to fix that was to truncate the file after installation with
sudo virt-customize -a focal-server-cloudimg-amd64.img --truncate /etc/machine-id
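As a rough way to catch the machine-id problem described above before cloning, the following sketch checks whether a template's machine-id content is a concrete ID that every clone would inherit. The function names are hypothetical, and the sketch assumes the file has been made readable first (for example by mounting or copying it out of the image); it treats an empty file or the literal "uninitialized" as meaning that a fresh ID will be generated on first boot.

```python
from pathlib import Path


def machine_id_needs_reset(content: str) -> bool:
    """True when the template carries a concrete machine ID that clones would share."""
    value = content.strip()
    # An empty file (or "uninitialized") lets systemd generate a new ID at boot.
    return value not in ("", "uninitialized")


def check_machine_id_file(path: Path) -> bool:
    """Read a machine-id file from a mounted or extracted image and check it."""
    # A missing file is treated like an empty one in this sketch.
    if not path.exists():
        return False
    return machine_id_needs_reset(path.read_text())
```

If check_machine_id_file returns True for the template, truncating the file as shown above is the fix that worked here.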
With that sorted out, I was also able to use nocloud to set the hostname properly on boot. I used the method of setting the SMBIOS serial. In terraform I was able to specify this as QEMU args like so
args = "-smbios type=1,serial=ds=nocloud-net;h=${var.name}"
where var.name was the hostname.
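For readers unfamiliar with the serial string format used above, here is a simplified sketch of how a "ds=nocloud-net;h=..." value breaks down into meta-data keys. It is not cloud-init's own parser; the function names are hypothetical, and the short-key mapping (h for local-hostname, i for instance-id, s for seedfrom) reflects my understanding of the NoCloud datasource's conventions.

```python
SHORT_KEYS = {"h": "local-hostname", "i": "instance-id", "s": "seedfrom"}


def parse_nocloud_serial(serial: str) -> dict[str, str]:
    """Parse a 'ds=nocloud[-net];key=value;...' string into meta-data keys."""
    head, *pairs = serial.split(";")
    if not head.startswith("ds=nocloud"):
        return {}  # not addressed to the NoCloud datasource
    parsed = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            continue  # skip fragments that are not key=value
        # Expand the short keys; pass unknown keys through unchanged.
        parsed[SHORT_KEYS.get(key, key)] = value
    return parsed


def smbios_args(hostname: str) -> str:
    """Build the QEMU args string quoted above for a given hostname."""
    return f"-smbios type=1,serial=ds=nocloud-net;h={hostname}"
```

With the value from this thread, parse_nocloud_serial("ds=nocloud-net;h=attis") yields a single "local-hostname" entry, which is why the hostname is available in the local stage without any network access.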
This bug was originally filed in Launchpad as LP: #1739516
Launchpad details
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2017-12-21T02:05:24.786441+00:00
When I use libvirt to boot a disk image that has been installed with subiquity and has the workaround for bug 1737630 applied (i.e. networkd starts automatically), I cannot ping the VM by hostname from the host.
I think this is because the networking has come up before the hostname is set, so the hostname is not sent along with the DHCP request to libvirt's dnsmasq and so that dnsmasq cannot answer lookups for the hostname. If I run "netplan apply" on the vm, enough things are apparently restarted that DHCP happens again and I can ping the vm by hostname from the host.
I'm not completely sure I have diagnosed this correctly and certainly don't know how to fix it.
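To illustrate the diagnosis above: a DHCP client conveys its name in the Host Name option (option 12 in RFC 2132), encoded as a code byte, a length byte, and the name itself. The sketch below only shows that encoding, using a hypothetical helper name; it does not model networkd's actual request building. It shows how a request built before the hostname is set carries the placeholder name rather than the intended one.

```python
HOST_NAME_OPTION = 12  # DHCP option code for Host Name (RFC 2132)


def hostname_option(hostname: str) -> bytes:
    """Encode a hostname as a DHCP option: code byte, length byte, value."""
    name = hostname.encode("ascii")
    if not 1 <= len(name) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes long")
    return bytes([HOST_NAME_OPTION, len(name)]) + name


# The option reflects whatever the system hostname is when the request is built.
before_cloud_init = hostname_option("ubuntu")
after_cloud_init = hostname_option("attis")
```

Only a request sent after the hostname change, such as the one triggered by "netplan apply", would carry the second value.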