Closed Klaas- closed 6 years ago
Good catch, we should probably fix the HPC kickstarts to match standard CentOS.
The official CentOS images generally don't use NetworkManager for the primary NIC either: https://github.com/CentOS/sig-cloud-instance-build/blob/master/cloudimg/CentOS-7-x86_64-Azure.ks#L120
@szarkos the RHEL images use NetworkManager, and NetworkManager will be the only option in the upcoming CentOS/RHEL 8, so I am not sure which one would be considered "right".
The hostname problem, however, is something I am completely unsure about. What is considered best practice from an Azure point of view? If I name my VM "shortname", the hostname will be "shortname" with NetworkManager but "shortname.domain.tld" with legacy networking. If I name my VM "shortname.domain.tld", the hostname will be the FQDN with both networking types.
Hi Klaas,
I'm probably missing something, but I'm not observing the hostname behavior you're describing with the NM_CONTROLLED parameter on or off. Can you provide more details on this?
It's been a couple years, but I believe we saw issues with NetworkManager in early CentOS 7 versions where multi-nic VMs did not get their routes set correctly. Disabling NM for the primary NIC and using the legacy scripts proved to be a well-tested failsafe. Otherwise I don't think we have strong opinions here, for RHEL8 we'll certainly take config guidance from upstream and test it thoroughly.
Thanks, Steve
@szarkos here are the hostname differences between NetworkManager and no NetworkManager (plus one VM whose name is an FQDN).
Creation of VMs:
az vm create --name host1 --resource-group resourcegrp1 --private-ip-address 10.0.0.65 --public-ip-address "" --size Standard_B2s --image OpenLogic:CentOS-HPC:7.4:latest --admin-username username --ssh-key-value "ssh-rsa mykey" --subnet "mynet"
az vm create --name host2 --resource-group resourcegrp1 --private-ip-address 10.0.0.66 --public-ip-address "" --size Standard_B2s --image OpenLogic:CentOS:7.5:latest --admin-username username --ssh-key-value "ssh-rsa mykey" --subnet "mynet"
az vm create --name host3.domain.tld --resource-group resourcegrp1 --private-ip-address 10.0.0.67 --public-ip-address "" --size Standard_B2s --image OpenLogic:CentOS-HPC:7.4:latest --admin-username username --ssh-key-value "ssh-rsa mykey" --subnet "mynet"
Observe difference in hostname handling
## ssh username@host1 hostname -f
host1
## ssh username@host2 hostname -f
host2.domain.tld
## ssh username@host3 hostname -f
host3.domain.tld
See differences in resolv.conf searchdomains
## ssh username@host1 cat /etc/resolv.conf
# Generated by NetworkManager
search reddog.microsoft.com
nameserver 10.0.0.1
nameserver 10.0.0.2
## ssh username@host2 cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search reddog.microsoft.com domain.tld
nameserver 10.0.0.1
nameserver 10.0.0.2
## ssh username@host3 cat /etc/resolv.conf
# Generated by NetworkManager
search reddog.microsoft.com domain.tld
nameserver 10.0.0.1
nameserver 10.0.0.2
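The resolv.conf comparison above can be scripted so that it is easy to repeat across VMs. This is only a minimal sketch: the function names `search_domains` and `has_search_domain` are made up for illustration, and the file path is passed in so the functions can be tried on a copy rather than on the live /etc/resolv.conf.

```shell
#!/bin/sh
# Print every domain found on "search" lines of a resolv.conf-style file,
# one domain per line. (Hypothetical helper, not part of any image.)
search_domains() {
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Succeed if the given domain is listed as a search domain in the file.
has_search_domain() {
    search_domains "$1" | grep -qxF "$2"
}

# Example, run on a VM:
#   has_search_domain /etc/resolv.conf domain.tld && echo "domain.tld is searched"
```

With the outputs shown above, this check would fail on host1 and succeed on host2 and host3.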
Differences in hostnamectl
## ssh username@host1 hostnamectl
Static hostname: host1
Icon name: computer-vm
Chassis: vm
Machine ID: x
Boot ID: x
Virtualization: microsoft
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-693.21.1.el7.x86_64
Architecture: x86-64
## ssh username@host2 hostnamectl
Static hostname: host2
Icon name: computer-vm
Chassis: vm
Machine ID: x
Boot ID: x
Virtualization: microsoft
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-862.3.3.el7.x86_64
Architecture: x86-64
## ssh username@host3 hostnamectl
Static hostname: host3.domain.tld
Icon name: computer-vm
Chassis: vm
Machine ID: x
Boot ID: x
Virtualization: microsoft
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-693.21.1.el7.x86_64
Architecture: x86-64
The next question is what I want my hostname to be. My gut feeling is to use the FQDN, but this issue is about the differences between NetworkManager and legacy networking :)
Actually, this does not seem to be caused by NetworkManager. In the journal logs I see this:
Jul 20 05:53:39 host1.domain.tld python[872]: 2018/07/20 05:53:39.748478 INFO Resource disk /dev/sdb is mounted at /
Jul 20 05:53:39 host1.domain.tld python[872]: 2018/07/20 05:53:39.781568 INFO Clean protocol
Jul 20 05:53:39 host1.domain.tld python[872]: 2018/07/20 05:53:39.799454 INFO Running default provisioning handler
Jul 20 05:53:39 host1.domain.tld python[872]: 2018/07/20 05:53:39.830778 INFO Copying ovf-env.xml
Jul 20 05:53:39 host1.domain.tld python[872]: 2018/07/20 05:53:39.933803 INFO Successfully mounted dvd
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.023230 INFO Detect protocol by file
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.036767 INFO Clean protocol
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.042660 INFO WireServer endpoint is not found. Reru
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.048366 INFO Test for route to 168.63.129.16
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.048684 INFO Route to 168.63.129.16 exists
Jul 20 05:53:40 host1.domain.tld python[872]: 2018/07/20 05:53:40.053813 INFO Wire server endpoint:168.63.129.16
Jul 20 05:53:49 host1.domain.tld python[872]: 2018/07/20 05:53:49.072113 INFO Fabric preferred wire protocol version
Jul 20 05:53:49 host1.domain.tld python[872]: 2018/07/20 05:53:49.084486 INFO Wire protocol version:2012-11-30
Jul 20 05:53:49 host1.domain.tld python[872]: 2018/07/20 05:53:49.089333 WARNING Server preferred version:2015-04-05
Jul 20 05:53:53 host1.domain.tld python[872]: 2018/07/20 05:53:53.463861 INFO Starting provisioning
Jul 20 05:53:53 host1.domain.tld python[872]: 2018/07/20 05:53:53.477396 INFO Handle ovf-env.xml.
Jul 20 05:53:53 host1.domain.tld python[872]: 2018/07/20 05:53:53.482608 INFO Set hostname [host1]
Jul 20 05:53:53 host1 systemd-hostnamed[642]: Changed static host name to 'host1'
This seems to be waagent setting the short name on purpose. Is there a setting to control this?
Hi Klaas,
I think this is expected. Currently, there is no way to configure the DNS suffix from the fabric side, i.e. when you provision a VM. The search domain is configured by the platform. If you need to set your own FQDN or search domain then you'll need to modify dhclient.conf to override or augment what's set by the platform via DHCP.
We have more information about this here: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/azure-dns
Thanks, Steve
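As a rough illustration of the dhclient.conf override mentioned above, the sketch below appends a `supersede domain-name` statement to a dhclient.conf-style file if it is not already there. `supersede` is standard dhclient.conf syntax, but the function name `set_dhcp_domain`, the choice of `domain-name` over other options, and the value `domain.tld` are assumptions made for the example. It has not been tested against these images.

```shell
#!/bin/sh
# Append a 'supersede domain-name "<domain>";' statement to the given
# dhclient.conf-style file unless that exact line is already present.
# (Hypothetical helper; the path and the domain are supplied by the caller.)
set_dhcp_domain() {
    conf=$1
    domain=$2
    line="supersede domain-name \"$domain\";"
    touch "$conf"
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Example, as root on a test VM:
#   set_dhcp_domain /etc/dhcp/dhclient.conf domain.tld
# The DHCP lease has to be renewed (or networking restarted) before
# resolv.conf reflects the change.
```

The function is idempotent, so it can be called from provisioning scripts more than once without duplicating the line.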
Hi, for me the issue is that the behavior differs depending on the networking style. This means some machines behave differently from others. If I find the time I'll prepare a PR to have the HPC machines also use old-style networking.
For me the logical consequence is to name my Azure VMs with their FQDN, although this does not conform to the naming best practices: https://docs.microsoft.com/en-US/azure/architecture/best-practices/naming-conventions. Or maybe I'll use something along these lines to update the hostname to the FQDN: https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/virtual-machines/linux/cloudinit-update-vm-hostname.md
Greetings Klaas
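One way to make the hostname consistent regardless of networking style would be to set the static hostname to the FQDN after provisioning. The sketch below is not the approach from the linked cloud-init article. It only shows the name handling, with a made-up helper `to_fqdn` and `domain.tld` as a placeholder domain.

```shell
#!/bin/sh
# Build an FQDN from a name and a domain. A name that already contains a dot
# is treated as fully qualified and printed unchanged.
# (Hypothetical helper for illustration.)
to_fqdn() {
    case $1 in
        *.*) printf '%s\n' "$1" ;;
        *)   printf '%s.%s\n' "$1" "$2" ;;
    esac
}

# Example, as root on a VM:
#   hostnamectl set-hostname "$(to_fqdn "$(hostname -s)" domain.tld)"
```

Applied to the VMs above, host1 would become host1.domain.tld, while host3.domain.tld would be left as it is.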
Hi, I have a question about the different networking styles used by the CentOS and CentOS-HPC images. The normal CentOS image does not use NetworkManager for Ethernet (https://github.com/szarkos/AzureBuildCentOS/blob/master/ks/azure/centos75.ks#L125),
whereas CentOS-HPC uses NetworkManager (https://github.com/szarkos/AzureBuildCentOS/blob/master/ks/azure/centos74-hpc.ks#L127-L136 -- the default is to use NetworkManager).
This has a weird consequence for me: without NetworkManager my hostname is the FQDN, with NetworkManager it is the short name. It also means that my domain is added as a search domain in one case but not in the other.
I did not find documentation about this from CentOS or Red Hat, so maybe this is a bug there.
Greetings Klaas
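To see which networking style a given VM ended up with, the `NM_CONTROLLED` setting in the interface's ifcfg file can be inspected. The following is a small sketch with a made-up function name `nm_controlled`. The usual CentOS 7 location /etc/sysconfig/network-scripts/ifcfg-eth0 is assumed in the example, and only the literal value `no` is treated as opting out.

```shell
#!/bin/sh
# Print "no" if the ifcfg file sets NM_CONTROLLED=no (quotes and case are
# ignored), otherwise print "yes". A missing setting counts as "yes",
# because NetworkManager control is the default.
# (Hypothetical helper for illustration.)
nm_controlled() {
    value=$(sed -n 's/^NM_CONTROLLED=//p' "$1" | tail -n 1 | tr -d "\"'" | tr 'A-Z' 'a-z')
    case $value in
        no) echo no ;;
        *)  echo yes ;;
    esac
}

# Example, run on a VM (the interface name eth0 is an assumption):
#   nm_controlled /etc/sysconfig/network-scripts/ifcfg-eth0
```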