Closed arithx closed 4 years ago
That's odd. What does /etc/resolv.conf say? Logs from NetworkManager?
Hmm, but clearly this has to be working on RHCOS. One difference I can think of is that RHCOS does check-in from the initrd, though I don't think checking in would be related to DNS.
/etc/resolv.conf doesn't exist (likely because we aren't bringing down the networking in the initramfs like RHCOS is).
I've included /run/initramfs/state/etc/resolv.conf as well as the journal for NetworkManager (note that I manually restarted NetworkManager earlier via sudo systemctl restart NetworkManager to see if that resolved it).
[core@networktest ~]$ cat /etc/resolv.conf
cat: /etc/resolv.conf: No such file or directory
[core@networktest ~]$ ls /etc/
adjtime csh.cshrc fedora-release hosts libnl multipath pkcs11 rpm sssd tmpfiles.d
aliases csh.login filesystems idmapd.conf libreport netconfig pkgconfig rpm-ostreed.conf statetab.d trusted-key.key
alternatives dbus-1 fuse.conf inittab libssh NetworkManager pki rsyncd.conf subgid udev
bash_completion.d default gcrypt inputrc libuser.conf networks pm rwtab.d subgid- virc
bashrc depmod.d gnupg iproute2 login.defs nfs.conf polkit-1 samba subuid X11
bindresvport.blacklist dhcp GREP_COLORS iscsi logrotate.conf nfsmount.conf popt.d sasl2 subuid- xattr.conf
binfmt.d DIR_COLORS group issue logrotate.d nftables prelink.conf.d security sudoers xdg
chrony.conf DIR_COLORS.256color group- issue.d lvm nsswitch.conf printcap selinux sudoers.d yum.repos.d
chrony.keys DIR_COLORS.lightbgcolor grub2.cfg issue.net machine-id nsswitch.conf.bak profile services swid zincati
cifs-utils dnf grub2-efi.cfg kernel magic openldap profile.d sestatus.conf sysconfig
cni dracut.conf grub.d krb5.conf mke2fs.conf opt protocols shadow sysctl.conf
console-login-helper-messages dracut.conf.d gshadow krb5.conf.d modprobe.d os-release rc.d shadow- sysctl.d
containerd environment gshadow- ld.so.cache modules-load.d ostree redhat-release shells systemd
containers ethertypes gss ld.so.conf motd pam.d request-key.conf skel system-release
cron.d exports host.conf ld.so.conf.d motd.d passwd request-key.d ssh system-release-cpe
crypto-policies fedora-coreos-pinger hostname libaudit.conf mtab passwd- rpc ssl terminfo
[core@networktest ~]$ cat /run/initramfs/state/etc/resolv.conf
nameserver 168.63.129.16
search u5e2tmrol1sebjifwcberhsgzf.dx.internal.cloudapp.net
[core@networktest ~]$ journalctl -t NetworkManager --no-pager
-- Logs begin at Tue 2020-01-28 20:03:24 UTC, end at Tue 2020-01-28 21:28:33 UTC. --
Jan 28 20:04:35 networktest NetworkManager[1003]: <info> [1580241875.0456] NetworkManager (version 1.20.8-1.fc31) is starting... (for the first time)
Jan 28 20:04:35 networktest NetworkManager[1003]: <info> [1580241875.0459] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-disable-default-plugins.conf, 20-client-id-from-mac.conf) (run: 10-dracut-dhclient.conf)
Jan 28 20:04:35 networktest NetworkManager[1003]: <info> [1580241875.6094] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Jan 28 20:04:35 networktest NetworkManager[1003]: <info> [1580241875.6509] manager[0x56149f4c0130]: monitoring kernel firmware directory '/lib/firmware'.
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.4818] hostname: hostname: using hostnamed
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.4818] hostname: hostname changed from (none) to "networktest"
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.4824] dns-mgr[0x56149f4a3240]: init: dns=default,systemd-resolved rc-manager=symlink
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.5263] manager[0x56149f4c0130]: rfkill: Wi-Fi hardware radio set enabled
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.5264] manager[0x56149f4c0130]: rfkill: WWAN hardware radio set enabled
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6368] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6369] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6370] manager: Networking is enabled by state file
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6724] dhcp-init: Using DHCP client 'dhclient'
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6725] settings: Loaded settings plugin: keyfile (internal)
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6900] device (lo): carrier: link connected
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6903] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6910] device (eth0): carrier: link connected
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.6914] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.7868] settings: (eth0): created default wired connection 'Wired connection 1'
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.7913] device (eth0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.7922] device (eth0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.7931] device (eth0): Activation: starting connection 'eth0' (4a30bc0c-48d3-49d2-a508-2edd429eaba7)
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8076] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8080] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8083] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8085] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8195] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8197] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8200] manager: NetworkManager state is now CONNECTED_LOCAL
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8207] device (eth0): Activation: successful, device activated.
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8212] manager: NetworkManager state is now CONNECTED_GLOBAL
Jan 28 20:04:37 networktest NetworkManager[1003]: <info> [1580241877.8215] manager: startup complete
Jan 28 20:06:43 networktest NetworkManager[1003]: <info> [1580242003.3939] caught SIGTERM, shutting down normally.
Jan 28 20:06:43 networktest NetworkManager[1003]: <info> [1580242003.3953] manager: NetworkManager state is now CONNECTED_LOCAL
Jan 28 20:06:43 networktest NetworkManager[1003]: <info> [1580242003.4758] exiting (success)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.5124] NetworkManager (version 1.20.8-1.fc31) is starting... (after a restart)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.5125] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-disable-default-plugins.conf, 20-client-id-from-mac.conf) (run: 10-dracut-dhclient.conf)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.5201] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.5350] manager[0x555befdfe130]: monitoring kernel firmware directory '/lib/firmware'.
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8577] hostname: hostname: using hostnamed
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8578] hostname: hostname changed from (none) to "networktest"
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8580] dns-mgr[0x555befde3240]: init: dns=default,systemd-resolved rc-manager=symlink
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8583] manager[0x555befdfe130]: rfkill: Wi-Fi hardware radio set enabled
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8583] manager[0x555befdfe130]: rfkill: WWAN hardware radio set enabled
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8597] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8598] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8599] manager: Networking is enabled by state file
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8600] dhcp-init: Using DHCP client 'dhclient'
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8601] settings: Loaded settings plugin: keyfile (internal)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8615] device (lo): carrier: link connected
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8617] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8624] device (eth0): carrier: link connected
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8628] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8648] device (eth0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8657] device (eth0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8665] device (eth0): Activation: starting connection 'eth0' (794b0119-9912-453b-b991-246d38a41599)
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8678] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8681] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8684] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8686] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8831] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8833] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8836] manager: NetworkManager state is now CONNECTED_LOCAL
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8841] device (eth0): Activation: successful, device activated.
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8845] manager: NetworkManager state is now CONNECTED_GLOBAL
Jan 28 20:06:43 networktest NetworkManager[2116]: <info> [1580242003.8847] manager: startup complete
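For anyone else checking whether they are hitting the same thing, the manual comparison above (no /etc/resolv.conf, but a populated copy under /run/initramfs/state) can be sketched as a small shell helper. This is only an illustration; the function names `list_nameservers` and `first_dns_source` are made up here and are not part of FCOS or NetworkManager.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are assumptions, not an existing tool).

# Print the nameserver addresses listed in a resolv.conf-style file.
# Returns non-zero if the file is unreadable or lists no nameservers.
list_nameservers() {
    local file="$1"
    [ -r "$file" ] || return 1
    # Second field of every line whose first field is "nameserver".
    awk '$1 == "nameserver" && NF >= 2 { print $2; found = 1 }
         END { exit !found }' "$file"
}

# Print the first candidate file (checked in order) that lists a nameserver.
first_dns_source() {
    local candidate
    for candidate in "$@"; do
        if list_nameservers "$candidate" >/dev/null; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Calling `first_dns_source /etc/resolv.conf /run/initramfs/state/etc/resolv.conf` on an affected machine would be expected to print only the initramfs path, which matches the output pasted above.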
likely because we aren't bringing down the networking in the initramfs like RHCOS is
Ahhh hmm yeah, that's a big delta. I don't quite remember now why we don't do this in FCOS too. Maybe we expected it to be unnecessary with the switch to NM in the initrd?
https://github.com/coreos/fedora-coreos-tracker/issues/148#issuecomment-565830139
That's a showstopper because we have to reboot each machine a few times manually before it has an internet connection.
Is it a big task to resolve this in the official FCOS image?
@jlebon I think https://github.com/coreos/ignition-dracut/issues/119 is related.
Hi folks. Do you have any updates for this one?
cross referencing this with https://github.com/coreos/fedora-coreos-tracker/issues/394
Hitting this issue as well. Any updates?
Yes. Once https://github.com/coreos/ignition-dracut/pull/159 and https://github.com/coreos/fedora-coreos-config/pull/310 are merged and land in a release, we think this should be taken care of.
Hi,
FYI, I used this in the Ignition config to work around the issue. It seems to be working:
systemd:
  units:
    - name: azure-restart-network.service
      enabled: true
      contents: |
        [Service]
        Type=oneshot
        ExecStart=/bin/bash -c '\
          /usr/bin/cp /run/initramfs/state/etc/resolv.conf /etc/resolv.conf; \
          /usr/bin/systemctl restart NetworkManager'
        [Install]
        WantedBy=multi-user.target
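A slightly more defensive variant of the copy step might look like the sketch below. It is an assumption-laden illustration rather than something tested on Azure: the function name `restore_resolv_conf` is hypothetical, and the default paths are simply the ones used in the unit above. It copies the initramfs file only when the real root has no usable /etc/resolv.conf, so it should become a no-op once a fixed image is in use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restore resolv.conf from the initramfs state copy.
# Both paths are parameters so the logic can be exercised outside a real boot.
restore_resolv_conf() {
    local src="${1:-/run/initramfs/state/etc/resolv.conf}"
    local dst="${2:-/etc/resolv.conf}"

    # Leave an existing, non-empty destination alone.
    if [ -s "$dst" ]; then
        return 0
    fi

    # Fail if there is nothing to restore from.
    [ -s "$src" ] || return 1

    cp -- "$src" "$dst"
}
```

The unit would still need to restart NetworkManager afterwards, as in the original workaround.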
@jomeier @simongottschlag - care to test https://builds.coreos.fedoraproject.org/prod/streams/testing-devel/builds/31.20200323.20.0/x86_64/fedora-coreos-31.20200323.20.0-azure.x86_64.vhd.xz to see if that fixes the problem?
I'm having issues deploying our production VMs right now (capacity in West Europe), meaning I need to prioritise that before tests. Sorry!
Strike!
I will try that out today. Give me a few hours, please.
@dustymabe @vrutkovs @LorbusChris
Ok guys ... it looks good.
I installed OKD 4.4 successfully without manual interaction from my side. Everything is green in the web ui -> ok.
For your information: I had to resize, convert and upload the FCOS test image to Azure, but I'm sure that's expected behaviour for this test. I used a helper VM, which I patched into the OKD installer, to do the work.
Good job!
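For others repeating this test: the comment above does not say which tools were used for the resize and convert steps, so the following is only one possible approach. Azure expects the virtual size of an uploaded VHD to be aligned to 1 MiB, so the resize step usually amounts to rounding the disk size up. The helper name `round_up_to_mib` and the file names in the comments are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: round a byte count up to the next whole MiB,
# which is the alignment Azure expects for a VHD's virtual size.
round_up_to_mib() {
    local bytes="$1"
    local mib=$((1024 * 1024))
    echo $(( (bytes + mib - 1) / mib * mib ))
}

# Possible usage with qemu-img (file names are placeholders):
#   size=$(round_up_to_mib "$(stat -c %s disk.raw)")
#   qemu-img resize -f raw disk.raw "$size"
#   qemu-img convert -f raw -O vpc -o subformat=fixed,force_size disk.raw disk.vhd
```

The resulting fixed-size VHD can then be uploaded as a page blob with whichever Azure tooling is at hand.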
We are now using NetworkManager in the initramfs and also propagating network information from the initramfs (kargs) when appropriate, which we think fixes this issue.
See https://github.com/coreos/fedora-coreos-tracker/issues/394#issuecomment-604598128 and the preceding discussion for more details.
Issue Report
Bug
Fedora CoreOS Version
31.20200113.3.1
Expected Behavior
Working DNS
Actual Behavior
When spawning FCOS machines on Azure there is no DNS. The machines do seem to have working networking otherwise.
Reproduction Steps
Other Information
I haven't managed to get a machine with working DNS booted on Azure, whether spawning it manually via the CLI or via kola.