Open aminasyan opened 5 years ago
Can you post the contents of the mount unit (either a handwritten one or the one that gets generated)?
core@coreos ~ $ cat /etc/systemd/system/home.mount
[Unit]
Description=NFS Mount of /share
Requires=network-online.target
After=network-online.target

[Mount]
What=nfs.example.com:/data
Where=/home
Type=nfs
Options=nofail,x-systemd.device-timeout=60s,intr,hard,nfsvers=3,timeo=600,proto=tcp,retrans=2

[Install]
WantedBy=remote-fs.target
This seems to be a transient condition. DNS resolution returns to normal when systemd-timesyncd retries; after that, all DNS queries are fine (they no longer carry the double suffix). Unfortunately the NFS mount has already failed by then, as you can see from the tcpdump output.
I have tried with a VM and that works flawlessly. Could it be that the real hardware is slower at initializing the network than a VM?
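If the network really is coming up late on this hardware, one possible workaround is to hold off the mount until the server name resolves. Below is a minimal sketch of such a wait loop. It is not part of CoreOS or systemd; the function name `wait_for_dns` and its `timeout`/`interval` parameters are hypothetical, and how it would be wired in ahead of the mount (for example from a separate oneshot service) is left open.

```python
import socket
import time


def wait_for_dns(host, timeout=60.0, interval=2.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Poll until `host` resolves or `timeout` seconds elapse.

    Returns True once resolution succeeds, False if the deadline passes.
    `resolve`, `sleep` and `clock` are injectable so the loop can be
    exercised without touching the real network.
    """
    deadline = clock() + timeout
    while True:
        try:
            resolve(host, None)
            return True
        except socket.gaierror:
            # Name not resolvable yet; give up only after the deadline.
            if clock() >= deadline:
                return False
            sleep(interval)
```

Calling `wait_for_dns("nfs.example.com")` would block for up to a minute and report whether the name became resolvable in that time.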
Issue Report
On boot, CoreOS fails to mount the NFS share defined in /etc/fstab (I have also tried a systemd mount unit). The reason given is "Failed to resolve server".
Examining the DNS packets shows that the DNS suffix is appended twice (nfs.example.com.example.com).
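For context on where a name like nfs.example.com.example.com can come from: with a resolv.conf-style search list, a name that has at least `ndots` dots is tried as-is first, and the search domains are appended only as a fallback. The sketch below models that ordering. It assumes a search domain of `example.com` (inferred from the capture, not confirmed) and the default `ndots` of 1; whether this host resolves through glibc or systemd-resolved is not stated in the report, so treat this as an illustration of one possible explanation rather than a diagnosis.

```python
def candidate_names(name, search, ndots=1):
    """Return the query names a resolv.conf-style stub resolver tries, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        return [name]
    suffixed = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, search list second.
        return [absolute] + suffixed
    # Too few dots: search list first, the bare name last.
    return suffixed + [absolute]


print(candidate_names("nfs.example.com", ["example.com"]))
# ['nfs.example.com.', 'nfs.example.com.example.com.']
```

Under these assumptions the doubled name is the second attempt, which would fit a first query that went unanswered while the network was still coming up.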
Bug
Container Linux Version
Environment
What hardware/cloud provider/hypervisor is being used to run Container Linux? local hardware system, Supermicro SuperServer 6016TT-IBQF
Expected Behavior
NFS mount works on boot.
Actual Behavior
NFS mount fails on boot
Reproduction Steps
Other Information
Logs.
coreos ~ # journalctl -e
Aug 08 18:38:46 coreos.example.com systemd-networkd-wait-online[706]: ignoring: lo
Aug 08 18:38:46 coreos.example.com systemd-networkd-wait-online[706]: ignoring: lo
Aug 08 18:38:46 coreos.example.com systemd-networkd-wait-online[706]: ignoring: lo
Aug 08 18:38:46 coreos.example.com systemd-timesyncd[735]: Network configuration changed, trying to establish connection.
Aug 08 18:38:46 coreos.example.com systemd[1]: rkt-gc.service: Succeeded.
Aug 08 18:38:46 coreos.example.com systemd[1]: Started Garbage Collection for rkt.
Aug 08 18:38:46 coreos.example.com update_engine[763]: I0808 18:38:46.914912 763 main.cc:89] CoreOS Update Engine starting
Aug 08 18:38:46 coreos.example.com systemd[1]: Started Update Engine.
Aug 08 18:38:46 coreos.example.com systemd[1]: Started Cluster reboot manager.
Aug 08 18:38:46 coreos.example.com update_engine[763]: I0808 18:38:46.952653 763 update_check_scheduler.cc:74] Next update chec>
Aug 08 18:38:47 coreos.example.com locksmithd[808]: Reboot strategy is "off" - locksmithd is exiting.
Aug 08 18:38:47 coreos.example.com systemd[1]: locksmithd.service: Succeeded.
Aug 08 18:38:48 coreos.example.com systemd-networkd[620]: enp1s0f0: Gained IPv6LL
Aug 08 18:39:00 coreos.example.com systemd-networkd[620]: enp1s0f0: Configured
Aug 08 18:39:00 coreos.example.com systemd-networkd-wait-online[706]: ignoring: lo
Aug 08 18:39:00 coreos.example.com systemd[1]: Started Wait for Network to be Configured.
Aug 08 18:39:00 coreos.example.com systemd[1]: Reached target Network is Online.
Aug 08 18:39:00 coreos.example.com systemd[1]: home.mount: Directory /home to mount over is not empty, mounting anyway.
Aug 08 18:39:00 coreos.example.com systemd[1]: Mounting /home...
Aug 08 18:39:17 coreos.example.com mount[814]: mount.nfs: Failed to resolve server nfs.example.com: Name or service not known
Aug 08 18:39:17 coreos.example.com systemd[1]: home.mount: Mount process exited, code=exited, status=32/n/a
Aug 08 18:39:17 coreos.example.com systemd[1]: home.mount: Failed with result 'exit-code'.
Aug 08 18:39:17 coreos.example.com systemd[1]: Failed to mount /home.
Aug 08 18:39:17 coreos.example.com systemd[1]: Dependency failed for Remote File Systems.
Aug 08 18:39:17 coreos.example.com systemd[1]: remote-fs.target: Job remote-fs.target/start failed with result 'dependency'.
Aug 08 18:39:17 coreos.example.com systemd[1]: Starting Permit User Sessions...
Aug 08 18:39:17 coreos.example.com systemd[1]: Started Permit User Sessions.
Aug 08 18:39:17 coreos.example.com systemd[1]: Started Serial Getty on ttyS0.
Aug 08 18:39:17 coreos.example.com systemd[1]: Started Getty on tty1.
Aug 08 18:39:17 coreos.example.com systemd[1]: Reached target Login Prompts.
Aug 08 18:39:17 coreos.example.com systemd[1]: Reached target Multi-User System.
Aug 08 18:39:17 coreos.example.com systemd[1]: Startup finished in 4.259s (kernel) + 4.999s (initrd) + 40.602s (userspace) = 49.8>
Aug 08 18:39:18 coreos.example.com systemd-timesyncd[735]: Synchronized to time server for the first time 198.58.105.63:123 (2.co>
Aug 08 18:39:33 coreos.example.com update_engine[763]: I0808 18:39:33.170766 763 update_attempter.cc:493] Updating boot flags...
Aug 08 18:39:42 coreos.example.com systemd[1]: Created slice system-sshd.slice.
Aug 08 18:39:42 coreos.example.com systemd[1]: Started OpenSSH per-connection server daemon (10.64.19.203:58386).
Aug 08 18:39:42 coreos.example.com sshd[836]: Accepted publickey for core from 10.64.19.203 port 58386 ssh2: RSA SHA256:ChUbqqydY>
Aug 08 18:39:42 coreos.example.com sshd[836]: pam_unix(sshd:session): session opened for user core by (uid=0)
Aug 08 18:39:42 coreos.example.com systemd[1]: Created slice User Slice of UID 500.
Aug 08 18:39:42 coreos.example.com systemd[1]: Starting User Runtime Directory /run/user/500...
Aug 08 18:39:42 coreos.example.com systemd-logind[764]: New session 1 of user core.
Aug 08 18:39:42 coreos.example.com systemd[1]: home.mount: Directory /home to mount over is not empty, mounting anyway.
Aug 08 18:39:42 coreos.example.com systemd[1]: Mounting /home...
Aug 08 18:39:42 coreos.example.com systemd[1]: Started User Runtime Directory /run/user/500.
Aug 08 18:39:42 coreos.example.com systemd[1]: Starting User Manager for UID 500...
Aug 08 18:39:42 coreos.example.com systemd[842]: pam_unix(systemd-user:session): session opened for user core by (uid=0)
Aug 08 18:39:42 coreos.example.com systemd[842]: Reached target Sockets.
Aug 08 18:39:42 coreos.example.com systemd[842]: Reached target Paths.
Aug 08 18:39:42 coreos.example.com systemd[842]: Reached target Timers.
Aug 08 18:39:42 coreos.example.com systemd[842]: Reached target Basic System.
Aug 08 18:39:42 coreos.example.com systemd[842]: Reached target Default.
Aug 08 18:39:42 coreos.example.com systemd[842]: Startup finished in 53ms.
Aug 08 18:39:42 coreos.example.com systemd[1]: Started User Manager for UID 500.
Aug 08 18:39:42 coreos.example.com mount[839]: mount.nfs: Failed to resolve server nfs.example.com: Name or service not known
Aug 08 18:39:42 coreos.example.com systemd[1]: home.mount: Mount process exited, code=exited, status=32/n/a
Aug 08 18:39:42 coreos.example.com systemd[1]: home.mount: Failed with result 'exit-code'.
Aug 08 18:39:42 coreos.example.com systemd[1]: Failed to mount /home.
Aug 08 18:39:42 coreos.example.com systemd[1]: Dependency failed for Session 1 of user core.
Tcpdump on DNS server
[root@dns ~]# tcpdump -i eno1 host coreos.example.com and port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:39:17.879855 IP coreos.example.com.43126 > dns.example.com.domain: 50726+ A? nfs.example.com.example.com. (45)
11:39:17.879884 IP coreos.example.com.46795 > dns.example.com.domain: 50834+ A? 1.coreos.pool.ntp.org.example.com. (51)
11:39:17.879915 IP coreos.example.com.46795 > dns.example.com.domain: 32414+ AAAA? 1.coreos.pool.ntp.org.example.com. (51)
11:39:17.880032 IP dns.example.com.domain > coreos.example.com.46795: 50834 NXDomain 0/0/0 (51)
11:39:17.880050 IP dns.example.com.domain > coreos.example.com.46795: 32414 NXDomain 0/0/0 (51)
11:39:17.881057 IP coreos.example.com.35273 > dns.example.com.domain: 37928+ A? 2.coreos.pool.ntp.org. (39)
11:39:17.881088 IP coreos.example.com.35273 > dns.example.com.domain: 51246+ AAAA? 2.coreos.pool.ntp.org. (39)
11:39:17.957679 IP dns.example.com.domain > coreos.example.com.43126: 50726 NXDomain 0/1/0 (102)
11:39:18.052238 IP dns.example.com.domain > coreos.example.com.35273: 37928 4/9/14 A 198.58.105.63, A 45.76.244.193, A 206.55.191.142, A 184.105.182.15 (477)
11:39:18.053921 IP dns.example.com.domain > coreos.example.com.35273: 51246 4/9/13 AAAA 2607:7c80:55:1005::254, AAAA 2600:3c00::f03c:91ff:fe91:b509, AAAA 2600:3c03::f03c:91ff:fe3e:c3bb, AAAA 2a0d:5600:33:b::1 (509)
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel
[root@dns ~]#
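To spot the doubled suffix in a longer capture without reading it line by line, something like the following rough sketch could help. It only understands the default one-line tcpdump text format shown above, and the helper name `doubled_suffix_queries` is made up for this example.

```python
import re

# Matches the question part of a tcpdump DNS query line, e.g.
# "50726+ A? nfs.example.com.example.com. (45)".
QUERY_RE = re.compile(r"\b(A|AAAA)\? (\S+) \(")


def doubled_suffix_queries(lines, domain):
    """Return (qtype, name) for queries whose name ends in domain twice."""
    doubled = f".{domain}.{domain}"
    hits = []
    for line in lines:
        match = QUERY_RE.search(line)
        if not match:
            continue  # responses and tcpdump banner lines carry no "A?" question
        qtype, name = match.group(1), match.group(2).rstrip(".")
        if name.endswith(doubled):
            hits.append((qtype, name))
    return hits
```

Fed the capture above with `domain="example.com"`, it should flag only the nfs.example.com.example.com query; the 1.coreos.pool.ntp.org.example.com lookups carry the suffix once and are not reported.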