Open croemmich opened 9 years ago
Well, unlike ntpd, timesyncd will only use one upstream server instead of ntpd's 4. You can try configuring timesyncd to use a local NTP server if you aren't already, or, if using a remote server, pick a single one to use for all your nodes; maybe the issue is simply that the nodes are syncing against different reference times. Beyond that I'm not sure. We do still ship ntpd, so you can switch back too.
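For anyone trying the single-server suggestion above, here is a minimal sketch of generating the setting. It assumes the standard timesyncd.conf layout (a [Time] section with an NTP= key holding a space-separated server list). The server name ntp.example.internal and the helper render_timesyncd_conf are hypothetical, for illustration only.

```python
def render_timesyncd_conf(servers):
    """Build the text of a timesyncd.conf file listing the given NTP servers."""
    if not servers:
        raise ValueError("at least one NTP server is required")
    # timesyncd.conf takes a space-separated server list under [Time].
    return "[Time]\nNTP={}\n".format(" ".join(servers))


# Hypothetical server name; substitute the one server all nodes should share.
print(render_timesyncd_conf(["ntp.example.internal"]))
```

The resulting text would typically go in /etc/systemd/timesyncd.conf on every node, followed by a restart of systemd-timesyncd.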
https://coreos.com/docs/cluster-management/setup/configuring-date-and-timezone/
Thanks @marineam, I'll give your suggestions a try. I have already switched back to ntpd which solved the issues, but if timesyncd is the future of CoreOS, I'd like to get it working. However, it would be ideal if it functioned the same without additional user configuration.
Thanks. One advantage of timesyncd is that it is integrated with networkd, so if DHCP provides a local NTP server it will be used. The issue may simply be a bug in timesyncd; it is relatively new code. If it turns out the issue is due to only using a single remote time server out of the default pool, we may need to revisit the choice of timesyncd.
@marineam I'd prefer to not have to deploy a local ntp server. I tried specifying a single remote server for all of the nodes and they still fell out of sync. Regardless, neither option sits well with me, as my current deployment is completely HA and using a single ntp server breaks that.
Ok, I would recommend switching back to ntpd for now then. I'll dig more to see if timesyncd can be improved or if we need to change the default again. I promise ntpd will remain in the image. :)
https://coreos.com/docs/cluster-management/setup/configuring-date-and-timezone/
Cool, thanks!
You can check whether systemd-timesyncd has synchronized by looking at the "NTP synchronized:" line of the timedatectl output. This assumes you are actually using timesyncd and not some other time sync service that implements the timedate D-Bus interface, such as chrony.
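The check described above could be scripted roughly as follows. This is a sketch, not a definitive tool: the function names are made up for illustration, and the second label ("System clock synchronized") is an assumption covering systemd versions that print that wording instead of the one quoted in this thread.

```python
import subprocess

# Labels are assumptions: the thread mentions "NTP synchronized:", while some
# systemd versions print "System clock synchronized:" instead.
SYNC_LABELS = ("NTP synchronized", "System clock synchronized")


def parse_sync_status(timedatectl_output):
    """Return True/False for the synchronized line, or None if it is absent."""
    for line in timedatectl_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in SYNC_LABELS:
            return value.strip().lower() == "yes"
    return None


def is_clock_synchronized():
    """Run timedatectl and parse its output (requires a systemd host)."""
    result = subprocess.run(
        ["timedatectl"], capture_output=True, text=True, check=True
    )
    return parse_sync_status(result.stdout)
```

Calling is_clock_synchronized() on each node gives a quick yes/no; a None result means neither label was found in the output.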
@croemmich have you seen any improvement in the later versions of CoreOS?
Closing due to inactivity.
If more activity is created, would this be re-opened?
@ramayer If this is still causing problems, we can reopen. What behavior are you seeing?
@bgilbert I have been running ceph on CoreOS stable for over a year now and this problem has never gone away. I just now found this issue in a web search. I can provide whatever debugging would be helpful.
As @croemmich described, systemd-timesyncd does not seem to set clocks correctly.
Like @croemmich I also switched back to ntpd (which made the problem go away) when I found this page with his workaround.
I run Deis on CoreOS and recently made the switch to 681, which swapped ntpd for systemd-timesyncd as the default time sync daemon. Deis uses Ceph to create an HA filesystem within the cluster. Ceph expects clocks within the cluster to be very close; see: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#clock-skews.
Before the switch, Ceph never reported any issues, but after the switch 2/3 of my Ceph monitors were reporting clock skew issues. I verified that systemd-timesyncd was indeed running, but I couldn't find any indication of when/how it was syncing.
Is there a difference in the way systemd-timesyncd works? Does it sync less frequently, is it just less accurate, or is there something I need to configure to get the nodes more in sync?
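To make the clock-skew expectation concrete, here is a rough sketch that flags node pairs whose clocks differ by more than an allowed drift. It is a simplification, not how Ceph itself measures skew. The 0.05 s threshold is an assumption based on Ceph's documented default for "mon clock drift allowed", and the node names and offsets are hypothetical sample values.

```python
from itertools import combinations

# Assumption: 0.05 s mirrors Ceph's documented default for
# "mon clock drift allowed"; check your own cluster's setting.
ALLOWED_DRIFT_SECONDS = 0.05


def skewed_pairs(offsets, allowed=ALLOWED_DRIFT_SECONDS):
    """Return (node_a, node_b, skew) for each pair differing by more than `allowed`."""
    pairs = []
    for (a, off_a), (b, off_b) in combinations(sorted(offsets.items()), 2):
        skew = abs(off_a - off_b)
        if skew > allowed:
            pairs.append((a, b, skew))
    return pairs


# Hypothetical offsets in seconds relative to a common reference clock.
offsets = {"node1": 0.002, "node2": 0.120, "node3": -0.004}
for a, b, skew in skewed_pairs(offsets):
    print(f"{a} vs {b}: {skew:.3f}s apart")
```

With these sample values, node2 is flagged against both other nodes, which resembles the situation where most monitors report skew even though only one clock has drifted.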