GoogleCloudPlatform / guest-agent

Apache License 2.0

Clock skew Debian 10 #100

Open AdriVanHoudt opened 3 years ago

AdriVanHoudt commented 3 years ago

Hi,

I've noticed some clock skew on Compute Engine instances running Debian 10 (debian-10-buster-v20200805). Running /sbin/hwclock --hctosys -u --noadjfile fixed the issue, but I can't find a conclusive answer as to how the image keeps the clock in sync.

As far as I understand it, the guest agent runs the clock sync on start and on migration, but nothing runs the sync periodically to keep the clock in sync. Is that right? I don't see anything that suggests NTP is configured. (https://cloud.google.com/compute/docs/instances/managing-instances#configure_ntp_for_your_instances talks about setting it up, but I get the feeling that is more for custom images.)

Setting up my own cron job to run the sync seems weird, as I'd expect the image to do this out of the box. Am I missing something? Is there something that can interfere with the sync (like a startup script, apt upgrades, Shielded VM, or...)? This did not seem to be an issue with Debian 9 images.

zmarano commented 3 years ago

Hi! An NTP client is set up on all the public base images for this purpose. On Debian 10, this is chrony. You can find the configuration in /etc/chrony/chrony.conf. The chrony service is enabled by default as well.

systemctl status chrony
● chrony.service - chrony, an NTP client/server
   Loaded: loaded (/lib/systemd/system/chrony.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-02-17 21:49:53 UTC; 4 days ago
     Docs: man:chronyd(8)
           man:chronyc(1)
           man:chrony.conf(5)
  Process: 392 ExecStart=/usr/sbin/chronyd $DAEMON_OPTS (code=exited, status=0/SUCCESS)
  Process: 398 ExecStartPost=/usr/lib/chrony/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
 Main PID: 394 (chronyd)
    Tasks: 2 (limit: 4665)
   Memory: 2.0M
   CGroup: /system.slice/chrony.service
           ├─394 /usr/sbin/chronyd -F -1
           └─395 /usr/sbin/chronyd -F -1
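
A running chronyd does not by itself show how far the clock is off; `chronyc tracking` reports the current offset on its "System time" line. Below is a minimal sketch that pulls that value out. The function name `system_time_offset` is hypothetical, and the sample report uses illustrative values in the format shown in the chrony documentation, not output captured from this instance.

```shell
# Hypothetical helper: extract the "System time" offset from `chronyc tracking`.
# Reads the report on stdin and prints the offset and its direction,
# for example "0.000006523 slow".
system_time_offset() {
  awk '/^System time/ { print $4, $6 }'
}

# Illustrative sample report (assumed values, not from a real instance).
sample='Reference ID    : CB00710F (foo.example.net)
Stratum         : 3
System time     : 0.000006523 seconds slow of NTP time
Leap status     : Normal'

printf '%s\n' "$sample" | system_time_offset

# On a real instance the same filter would be used as:
#   chronyc tracking | system_time_offset
```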

AdriVanHoudt commented 3 years ago

Oh nice, thank you.

I'm seeing

systemctl status chrony
● chrony.service - chrony, an NTP client/server
   Loaded: loaded (/lib/systemd/system/chrony.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:chronyd(8)
           man:chronyc(1)
           man:chrony.conf(5)

And /var/log/chrony is empty. Running systemctl restart chrony seems to have fixed it, but I have no idea why it was inactive in the first place 🤔
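
The check-and-restart step described above can be sketched as a small script. It assumes the unit state comes from `systemctl is-active chrony`, which prints a single word such as `active`, `inactive` or `failed`. The function name `chrony_action` is hypothetical, and this only recovers from the symptom; it does not explain why the service was inactive.

```shell
# Hypothetical helper: map the output of `systemctl is-active chrony`
# to the action this sketch would take.
chrony_action() {
  case "$1" in
    active)          echo "ok" ;;       # service is running, nothing to do
    inactive|failed) echo "restart" ;;  # the state seen in this thread
    *)               echo "unknown" ;;  # activating, deactivating, etc.
  esac
}

# Usage on a real instance (needs systemd, and root for the restart):
#   state=$(systemctl is-active chrony)
#   if [ "$(chrony_action "$state")" = "restart" ]; then
#     sudo systemctl restart chrony
#   fi
```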

AdriVanHoudt commented 3 years ago

I think I found the issue: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=947936 There is a conflict between chrony and systemd-timesyncd. There seems to be a fix on the systemd-timesyncd side, but it requires a newer version of systemd, so it isn't an out-of-the-box solution at the moment.
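
To see which of the two services ended up running after a given boot, the two unit states can be compared. This is a minimal sketch, assuming the symptom is the one seen above (chrony left inactive); the function name `sync_outcome` and its labels are hypothetical and do not come from the bug report.

```shell
# Hypothetical helper: classify the outcome from the two unit states, as
# printed by `systemctl is-active chrony` and
# `systemctl is-active systemd-timesyncd`.
sync_outcome() {
  chrony_state=$1
  timesyncd_state=$2
  if [ "$chrony_state" = "active" ] && [ "$timesyncd_state" != "active" ]; then
    echo "chrony-running"      # expected state on the Debian 10 image
  elif [ "$chrony_state" != "active" ] && [ "$timesyncd_state" = "active" ]; then
    echo "timesyncd-running"   # timesyncd is up and chrony is not
  elif [ "$chrony_state" != "active" ]; then
    echo "nothing-syncing"     # neither service is keeping the clock in sync
  else
    echo "both-active"         # not expected if the units conflict
  fi
}

# Usage on a real instance:
#   sync_outcome "$(systemctl is-active chrony)" \
#                "$(systemctl is-active systemd-timesyncd)"
```

Logging this result on each boot would show how often chrony ends up inactive, since the problem does not appear every time.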

I don't know if this is the right place to report it, but it would be great if the base Debian 10 image got a fix for this. I expect a lot of people just running the image won't notice it right away (it also doesn't happen on every boot). Let me know if I need to report this somewhere else.

zmarano commented 3 years ago

/assign @hopkiw