Closed by sharnoff 8 months ago
Notes from discussion:
In the case of KVM virtual machines, running ntpd in the guest OS to synchronize and syntonize its clock to the host's may be suboptimal, because the virtio net driver uses simple software skb timestamps. Combined with the non-deterministic nature of hypervisor time stealing, this can increase guest clock offset and frequency jitter.
Using solutions provided by QEMU (i.e. setting the time periodically via qemu-guest-agent, or using its PV RTC clock to periodically set CLOCK_REALTIME from a custom agent) may give less jitter, but software often misbehaves in an environment with frequent realtime clock steps. In my experience, some programs can't handle time "jumps" at all: timed events can be missed because the required time value has been "cut out" of the CLOCK_REALTIME time axis by the jump.
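To make the "cut out" failure mode concrete, here is a minimal sketch in Python. It is a simulation only: the clock readings are made-up values and both function names are hypothetical, not taken from any program discussed here. It contrasts a check that only fires when a poll lands inside a fixed window with one that fires on the first reading at or past the deadline.

```python
def window_match_fires(readings, deadline, window=1.0):
    """Fragile check: fires only if some poll lands in [deadline, deadline + window)."""
    return any(deadline <= t < deadline + window for t in readings)


def threshold_fires(readings, deadline):
    """Step-tolerant check: fires on the first poll at or past the deadline."""
    return any(t >= deadline for t in readings)


# Simulated once-per-second polls of a realtime clock (seconds, made-up values).
smooth = [9.0, 10.0, 11.0, 12.0, 13.0]
# Same polling loop, but the clock was stepped forward between two polls,
# so no reading ever falls near the 12.0 deadline.
stepped = [9.0, 10.0, 15.0, 16.0]

print(window_match_fires(smooth, 12.0))   # True
print(window_match_fires(stepped, 12.0))  # False: the deadline was "cut out"
print(threshold_fires(stepped, 12.0))     # True
```

The threshold check tolerates forward steps but not backward ones; software that only needs intervals is usually better served by CLOCK_MONOTONIC, which is not stepped.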
There are other time synchronization solutions that can be recommended in this use case (in order of increasing precision):
refclock PHC
device configured). As the kvm-ptp driver acquires host time directly via a hypervisor call, this solution will reduce time stealing-related jitter. This solution is recommended by Red Hat Inc. to achieve time synchronization accuracy on the order of ten microseconds.
status: blocked on review
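As a rough way to check what a PHC device reports from inside the guest, the sketch below reads a PTP clock through the Linux dynamic POSIX clock interface and compares it with CLOCK_REALTIME. The device path `/dev/ptp0` and both function names are assumptions for illustration. The comparison is only meaningful if the PHC is on the same timescale as CLOCK_REALTIME, which is also assumed here.

```python
import os
import time


def fd_to_clockid(fd: int) -> int:
    """Mirror the Linux FD_TO_CLOCKID macro: ((~fd) << 3) | 3."""
    return ((~fd) << 3) | 3


def phc_minus_realtime_ns(device: str = "/dev/ptp0") -> int:
    """Return (PHC time - CLOCK_REALTIME) in nanoseconds.

    The device path is an assumption; the PHC may have a different index.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        clk = fd_to_clockid(fd)
        # Bracket the PHC read with two realtime reads and use their midpoint,
        # which cancels part of the read latency.
        before = time.clock_gettime_ns(time.CLOCK_REALTIME)
        phc = time.clock_gettime_ns(clk)
        after = time.clock_gettime_ns(time.CLOCK_REALTIME)
    finally:
        os.close(fd)
    return phc - (before + after) // 2
```

Calling `phc_minus_realtime_ns()` needs read permission on the device and is Linux-only. A single sample is noisy, so taking several readings and keeping the one with the smallest `after - before` gap is a common refinement.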
Environment
Production
Steps to reproduce
Presumably, leave a VM running for a long time — haven't yet validated.
Expected result
The clock in the VM should not significantly drift — at minimum, it shouldn't be off by more than 0.1s, but ideally it'd remain within 1ms. There may be difficulties in practice from cgroup CPU throttling.
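The two bounds above can be written down as a small check, for example to use in a monitoring probe. This is only a sketch: the function name and the three labels are hypothetical, and the thresholds are the 1ms and 0.1s figures from this section.

```python
IDEAL_BOUND_S = 0.001   # "ideally it'd remain within 1ms"
HARD_BOUND_S = 0.1      # "shouldn't be off by more than 0.1s"


def classify_offset(offset_seconds: float) -> str:
    """Classify a measured guest-vs-reference clock offset (sign is ignored)."""
    magnitude = abs(offset_seconds)
    if magnitude <= IDEAL_BOUND_S:
        return "ideal"
    if magnitude <= HARD_BOUND_S:
        return "acceptable"
    return "out of bounds"


print(classify_offset(0.0004))  # ideal
print(classify_offset(-0.05))   # acceptable
print(classify_offset(0.2))     # out of bounds
```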
Actual result
On startup, the clock is roughly synchronized, but we've seen noticeable differences in the past (IIRC, 0.2s), and a user recently reported a more significant difference - ref https://neondb.slack.com/archives/C04DGM6SMTM/p1703259018496809.
Other logs, links