Closed craigloewen-msft closed 9 months ago
@patricklangsonos https://github.com/pmartincic/WSL2-Linux-Kernel-1006/tree/bug Or I can attach my built vmlinux if that's easier.
that works - building now. Thanks!
@pmartincic hmm... so I'm not too sure what to make of this yet.
I installed the patched kernel, checked that it was logging, then put the laptop to sleep around the 55 second mark:
[ 40.426885] hv_utils: TimeSync flags 2
[ 45.427707] hv_utils: TimeSync flags 2
[ 49.235805] hv_balloon: Max. dynamic memory size: 15782 MB
[ 49.381759] nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based firewall rule not found. Use the iptables CT target to attach helpers instead.
[ 50.444825] hv_utils: TimeSync flags 2
[ 55.454717] hv_utils: TimeSync flags 2
[ 60.467737] hv_utils: TimeSync flags 2
According to powercfg /systempowerreport
it was asleep at 2023-10-10 12:49:52
for 0:07:25
dmesg doesn't show any gap in time or changes to flags:
[ 49.235805] hv_balloon: Max. dynamic memory size: 15782 MB
[ 49.381759] nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based firewall rule not found. Use the iptables CT target to attach helpers instead.
[ 50.444825] hv_utils: TimeSync flags 2
[ 55.454717] hv_utils: TimeSync flags 2
[ 60.467737] hv_utils: TimeSync flags 2
[ 65.479252] hv_utils: TimeSync flags 2
[ 70.481898] hv_utils: TimeSync flags 2
[ 74.819696] hv_utils: TimeSync flags 2
[ 79.841598] hv_utils: TimeSync flags 2
[ 84.847034] hv_utils: TimeSync flags 2
[ 89.856718] hv_utils: TimeSync flags 2
[ 94.861737] hv_utils: TimeSync flags 2
[ 99.864990] hv_utils: TimeSync flags 2
[ 104.873360] hv_utils: TimeSync flags 2
[ 109.884524] hv_utils: TimeSync flags 2
[ 114.899976] hv_utils: TimeSync flags 2
[ 119.903046] hv_utils: TimeSync flags 2
[ 124.918444] hv_utils: TimeSync flags 2
[ 129.926908] hv_utils: TimeSync flags 2
[ 134.929134] hv_utils: TimeSync flags 2
[ 139.934084] hv_utils: TimeSync flags 2
[ 144.934618] hv_utils: TimeSync flags 2
[ 149.949686] hv_utils: TimeSync flags 2
[ 154.961049] hv_utils: TimeSync flags 2
running ntpdate manually after that corrected the time offset of +417.937004 sec
@patricklangsonos, Sweet/Unsweet! Much appreciated. Thanks for confirming. That looks like the buggy behavior I expect to see. Ideally when this is fixed, you'd see a line hv_utils: TimeSync flags 1
@patricklangsonos, Sweet/Unsweet! Much appreciated. Thanks for confirming. That looks like the buggy behavior I expect to see. Ideally when this is fixed, you'd see a line
hv_utils: TimeSync flags 1
Is 3 ( ICTIMESYNCFLAG_SYNC & ICTIMESYNCFLAG_SAMPLE
) valid, or should that be two separate messages (1 followed by 2) on vmbus?
As written I don't think the TimeSync IC is capable of sending that message. I'd be surprised if you see it? Academically: I think 3 would effectively be treated as 1.
On a working system I'd expect to see 2,2,2,2,1,2,2,2
My other laptop (Lenovo X13 Yoga, Intel CPU) drifted this time too. I don't recall if I checked it before or not. After 55 mins of sleep, that only logged 2's. No 1's were logged since near startup.
# dmesg tail
[ 842.481543] hv_utils: TimeSync flags 2
[ 847.491300] hv_utils: TimeSync flags 2
[ 852.503636] hv_utils: TimeSync flags 2
[ 857.506893] hv_utils: TimeSync flags 2
[ 862.516027] hv_utils: TimeSync flags 2
[ 867.529813] hv_utils: TimeSync flags 2
$ dmesg | grep hv_utils | grep -v 'flags 2'
[ 0.325884] hv_utils: Registering HyperV Utility Driver
[ 0.326142] hv_vmbus: registering driver hv_utils
[ 0.326671] hv_utils: TimeSync IC version 4.0
[ 0.327797] hv_utils: TimeSync flags 1
$ sudo ntpdate time.windows.com
10 Oct 16:00:08 ntpdate[248]: step time server 168.61.215.74 offset 3288.309764 sec
@pmartincic Okay... I see where you're going. The system is always sending SAMPLE and because it never performs any kind of restore/resume event beyond startup, it never sends a SYNC command, so do_settimeofday64 is never called.
In my kernel I commented the //if (adj_flags & ICTIMESYNCFLAG_SYNC)
statement in hv_util.c entirely and also momentarily disabled the hv_get_adj_host_time() threshold such that adjust is always called and the clock now updates perfectly on resume.
Would it be horrific to ignore that line for WSL (this IS just for WSL, anyway) such that the time is always set? Note that if this were true, the time would still only be adjusted at all if the threshold in hv_get_adj_host_time() were exceeded, which means the clock isn't ALWAYS being adjusted... but it also means that there would still be some allowed drift in the clock before the reset would occur, which should probably be communicated or possibly put into a user's conf file so they can increase precision vs. performance.
The best answer would still be to gain access to the s2idle event... but how to do that without having the rest of the kernel react to it would be interesting.
EDIT: Apparently systemd-resolved doesn't like time adjustments too frequently.
I reverted back to S3 by switching the Lenovo T14 AMD to "Linux" in BIOS power management settings. It seems S3 works as expected with WSL. Whether or not Windows drivers and Lenovo peripheral works is still TBD. The dock is flaky for sure now.
#dmesg
[ 156.593468] hv_utils: TimeSync flags 1
[ 161.601413] hv_utils: TimeSync flags 2
[ 166.617042] hv_utils: TimeSync flags 2
$ sudo ntpdate time.windows.com
10 Oct 18:18:52 ntpdate[1178]: adjust time server 168.61.215.74 offset -0.410780 sec
Checking journal:
Oct 10 18:13:58 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:14:04 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:14:04 LENOVO-T14 kernel: hv_utils: TimeSync IC version 4.0
Oct 10 18:14:04 LENOVO-T14 kernel: hv_utils: TimeSync flags 1
Oct 10 18:16:37 LENOVO-T14 systemd-resolved[211]: Clock change detected. Flushing caches.
Oct 10 18:16:42 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:16:47 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:16:52 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:16:57 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:17:02 LENOVO-T14 kernel: hv_utils: TimeSync flags 2
Oct 10 18:17:05 LENOVO-T14 chronyd[349]: Forward time jump detected!
I keep reading... Apparently if Windows detects your machine is even capable of modern standby, it won't allow S3 (per ACPI 5.0 spec)...
Windows supports platforms that have low-power S0-idle capability regardless of whether they implement hardware-reduced ACPI or full ACPI. However, as required by the ACPI 5.0 specification, Windows does not use traditional sleep/resume features on platforms that have low-power S0-idle capability, regardless of ACPI configuration.
I'm guessing for modern Windows hosts (and that really is what we're talking about), S3 is blocked from even being an option.
Yeah, I'm not surprised that it was deeply hidden in the BIOS. I'm going to run with S3 for a while to see if its acceptable mainly because I miss hybrid sleep. Nothing like travelling or commuting and opening a laptop that died in S0 without saving your work
Nothing like travelling or commuting and opening a laptop that died in S0 without saving your work
I hear the dead battery part, but it should hibernate if it goes low battery in S0 idle...
Modern standby PCs supporting Win32 apps must also support Hibernate. | Hibernate is used to save the state of desktop/Win32 applications when critical-low battery capacity is reached.
...not to say that every vendor does everything perfectly, but...
My other laptop (Lenovo X13 Yoga, Intel CPU) drifted this time too. I don't recall if I checked it before or not. After 55 mins of sleep, that only logged 2's. No 1's were logged since near startup.
@PatrickLang
What are the available power states on that device? powercfg /a
I assume this was S0.*
Yup, S3 worked in all my testing. Just the most recent power states are problematic.
Acknowledging that this shouldn't be used as a discussion forum, but a) this is the best display of open source collaboration I've seen in years and b) something is seriously wrong from an architectural perspective with power management if the predilections of the host box's BIOS are leaking through to a VM hosted by an OS running on said box.
Note that if this were true, the time would still only be adjusted at all if the threshold in hv_get_adj_host_time() were exceeded, which means the clock isn't ALWAYS being adjusted... but it also means that there would still be some allowed drift in the clock before the reset would occur, which should probably be communicated or possibly put into a user's conf file so they can increase precision vs. performance.
@cmullendore, I like this more than the other workarounds I've seen. Was thinking of something similar. Hard reset of time if differential exceeds some threshold. It's the closest to the desired/correct behavior and can be done in a minimally invasive fashion. And the wrong clock would be present for 5 seconds or less on resume.
We're going to have to ship a workaround in any case just due to the time it will take a fix to propagate. And workarounds are what I'm currently mulling over while other internal discussions of how to notify hyper-v about modern standby continue in the background.
@dboreham, regarding b)
, this one is a little more complicated behind the curtain, than how it appears in front of the curtain.
@pmartincic It also gets you out of the waiting game as the various linux kernel drivers slowly (or even if they) adopt modern standby aware implementations.
@pmartincic
I'm currently mulling over while other internal discussions of how to notify hyper-v about modern standby
I was considering thoughts like that too. There is one thing that is always initiated by the host OS and can be used to not only verify the life/lifetime of the guest to the host, but can also deliver an insight to the guest as well: Heartbeat.
If the heartbeat included a timestamp of the true time from the host and the heartbeat timestamp could be a shared resource, you could trigger whatever "oh we must have just come back" processes that would be valuable too.
Finally, thanks for tolerating as I dig through my first OS kernel in a language I don't know (I'm a C# person). Much appreciation. :)
My other laptop (Lenovo X13 Yoga, Intel CPU) drifted this time too. I don't recall if I checked it before or not. After 55 mins of sleep, that only logged 2's. No 1's were logged since near startup.
@PatrickLang What are the available power states on that device?
powercfg /a
I assume this was S0.*Yup, S3 worked in all my testing. Just the most recent power states are problematic.
Yes, my Lenovo X13 Yoga (Intel CPU) is using S0.
I think I mixed up cause & effect investigating this for a few reasons:
@pmartincic - got a specific related question here. Would this missing time sync message on S0 resume affect both VMMS-managed VMs, and HCS-managed VMs (WSL2, WSA)?
If it affects both, I might lean towards a user-mode solution such as a modification to systemd-timesyncd or chronyd configurations because that might be easier to deploy than suggesting all distros that might run under Hyper-V (managed by VMMS) to take a kernel patch. Those could be shipped as an optional package or included in one such as Ubuntu's cloud-tools which already wraps up some of the Hyper-V integration services today.
@PatrickLang, I think it would be true for both. Haven't tested it. Some management stacks have accidentally avoided this issue by handling modern suspend on their own, for other reasons.
🦉
Well I'm happy to report that I've gone 2 days, multiple sleeps on both AC & DC power, and no clock drift once I switched over to using S3 sleep. I don't think that is the right solution for everyone, but it confirms that my problem was related to S0.
@patricklangsonos Thats not actually surprising. S3 triggers the ACPI resume functions that adjust the clock on resume. Most drivers seem to be okay with S3, likely because it’s been a common sleep mode for much longer than modern standby.
True - most drivers support S3 but I'm not 100% sure if it's propagated to VMs or not. I know the ACPI implementation was updated for VMMS-managed VMs to reflect battery status to Linux VMs, but am not 100% clear on what S-states are implemented. For a long time (and possibly still today) VMs were paused (instruction counter stops, memory still pinned to VM) in some events or saved (instruction counter stops, memory contents written to disk and freed) instead of allowing the guest VM to handle its own events. I guess I need to go read the latest TLFS :) https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
@patricklangsonos The hypervisor does pass down the S states if the kernel exposes the correct APIs and flags to indicate it has the features to support those states. Otherwise, you’re correct… it simply pauses the VM behind the VM’s back and doesn’t tell the VM that it happened… it has no mechanism to do so.
I believe the reason this is an issue is because Microsoft’s documentation in modern standby is that if a machine supports modern standby, only modern standby and hibernate will be used. S3 won’t even be considered. So you end up with a hypervisor that will ONLY support one S state with a guest that doesn’t support that S state and so HV simply pauses the VM. Hyper-V appears to only pass down the current S state as indicated by the hardware. That means that if the host is going into modern standby (S0 low power), HV is incapable of saying to the guest “oh, but you can’t do S0, so I’ll give you S3 instead”. It doesn’t work that way.
Hence this situation.
As a note, I decided to see if modern standby might be "more" supported in the latest kernel... 6.5... and the first item that prevented me from making it to modern standby was the virtio-fs driver in the fuse-fs. Nope. Still not supported.
This workaround is likely to be long-lived.
More ammo that this may be at least somewhat related to other high CPU/freeze issues in WSL:
https://github.com/microsoft/WSL/issues/6982#issuecomment-1766877469
@cmullendore, does this work on your setup? https://github.com/pmartincic/WSL2-Linux-Kernel-1006/tree/testClockSync
@pmartincic Lemme pull and build and I'll get back to you. I've been playing with moving to the 6.5.0 kernel to see if it does PM any better. TBD. Redirecting to your branch.
@pmartincic Please ignore previous post. ID10T moment... I was filtering your stuff out in the grep. Apologies. Moving forward.
@pmartincic Behaves as expected. I un-commented and added some additional log lines. you can see where the VM paused and the time then resyncing.
Oct 17 21:26:57 Ubuntu kernel: [ 186.785365] hv_utils: TimeSync - Current {1697603217, 52041164}; New {1697603217, 51971100}
Oct 17 21:26:57 Ubuntu kernel: [ 186.785369] hv_utils: TimeSync - hv_implicit_sync result: 0
Oct 17 21:26:57 Ubuntu kernel: [ 186.785370] hv_utils: TimeSync - Current {1697603217, 52046748}; New {1697603217, 51971100}
Oct 17 21:27:00 Ubuntu kernel: [ 190.305418] hv_utils: TimeSync flags 2
Oct 17 21:27:00 Ubuntu kernel: [ 190.305481] hv_utils: TimeSync - Current {1697603220, 572143373}; New {1697603222, 54825800}
Oct 17 21:27:00 Ubuntu kernel: [ 190.305485] hv_utils: TimeSync - hv_implicit_sync result: 0
Oct 17 21:27:00 Ubuntu kernel: [ 190.305487] hv_utils: TimeSync - Current {1697603220, 572149758}; New {1697603222, 54825800}
Oct 17 21:27:04 Ubuntu kernel: [ 194.433540] hv_utils: TimeSync flags 2
Oct 17 21:37:37 Ubuntu kernel: [ 194.433596] hv_utils: TimeSync - Current {1697603224, 700242389}; New {1697603857, 66094700}
Oct 17 21:37:37 Ubuntu kernel: [ 194.433601] hv_utils: TimeSync - force sync
Oct 17 21:37:37 Ubuntu kernel: [ 194.433604] hv_utils: TimeSync - hv_implicit_sync result: 1
Oct 17 21:37:37 Ubuntu kernel: [ 194.433606] hv_utils: TimeSync - Current {1697603224, 700253042}; New {1697603857, 66094700}
Oct 17 21:37:37 Ubuntu kernel: [ 194.433610] hv_utils: TimeSync - force sync
Oct 17 21:37:42 Ubuntu kernel: [ 199.430952] hv_utils: TimeSync flags 2
Oct 17 21:37:42 Ubuntu kernel: [ 199.431002] hv_utils: TimeSync - Current {1697603862, 63537283}; New {1697603862, 63509500}
Oct 17 21:37:42 Ubuntu kernel: [ 199.431006] hv_utils: TimeSync - hv_implicit_sync result: 0
Is this going to play well with 309,12: static int hv_get_adj_host_time(struct timespec64 *ts)
, which has its own threshold as to when it will/won't adjust the clock? Is there a point at which the clock will be so skewed that despite your code, the adj_host_time method will simply refuse, and is this desirable behavior?
Thanks for all of your work. :)
Edit: I moved your changes into my build of a WSL 6.5.0 kernel and it works there too. Just in case you were wondering in the infinite amount of free time you have. 😂
Oct 18 03:29:57 Ubuntu kernel: [ 1.491841] hv_utils: TimeSync IC version 4.0
Oct 18 03:29:59 Ubuntu systemd[1]: Starting Time & Date Service...
Oct 18 03:29:59 Ubuntu systemd[1]: Started Time & Date Service.
Oct 18 03:30:00 Ubuntu systemd[462]: Reached target Timers.
Oct 18 03:32:03 Ubuntu kernel: [ 136.076387] hv_utils: TimeSync - force sync
Oct 18 03:37:35 Ubuntu kernel: [ 141.076502] hv_utils: TimeSync - force sync
^C
chris@Ubuntu:~$ uname -a
Linux Ubuntu 6.5.0WSL2-microsoft-standard-WSL2+ #9 SMP PREEMPT_DYNAMIC Wed Oct 18 02:52:46 PDT 2023 x86_64 x86_64 x86_64 GNU/Linux
chris@Ubuntu:~$
@cmullendore, thanks for the smoke test!
Regarding hv_get_adj_host_time
: Do you see something I don't? My understanding is that would cause the update to fail only if the work item executed more than 10 minutes after it was scheduled, math questionable. I don't think that has to do with how far of a time sync is attempted? It compares hv_read_reference_counter
from two different points in time.
@pmartincic I'm going to fully acknowledge that while I can solidly code, the number of similarly named variables and math and constants makes that very difficult for me to grok... and we're bumping into the limits of my C knowledge.
It depends on what happens on line 343...
if (!hv_get_adj_host_time(&ts))
do_settimeofday64(&ts);
If the drift is beyond the threshold then the return will be -ESTALE (116). I'm not clear if the non-zero means DO execute the do_settimeofday64()
or not. It's a positive non-zero, so that would evaluate to true, right? So then if we're beyond 10 mins, we'll get back from hv_get_adj_host_time()
then !(true) would be false and the time adjustment would never occur...?
Again, this snippet is foiling my head and you're probably right.
+1 on this comment means it's worth keeping. -1 on this comment means I'm clearly wrong in my understanding of that code and I'll delete this comment because misinformation is bad. :)
Also, your code has been keeping my current WSL instance very happy. Only side effect was a sudden need for resolv to flush the dns cache. I'm willing to live with that. I have my vmIdleTimeout set to forever, so this drift is a real issue for me and I'm enjoying it no longer being an issue.
I'm really happy that you're closing in on a solution because my time is now drifting forward for the first time:
I did a windows update yesterday, it just noticed this for the first time today. sudo hwclock -s
not fixing it anymore. Going for a reboot now.
@dhensen, when sudo hwclock -s
doesn't work what do you see as the output from sudo hwclock; date; cmd.exe /C time /T
?
@dhensen Is this primarily so that you can adjust clock drift? ...or do you have a specific use case in which you NEED the guest clock to be different from the host/hardware?
This is the first time I've heard of the guest clock being AHEAD of the host OS... not sure what that could be.
@pmartincic Looking at your code, I don't believe it would fix the guest clock at all if the guest is AHEAD of the host. Is this something that should be accommodated?
Thanks for pointing out that was ahead, I didn't see that.
@cmullendore, I don't think that's a good idea. If/when the hypervisor side of this gets addressed, it may end up doing that on resume. If someone/something puts their clock that far ahead, I'd rather not touch it.
And to clarify, we have chronyd
running in the root namespace, visible from wsl --debug-shell
. That would be handling drift. It is not configured to gracefully handle random discontinuities in time. If you see time going forward, and chronyd
isn't running, please file a bug. This issue isn't about drift, it's about the VM getting left behind on resume.
@pmartincic Totally... and to that point, there may actually be scenarios where people actually want the clock to intentionally be behind too... maybe testing some time sensitive feature or something. On one hand, I can see the response being that such a use case might possibly be outside of the scope of WSL... Use a VM and disable clock sync. On the other, perhaps a flag in wsl.conf or .wslconfig to indicate that the skew fix not actually happen?
Regardless, I think you've got the best fix that can happen at this point in time. I've continued to research and even in 6.5.0 the PCI drivers depend on virtio which are still explicitly not allowing s2idle. I'm guessing that linux on hardware is doing S3 and not s2idle at all and that's how they're getting around the virtio driver issue. Even if the virtio drivers aren't used on metal, the hv guest drivers demand a dependency on PCI which demands virtio. No alternative.
Ship it. :D
@pmartincic Forward time drift happened only once and I was baffled by it. I'm closely watching this thread, and all of a sudden now I had forward drift, happened only once though, I'll keep an eye on it and report it is happens again, because this time it happened on/right after a windows update that required a reboot, might have tripped some edge case.
@cmullendore I have no special use case, I just want my WSL2 time to be in sync with laptop clock.
Thanks for all your efforts.
@pmartincic @cmullendore It looks like you fixed it already? I noticed that this issue did come to my mind for for a whole week at least... No idea why but it is no longer an issue on my machine, what could have fixed this? I'm hoping it was your efforts ;)
The clock issue is still happening. Every time I put my computer to sleep (S0 - Modern Standby), my WSL2 instance clock falls behind.
We haven't fixed it yet. Working on getting a patch in is a little more hassle than to be desired. I will post when we ship with an updated kernel.
Thanks. I knew you were still working on it. I was replying to the prior comment which seemed to imply it was fixed.
We haven't fixed it yet. Working on getting a patch in is a little more hassle than to be desired. I will post when we ship with an updated kernel.
Any update?
In my wsl2, even the output of sudo hwclock -r is wrong, so I can not set my wsl time to hwclock to fix this problem.
In my wsl2, even the output of sudo hwclock -r is wrong, so I can not set my wsl time to hwclock to fix this problem.
Until it's fixed, I've been (and will keep) using
sudo ntpdate time.windows.com
Interestingly, I'm able to immediately notice when my WSL2 guest has drifted (e.g., during sleep) when I execute a Jupyter Notebook cell in VS Code and the runtime counter just below starts above 0.0s. In fact, its starting value is always the the correct amount of drift in minutes--seconds as reported by ntpdate
's output.
In my wsl2, even the output of sudo hwclock -r is wrong, so I can not set my wsl time to hwclock to fix this problem.
Do this from PowerShell
wsl -u root date -s $(date)
Or, just don't put your computer to sleep.
A patch has gone into the maintainer branch. It will be back ported to our kernel.
@habaohaba I'm assuming that you're saying that the host clock doesn't agree with what you see from hwclock
? That would be a different issue, please open a new bug for that.
@pmartincic Please keep us posted on when this is released or prereleased so we can give it a try 🙏
Even slightest drifts seem to mess up everything, e.g. Cloudflare's wrangler
. Worst, you don't know why things stop working b/c of unrelated errors all over the logs. Worst2, you forgot again and again about the time drift and waste your time for imaginary bugs.
@pmartincic Thanks for your support, hope we get here more frequent updates, an ETA and a final solution with a perfect time sync very soon. WSL2 is the reason we code on Windows but this time drift makes it unbearable.
Megathread
Current status: waiting on backport for kernel patch to mitigate issue.
We're creating this megathread to track the clock skew issues in WSL in one place, and will keep this parent comment current with any updates.
Background
Sometimes the WSL clock can become skewed after resume from sleep (specifically S0). See some example related issues for more info: https://github.com/microsoft/WSL/issues/8318 https://github.com/microsoft/WSL/issues/8204 https://github.com/microsoft/WSL/issues/7255
Potential work arounds
Use systemd to force clock sync
See this comment: https://github.com/microsoft/WSL/issues/8204#issuecomment-1338334154
Set the hardware clock via a command
Run
sudo hwclock -s
. More info here.Run
ntpdate
on distro start upEdit /etc/wsl.conf to have this content:
This will force a clock reset on start up of the distro.
Build a private kernel with this patch