microsoft / WSL

[WSL1] [glibc] sleep: cannot read realtime clock: Invalid argument #4898

Closed hferreira23 closed 4 years ago

hferreira23 commented 4 years ago

Windows build number: Microsoft Windows [Version 10.0.18362.592]
WSL version: 1
Linux distro: Arch WSL

Description:

I'm using Arch WSL in WSL 1, and after an update that bumped glibc from 2.30.3 to 2.31.1, the sleep command stopped working, failing with the error: sleep: cannot read realtime clock: Invalid argument

This notably affects Ansible, which runs sleep commands in multiple modules. In this issue on the Arch WSL GitHub (https://github.com/yuk7/ArchWSL/issues/108), some problems with Rust packages were also identified.

Downgrading to the previous version solves the issue. I know that Arch WSL is not an official WSL implementation, but this may well happen when Ubuntu and other distros reach the aforementioned version of glibc.

sirredbeard commented 4 years ago

Guidance on this issue in Ubuntu on WSL1 is here: https://discourse.ubuntu.com/t/ubuntu-20-04-and-wsl-1/15291:

What is the plan?

For WSL 1 users, I recommend you sit tight on Ubuntu 18.04 for now. The patch for issue 4898 will take some time to be backported. Ubuntu 18.04 is an LTS release, short for long-term support, and is supported through 2023, so you will continue to get security patches and backports from Canonical in the meantime.

Workarounds include:

If you need to reset Ubuntu and are on WSL 1, you should switch to the Ubuntu 18.04 image.


I can confirm upgrading glibc from 2.30.3 to 2.31.1 on Arch on WSL breaks sleep.

I used the Arch rootfs from the ArchWSL project: https://github.com/yuk7/ArchWSL/releases/download/20.2.7.0/Arch.zip

Imported it.

Confirmed 2.30.3 installed:

$ pacman -Q --info glibc

In /etc/pacman.conf changed SigLevel= to Never.

Ran updates:

$ pacman -Syu

Confirmed new version:

$ pacman -Q --info glibc

Tried $ sleep 10. Got the same error.

strace attached. sleep10.txt

sirredbeard commented 4 years ago

It looks like sleep used to call:

nanosleep({tv_sec=10, tv_nsec=0}, NULL) = 0

sleep10-glibc2.30-3.txt

Appears that support for nanosleep was added in 14915.

But now calls:

clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=10, tv_nsec=0}, NULL)

sleep10-glibc2.31-1.txt

The line in sleep is here: https://github.com/coreutils/coreutils/blob/master/src/sleep.c#L142

if (xnanosleep (seconds))

Here are the release notes for glibc 2.31; they mention some changes related to clock issues, but nothing jumps out as directly related: https://sourceware.org/ml/libc-announce/2020/msg00001.html

Also this may all technically be a duplicate of https://github.com/microsoft/WSL/issues/2503

therealkenc commented 4 years ago

Also this may all technically be a duplicate of #2503

Most def. Apologies I saw this submission and knew it was dupe #2503 earlier in the afternoon, but got pulled away before I could hit the button. Thanks for going down the rabbit hole.

I'll forward-dupe #2503 as a (probably bad precedent) bump. If this is in glibc it'll hit distros in the store soon enough. Presumably eating CLOCK_REALTIME but providing CLOCK_MONOTONIC semantics in the WSL1 driver would be a pretty small change.

sirredbeard commented 4 years ago

Between glibc 2.30 and 2.31 they added the following to /posix/nanosleep.c:

{
  int ret = __clock_nanosleep (CLOCK_REALTIME, 0, requested_time, remaining);
  if (ret != 0)
    {
      __set_errno (ret);
      return -1;
    }
  return 0;
}
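
As a minimal sketch of what that new code path amounts to, the following calls clock_nanosleep() with CLOCK_REALTIME directly, the way glibc 2.31's nanosleep() now does. The helper name probe_realtime_sleep is hypothetical and only for illustration. Note that clock_nanosleep() reports failure through its return value rather than errno, which is why the glibc wrapper above calls __set_errno (ret). On an affected WSL1 build this would be expected to return EINVAL; on a regular Linux kernel it should return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of glibc or coreutils): performs a relative
 * sleep against CLOCK_REALTIME, the same kind of call glibc 2.31's
 * nanosleep() makes. Returns 0 on success or a positive error number;
 * clock_nanosleep() does not set errno itself. */
int probe_realtime_sleep(long nanoseconds)
{
    assert(nanoseconds >= 0);

    struct timespec request;
    request.tv_sec = nanoseconds / 1000000000L;
    request.tv_nsec = nanoseconds % 1000000000L;

    int ret = clock_nanosleep(CLOCK_REALTIME, 0, &request, NULL);
    if (ret != 0)
        fprintf(stderr, "clock_nanosleep(CLOCK_REALTIME): %s\n", strerror(ret));
    return ret;
}
```

Calling probe_realtime_sleep(10000000L) from a small main() and printing the result is enough to tell the two cases apart.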

therealkenc commented 4 years ago

Right, this.

sirredbeard commented 4 years ago

Here is an interesting related email thread:

POSIX.1 specifies that nanosleep() should measure time against the CLOCK_REALTIME clock.

therealkenc commented 4 years ago

Yeah. Ref.

dbz2k commented 4 years ago

I was wondering: would fixes for issues like these get backported with cumulative updates, or would I have to wait until the next Windows version?

sirredbeard commented 4 years ago

I spoke with one of the senior engineers on our foundations team and they do plan to land glibc 2.31, with the problematic change, in Ubuntu Focal 20.04, our next release in April.

This one would be useful to have backported unless it can be fixed upstream.

dbz2k commented 4 years ago

Sorry for commenting again. I was wondering whether we are going to have to wait until the next Windows version to get it fixed?

SvenGroot commented 4 years ago

As @therealkenc surmised, the issue is the lack of CLOCK_REALTIME support in WSL1's clock_nanosleep. We have a fix working internally, which will make its way to insiders. We will also look into backporting this to older Windows versions, but I can't make any promises or provide a timeline for this, unfortunately.

kailiu42 commented 4 years ago

I believe this is the problem that caused 100% CPU utilization on a single core for the cron process on my system. It used to run well, until a few days ago the cron daemon process suddenly started pinning one CPU core at 100%. I checked my system and glibc was indeed upgraded to 2.31 on Feb 20. I then straced the cron process: it keeps calling clock_nanosleep(CLOCK_REALTIME, ...) indefinitely, which might be what causes the high CPU utilization.

ChrisTX commented 4 years ago

I've gotten 2.31 to work fine on my Arch install by recompiling the package with a patch to __clock_nanosleep replacing CLOCK_REALTIME with CLOCK_MONOTONIC:

diff -ruN glibc-2.31/sysdeps/unix/sysv/linux/clock_nanosleep.c glibc-2.31-b/sysdeps/unix/sysv/linux/clock_nanosleep.c
--- glibc-2.31/sysdeps/unix/sysv/linux/clock_nanosleep.c    2020-02-01 12:52:50.000000000 +0100
+++ glibc-2.31-b/sysdeps/unix/sysv/linux/clock_nanosleep.c  2020-03-05 23:51:05.856886500 +0100
@@ -31,7 +31,8 @@
                           struct __timespec64 *rem)
 {
   int r;
-
+  if (clock_id == CLOCK_REALTIME)
+     clock_id = CLOCK_MONOTONIC;
   if (clock_id == CLOCK_THREAD_CPUTIME_ID)
     return EINVAL;
   if (clock_id == CLOCK_PROCESS_CPUTIME_ID)

If anyone's feeling adventurous, feel free to try this out.
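
For code that cannot rebuild glibc, roughly the same idea can be applied at the call site. This is only a sketch under the assumption that the failure shows up as EINVAL from clock_nanosleep(), as in the error message above; sleep_with_clock_fallback is a hypothetical name, not a glibc function, and it handles relative sleeps only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical wrapper: relative sleep that tries CLOCK_REALTIME first and
 * retries on CLOCK_MONOTONIC only if the first call is rejected with EINVAL.
 * Relative sleeps only (flags == 0); TIMER_ABSTIME is not handled here.
 * A genuinely invalid request (e.g. tv_nsec out of range) still yields
 * EINVAL, because the retry fails the same way. */
int sleep_with_clock_fallback(const struct timespec *request,
                              struct timespec *remaining)
{
    assert(request != NULL);

    int ret = clock_nanosleep(CLOCK_REALTIME, 0, request, remaining);
    if (ret == EINVAL)
        ret = clock_nanosleep(CLOCK_MONOTONIC, 0, request, remaining);
    return ret;
}
```

Unlike the glibc patch, this leaves every other caller of clock_nanosleep() in the process untouched, so it only helps programs you build yourself.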

therealkenc commented 4 years ago

I've gotten 2.31 to work fine on my Arch install by recompiling the package with a patch

Close. You've got to be careful about TIMER_ABSTIME in flags. A straight-up swap of CLOCK_REALTIME for CLOCK_MONOTONIC probably isn't (quite) enough. [This is possibly academic for the quick fix; no one sane uses TIMER_ABSTIME. It might not even matter IRL, it would take some test cases to confirm.]
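
One reason the swap is plausible for plain relative sleeps: POSIX specifies that setting CLOCK_REALTIME with clock_settime() has no effect on a thread blocked in a relative clock_nanosleep(), so only TIMER_ABSTIME sleeps depend on which clock is named. The sketch below (helper name hypothetical) performs a relative CLOCK_REALTIME sleep and measures it on CLOCK_MONOTONIC; on a working kernel the measured time should be at least the requested interval.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: sleeps `millis` milliseconds relative to
 * CLOCK_REALTIME and returns the elapsed time in nanoseconds as measured
 * on CLOCK_MONOTONIC, or -1 if any of the calls fails. */
long long monotonic_ns_across_realtime_sleep(long millis)
{
    assert(millis >= 0);

    struct timespec request = { millis / 1000, (millis % 1000) * 1000000L };
    struct timespec before, after;

    if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
        return -1;
    if (clock_nanosleep(CLOCK_REALTIME, 0, &request, NULL) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
        return -1;

    return (after.tv_sec - before.tv_sec) * 1000000000LL
         + (after.tv_nsec - before.tv_nsec);
}
```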

ChrisTX commented 4 years ago

I've gotten 2.31 to work fine on my Arch install by recompiling the package with a patch

Close. You've got to be careful about TIMER_ABSTIME in flags. A straight-up swap of CLOCK_REALTIME for CLOCK_MONOTONIC probably isn't (quite) enough. [This is possibly academic for the quick fix; no one sane uses TIMER_ABSTIME. It might not even matter IRL, it would take some test cases to confirm.]

That's a valid point, but clock_nanosleep(CLOCK_REALTIME, ...) has always been broken on WSL, and that includes with TIMER_ABSTIME. It's easily fixable by subtracting the current CLOCK_REALTIME value from the absolute timestamp and dropping the TIMER_ABSTIME flag. I've expanded the patch to that end, but I really doubt this has any practical impact or use. I've uploaded the patch as a Gist.
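
The conversion described above can be sketched roughly as follows. This is not the code from the Gist; absolute_to_relative and sleep_until_realtime are hypothetical names, and the sketch assumes normalized timespec values. Unlike a real TIMER_ABSTIME sleep, it will not notice the realtime clock being changed while sleeping, and there is a small window between reading the clock and starting the sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: computes *relative = *deadline - *now, clamped to
 * zero when the deadline has already passed. Assumes both inputs are
 * normalized (0 <= tv_nsec < 1000000000). */
void absolute_to_relative(const struct timespec *deadline,
                          const struct timespec *now,
                          struct timespec *relative)
{
    assert(deadline != NULL && now != NULL && relative != NULL);

    relative->tv_sec = deadline->tv_sec - now->tv_sec;
    relative->tv_nsec = deadline->tv_nsec - now->tv_nsec;
    if (relative->tv_nsec < 0) {
        /* Borrow one second to keep tv_nsec in range. */
        relative->tv_sec -= 1;
        relative->tv_nsec += 1000000000L;
    }
    if (relative->tv_sec < 0) {
        relative->tv_sec = 0;
        relative->tv_nsec = 0;
    }
}

/* Hypothetical emulation of an absolute CLOCK_REALTIME sleep using only a
 * relative CLOCK_MONOTONIC sleep: read the realtime clock, subtract, and
 * drop TIMER_ABSTIME. Returns 0 or a positive error number. */
int sleep_until_realtime(const struct timespec *deadline)
{
    struct timespec now, relative;

    if (clock_gettime(CLOCK_REALTIME, &now) != 0)
        return errno;
    absolute_to_relative(deadline, &now, &relative);
    return clock_nanosleep(CLOCK_MONOTONIC, 0, &relative, NULL);
}
```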

saizai commented 4 years ago

This breaks Ubuntu Focal also.

# uname -a
Linux [redacted] 4.4.0-18362-Microsoft #476-Microsoft Fri Nov 01 16:53:00 PST 2019 x86_64 x86_64 x86_64 GNU/Linux

# wslsys
Release Install Date: 0x5ceea999
Branch: 19h1_release
Build: 18363
Full Build: 18362.1.amd64fre.19h1_release.190318-1202
Uptime: 6d 4h 6m
Linux Release: Ubuntu Focal Fossa (development branch)
Linux Kernel: Linux 4.4.0-18362-Microsoft
Packages Count: 1367

# apt upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 libc-bin : Depends: libc6 (< 2.31) but 2.31-0ubuntu5 is installed
 libc-dev-bin : Depends: libc6 (< 2.31) but 2.31-0ubuntu5 is installed
 libc6-dev : Depends: libc6 (= 2.30-0ubuntu3) but 2.31-0ubuntu5 is installed
 locales : Depends: libc-bin (> 2.31) but 2.30-0ubuntu3 is installed
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).

# apt --fix-broken install
Reading package lists... Done
Building dependency tree
Reading state information... Done
Correcting dependencies... Done
The following additional packages will be installed:
  libc-bin libc-dev-bin libc6-dev libcrypt-dev
Suggested packages:
  glibc-doc
The following NEW packages will be installed:
  libcrypt-dev
The following packages will be upgraded:
  libc-bin libc-dev-bin libc6-dev
3 upgraded, 1 newly installed, 0 to remove and 131 not upgraded.
2 not fully installed or removed.
Need to get 0 B/3331 kB of archives.
After this operation, 44.0 kB disk space will be freed.
Do you want to continue? [Y/n]
Setting up libc6:amd64 (2.31-0ubuntu5) ...
Checking for services that may need to be restarted...
Checking init scripts...
Nothing to restart.
sleep: cannot read realtime clock: Invalid argument
dpkg: error processing package libc6:amd64 (--configure):
 installed libc6:amd64 package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 libc6:amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)

harryqt commented 4 years ago

Ubuntu Focal Fossa (20.04) also broken

After this operation, 8192 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up libc6:amd64 (2.31-0ubuntu6) ...
Checking for services that may need to be restarted...
Checking init scripts...
Nothing to restart.
sleep: cannot read realtime clock: Invalid argument
dpkg: error processing package libc6:amd64 (--configure):
 installed libc6:amd64 package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 libc6:amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)

Conduitry commented 4 years ago

I've run into this as well on Ubuntu 20.04, and I'm currently using

sudo apt-mark hold libc6

to prevent libc from being updated to the version that breaks in WSL 1.

danyer commented 4 years ago

This breaks Ubuntu Focal also.

I "fixed" it somehow on my install. Because the error is in the postinstall script of the libc6 package, I simply edited /var/lib/dpkg/info/libc6\:amd64.postinst and changed "sleep 1" to "echo sleep 1". Then I ran, as advised, 'apt --fix-broken install' and it managed to fix the dependencies. I then removed my change to the postinstall script. "apt update" and "apt full-upgrade" seem to be working fine after that.

Of course this is not a fix, and I think I made it worse: all calls to sleep will fail. But it was fun, and Ubuntu on WSL1 is not my main driver, so I had nothing to lose ;)

kailiu42 commented 4 years ago

Will this be fixed for WSL1, or can we only count on WSL2?

harryqt commented 4 years ago

The latest build of Ubuntu 20.04 seems to be working fine without any issue in WSL1.

SvenGroot commented 4 years ago

@kraml A fix is coming to WSL1, and we're looking into backporting it to currently released Windows versions.

sirredbeard commented 4 years ago

Thank you @SvenGroot.

Sp1l commented 4 years ago

The latest build of Ubuntu 20.04 seems to be working fine without any issue in WSL1.

That's not my experience. On a clean install that might work, but upgrading from 19.10 to 20.04 will not work due to

sudo do-release-upgrade -d
...
sleep: cannot read realtime clock: Invalid argument

and would revert to the old install. I worked around this by replacing /bin/sleep with /bin/echo (I know this is not OK), which allowed the upgrade to succeed. Reinstalling coreutils or reverting that replacement still brings the problem back: sleep 1 returns sleep: cannot read realtime clock: Invalid argument

harryqt commented 4 years ago

That's not my experience. On a clean install that might work,

I actually did a clean install, maybe that is why it worked for me.

rirze commented 4 years ago

While the installation might succeed, the actual library is nonfunctional. Attempting to build certain libraries fails because, as noted above, sleep does not work on WSL1.

I too have done a clean install of 20.04 and while it is fresh on updates, glibc is not functional. I had to do a clean install of 19.10 to get the latest working version. Until the team can patch this, reverting the package is the only way to ensure a fully functional distribution.

VinnieCool commented 4 years ago
VinnieCool commented 4 years ago

sudo mv /bin/sleep /bin/sleep~
sudo touch /bin/sleep
sudo chmod +x /bin/sleep

5p0ng3b0b commented 4 years ago

Symlink to busybox sleep seems to be the best workaround.

mv /bin/sleep /bin/sleep~
mv /usr/bin/sleep /usr/bin/sleep~
ln -s /usr/bin/busybox /bin/sleep
ln -s /usr/bin/busybox /usr/bin/sleep

Continue as usual

rafaeldtinoco commented 4 years ago

WSL1 Ubuntu 20.04 related bug: https://bugs.launchpad.net/bugs/1871129

pagerc commented 4 years ago

For anyone who attempts an upgrade (from the latest LTS, bionic) and is struggling to recover the system after libc gets b0rked (libcrypt.so.1 errors), try this:

curl -kLO http://mirrors.kernel.org/ubuntu/pool/main/g/glibc/libc6_2.27-3ubuntu1_amd64.deb # download latest libc6 for your release
ar x libc6_2.27-3ubuntu1_amd64.deb # unpack the release
tar -xf data.tar.xz -C / # extract the libc6 over your partially upgraded libc6
apt install libc6/bionic # do a reinstall using the actual libc6 for your release
apt-mark hold libc6 # hold the package until WSL fixes this issue
apt update # update the apt database
apt upgrade # upgrade your packages to the latest version (except for held packages)
apt --fix-broken install # repair any broken installs
apt autoremove # cleanup

orlando-oli commented 4 years ago

tar -xf data.tar.xz -C /

as soon as I did that, my WSL broke completely

pagerc commented 4 years ago

tar -xf data.tar.xz -C /

as soon as I did that, my WSL broke completely

Did you try to go in via powershell? Use wsl -u root to get a root bash shell. If that won't run, then you probably want to look at getting busybox-static installed to recover the environment.

rafaeldtinoco commented 4 years ago

For Ubuntu Users.. I have created a PPA:

https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1871129

(just uploaded, might take some time to build the pkg)

with @ChrisTX's patch... but I'm not considering this a final Ubuntu fix for the issue, as it appears to me it should be fixed/mitigated by the WSL layer. The WSL1 syscall interception layer could do the clock type change instead of changing glibc for this corner case.

Nevertheless, anyone facing the issue can add that PPA before upgrading to focal, and clock_nanosleep() calls shouldn't cause issues.

A test case: run "htop" and check if it works =).

rafaeldtinoco commented 4 years ago

rafaeldtinoco@wsl1:~$ sudo apt-mark hold libc6
libc6 set on hold.

rafaeldtinoco@wsl1:~$ dpkg -l libc6
hi  libc6:amd64    2.31-0ubuntu8+lp1871129~1 amd64        GNU C Library: Shared libraries

Not sure how long this bug will take to fix, so make sure to mark libc6 as "hold" so it does not get upgraded. The version I'm using, 2.31-0ubuntu8+lp1871129~1, is the latest as of today, but that will change if another glibc stable release lands in the archive.

I confirm "htop" works well with this mitigation.

o/

moisespr123 commented 4 years ago

Today I installed libc6_2.31-0ubuntu7_amd64.deb and it seems to have been installed with no realtime clock issues.

therealkenc commented 4 years ago

Not sure how much time this bug will take

it seems to had been installed with no realtime clock issues.

image

jamesbroadhead commented 4 years ago

Workaround for Ubuntu 20.04

sudo vi /var/lib/dpkg/info/libc6:amd64.postinst 

- telinit u 2>/dev/null || true ; sleep 1
+ telinit u 2>/dev/null || true ; # sleep 1

sudo dpkg --configure -a

edit: looks like this was suggested by @danyer already - but I missed it when this problem cropped up the second time for me

awson commented 4 years ago

image

Which insider build is this fixed in? Will it land in 2004 release?

iccfish commented 4 years ago

image

Which insider build is this fixed in? Will it land in 2004 release?

I'm running WSL1 on 19608.1006 (latest insider build in the fast ring), and sleep 1 seems to work OK.

harryqt commented 4 years ago

Symlink to busybox sleep seems to be the best workaround.

htop still does not work.

iccfish commented 4 years ago

Symlink to busybox sleep seems to be the best workaround.

htop still does not work.

I'm running WSL1 on 19608.1006 (latest insider build in the fast ring), and htop seems to work OK 😄

Kagami commented 4 years ago

htop works fine for me (stable version) after https://github.com/microsoft/WSL/issues/5125#issuecomment-619350931

mdragosv commented 4 years ago

Confirming. It is also impossible to complete sudo apt install mysql-server; it seems the default is now 8.0 🗡️

arisboch commented 4 years ago

Confirming. It is also impossible to complete sudo apt install mysql-server; it seems the default is now 8.0 🗡️

Did you try using aptitude instead of apt?

mdragosv commented 4 years ago

Confirming. It is also impossible to complete sudo apt install mysql-server; it seems the default is now 8.0 🗡️

Did you try using aptitude instead of apt?

Why would you do that? I ended up installing Ubuntu 18.04 from the Windows Store until the May update is released.

The best workaround was @5p0ng3b0b's fix with the busybox sleep.

arisboch commented 4 years ago

Because aptitude has the feature of offering multiple ways of how to install stuff without breaking dependencies (e.g. holding packages back and stuff).

TurnOffNOD commented 4 years ago

@5p0ng3b0b 's busybox workaround works for me. And theoretically this would be the best workaround.

pwang2 commented 4 years ago

@TurnOffNOD just curious, where does busybox come from? I am in the middle of the breakage and did not see any busybox binaries.

angelog0 commented 4 years ago

I found this workaround (with sudo):

mv /bin/sleep /bin/sleep~  # timestamp about 2018
touch /bin/sleep
chmod +x /bin/sleep

apt update
apt upgrade

but I had to apply it 2 times because after the first time it failed with another package in the same manner. Indeed, the empty /bin/sleep file was replaced by another one (timestamp Sept. 05 2019). Now I have this as the sleep command. After this, everything seems to work (I am on WSL1, W10 Pro 64 10.0.18363.778):

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04 LTS
Release:        20.04
Codename:       focal

Will it really work?...

Suppose I want to reinstall from scratch (I have the list of packages I installed); what should I do? Just uninstall Ubuntu (from Windows Apps settings)? Would that remove the rootfs directory too? Or is something else needed?

Note that about four months ago I moved the Ubuntu installation with wsl --export and related tools. Now it is installed under Users\utente\Ubuntu

Thanks.

iBug commented 4 years ago

I solved it by downloading and manually installing the patched package:

https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1871129/+build/19152555/+files/libc6_2.31-0ubuntu8+lp1871129~1_amd64.deb

Other files from the same repository can be found here: https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1871129/+build/19152555

It's still @rafaeldtinoco 's PPA after all but saves some mess adding APT repositories or so.

wget "https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1871129/+build/19152555/+files/libc6_2.31-0ubuntu8+lp1871129~1_amd64.deb"
sudo dpkg -i libc6_2.31-0ubuntu8+lp1871129~1_amd64.deb

Thanks Rafael and Christian!