nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0
2.73k stars 150 forks

Add support for running Sysbox on Windows (via WSL2) #32

Closed doggy8088 closed 8 months ago

doggy8088 commented 4 years ago

I'm using Docker for Windows with WSL 2 integration. Can I use sysbox? Do you have any guidance about this environment?

rodnymolina commented 4 years ago

@doggy8088 thanks for giving Sysbox a shot.

As you may have noticed from our documentation, at the moment we only support native linux deployments. WSL2 & Ubuntu is on our roadmap but we haven't spent any cycles on it yet. If you decide to give it a try, please let us know how it works for you, that would be really helpful.

felipecrs commented 4 years ago

I'm not able to install sysbox, since the WSL kernel is older than the required version.

felipecrs commented 4 years ago

This might help: https://github.com/microsoft/WSL2-Linux-Kernel/issues/82

bittelc commented 4 years ago

@braedongough plz review

felipecrs commented 3 years ago

News in this one:

Microsoft released the WSL2 Linux 5.4 kernel, but the sysbox installation still doesn't succeed:

❯ uname -r
5.4.72-microsoft-standard-WSL2
❯ sudo apt install ./sysbox_0.2.1-0.ubuntu-focal_amd64.deb
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'sysbox' instead of './sysbox_0.2.1-0.ubuntu-focal_amd64.deb'
sysbox is already the newest version (0.2.1-0.ubuntu-focal).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up sysbox (0.2.1-0.ubuntu-focal) ...
sysctl: cannot stat /proc/sys/kernel/unprivileged_userns_clone: No such file or directory
dpkg: error processing package sysbox (--configure):
 installed sysbox package post-installation script subprocess returned error exit status 255
Errors were encountered while processing:
 sysbox
E: Sub-process /usr/bin/dpkg returned an error code (1)
ctalledo commented 3 years ago

Thanks @felipecrs, good news that WSL2 is now on Linux 5.4. We will take a look to see if we can support WSL2 before the upcoming release (mid-Feb).

sysctl: cannot stat /proc/sys/kernel/unprivileged_userns_clone: No such file or directory

Looks like the installer is failing because it assumes it's on an Ubuntu machine, so it's looking for the /proc/sys/kernel/unprivileged_userns_clone sysctl. But that sysctl is apparently not present in WSL2, so it fails.

rodnymolina commented 3 years ago

Hey @felipecrs, thanks for looking into this one.

You just uncovered an interesting scenario. You are attempting to install Sysbox on Ubuntu, but the kernel underneath is not an Ubuntu kernel but Microsoft's. That breaks a few assumptions we made in our installer. For example, the installer expects Ubuntu distros to have the /proc/sys/kernel/unprivileged_userns_clone file present as proof that the kernel's user-namespace feature is available, which is a Sysbox requirement.

I just verified that Microsoft's kernel does enable the unprivileged user-namespaces feature. However, it uses the approach followed by non-Ubuntu/Debian distros, which relies on this file instead: /proc/sys/user/max_user_namespaces.

The fix for this one will be to add some logic to the Sysbox installer to identify the WSL2 setup and act accordingly to prevent the error you described above.
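A minimal sketch of a check that accepts either file, assuming that 1 in unprivileged_userns_clone and any value above 0 in max_user_namespaces both mean "enabled". This is not the actual Sysbox installer logic; the function name and its optional root argument are hypothetical and exist only so the check can be exercised against a fake directory tree.

```shell
#!/bin/sh
# Sketch only: not the real Sysbox installer code.
# userns_supported [ROOT]: returns 0 if unprivileged user namespaces look enabled.
# ROOT is a hypothetical parameter (default /proc/sys) used for testing.
userns_supported() {
    root="${1:-/proc/sys}"
    clone_knob="$root/kernel/unprivileged_userns_clone"   # Ubuntu/Debian kernels
    max_knob="$root/user/max_user_namespaces"             # other kernels, incl. WSL2

    # If neither file exists, assume the feature is unavailable.
    [ -e "$clone_knob" ] || [ -e "$max_knob" ] || return 1

    # When the Ubuntu/Debian knob exists, it must be set to 1.
    if [ -e "$clone_knob" ] && [ "$(cat "$clone_knob")" != "1" ]; then
        return 1
    fi

    # When the generic knob exists, a limit of 0 disables user namespaces.
    if [ -e "$max_knob" ] && [ "$(cat "$max_knob")" -le 0 ]; then
        return 1
    fi

    return 0
}

if userns_supported; then
    echo "unprivileged user namespaces: available"
else
    echo "unprivileged user namespaces: not available"
fi
```

Checking for the presence of either file, rather than for the distro name, avoids tying the result to a particular Ubuntu kernel.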

felipecrs commented 3 years ago

@rodnymolina sounds very promising! Hopefully, that's the only needed change.

felipecrs commented 3 years ago

OK, more input on this topic:

➜ wsl --list --verbose
  NAME                   STATE           VERSION
* Ubuntu                 Running         2
  docker-desktop-data    Running         2
  docker-desktop         Running         2
rodnymolina commented 3 years ago

@felipecrs, thanks for pointing that out, we will take it into account.

Now, please help me understand the use-case that you have in mind for Sysbox within WSL2 so that we prioritize this accordingly ...

Are you interested in the 'security' angle or you care more about the possibility of running 'docker-in-docker' setups? If it's the former, why is 'security' a concern within a personal development/testing environment? If it's the latter, can't docker-desktop + wsl2 run DIND with the typical 'privileged' container?

Thanks!

felipecrs commented 3 years ago

Oh, it's not for security reasons... I don't care about it in my personal environment.

My reasoning is: to have an environment in which I can test sysbox before deploying it in production. For example, I have an image which I use in a Jenkins to spawn disposable workers to run my builds (and Jenkins spawns them within a Kubernetes cluster).

https://github.com/felipecrs/jenkins-agent-dind

So, let's say I now want to make Jenkins use sysbox to deploy my workers. The first step is to configure my image (perhaps adding systemd, whatever). In order to do so, I need a development environment to edit the Dockerfile, call docker build, and then docker run --runtime=sysbox-runc to test it, which is where my WSL2 comes in.

It's my dev env, which I would like to use for testing sysbox before deploying into production (Kubernetes).


But of course, I could spawn a new Ubuntu virtual machine here and use it as dev env instead. That's just harder and less productive than using WSL2.

rodnymolina commented 3 years ago

Thanks for the explanation @felipecrs, got it.

felipecrs commented 2 years ago

I'm not using Docker Desktop anymore, as they changed their licensing model. Instead, I installed the Docker daemon directly on my Ubuntu 20.04 distro in WSL. So never mind my concerns about running under Docker Desktop (which creates sidecar distributions and so on).

I believe the first step would be to make it work without Docker Desktop, and then later, if possible, additionally support Docker Desktop.

The installation of the latest version of sysbox fails with:

#  https://github.com/microsoft/WSL2-Linux-Kernel
❯ uname -r
5.10.60.1-microsoft-standard-WSL2

❯ sudo apt install ./sysbox-ce_0.4.1-0.ubuntu-focal_amd64.deb -y
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'sysbox-ce' instead of './sysbox-ce_0.4.1-0.ubuntu-focal_amd64.deb'
sysbox-ce is already the newest version (0.4.1-0.ubuntu-focal).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up sysbox-ce (0.4.1-0.ubuntu-focal) ...

Your OS does not include the shiftfs module. Make sure to configure the container manager (e.g., Docker, CRI-O, etc) to use the Linux user-namespace when creating containers with Sysbox. Refer to Sysbox installation documentation for details.

sysctl: cannot stat /proc/sys/kernel/unprivileged_userns_clone: No such file or directory
dpkg: error processing package sysbox-ce (--configure):
 installed sysbox-ce package post-installation script subprocess returned error exit status 255
Errors were encountered while processing:
 sysbox-ce
E: Sub-process /usr/bin/dpkg returned an error code (1)

This is exactly the same error as before, which @rodnymolina already explained.

Perhaps the sysbox installer could be made less dependent on the distro it was built for, so that a single .deb could be shipped that works on both Debian and Ubuntu, and even handles the situation that @rodnymolina described:

I just verified that Microsoft's kernel does enable the unprivileged user-namespaces feature. However, it uses the approach followed by non-Ubuntu/Debian distros, which relies on this file instead: /proc/sys/user/max_user_namespaces.

Regardless, I don't think this is a high-priority issue at all.

adamgit commented 2 years ago

Now, please help me understand the use-case that you have in mind for Sysbox within WSL2 so that we prioritize this accordingly ...

Are you interested in the 'security' angle or you care more about the possibility of running 'docker-in-docker' setups? If it's the former, why is 'security' a concern within a personal development/testing environment? If it's the latter, can't docker-desktop + wsl2 run DIND with the typical 'privileged' container?

privileged container = so many problems, bugs, nightmares (avoiding all this mess is why we use sysbox/nestybox!). docker-desktop + wsl2 = proprietary, confusing, hard to maintain.

sysbox + WSL2 = cross-platform development seamlessly for windows + linux + OSX teams.

The case I often run into is: teams that are using Docker to manage development. They have a set of dockerfiles, setup, etc - and that all needs wrapping in a container. I need to dockerize their docker. Without that wrap ... every developer is hand-maintaining everything in the local dev environment ("do I have the same version of Docker installed as my colleague? No? Oh! Damn! Now he/she can't build any more because I made a change that only works with my local version of Docker!") etc.

sysbox has enabled me to pick those setups up (done it a few times now), wrap them inside a suitably configured sysbox container, republish to the team - and suddenly all 'works for me' errors disappear. Now I can guarantee we're all starting/stopping everything with the same versions, same dependencies, and same OS. It's also reduced the number of "works in development, but not in production" bugs.

I'm still evaluating sysbox for full production usage - running it locally for as much of my personal dev as possible, seeing if it works (so far, bar a few teething problems, it's done great).

...but then today I had to work on Windows, and sysbox wouldn't work :(. So for today's development I've fallen back to hand-maintained configuration and scripts (like coding in the dark ages).

ctalledo commented 2 years ago

Hi @adamgit,

Thank you very much for the useful feedback, and apologies for Sysbox not working on WSL2 yet.

We definitely want to enable Sysbox on WSL2, but have been swamped with other work. Let me sync-up with @rodnymolina to see if we can get this going and hopefully can deliver something in January 2022.

Regarding:

I need to dockerize their docker

Yes, this is one of the reasons we created Sysbox. We asked ourselves: why is it that only micro-service apps are Dockerized? Why aren't entire dev or test environments Dockerized too? The latter is really useful as you've seen (when done easily & securely), and it's the reason why Sysbox exists. Glad you are finding it useful!

adamgit commented 2 years ago

To be clear: I have no complaints! It's disappointing that WSL2 isn't supported yet - and it undermines my core use-case - but it works fantastically well for teams where we can guarantee everyone's already using linux.

But I've been pushing for fully containerized dev/test since early 2000's, so I can wait another 6 months for us to finally get there ;)...

felipecrs commented 2 years ago

Out of curiosity, the WSL_DISTRO_NAME environment variable can be reliably used to detect whether running under a WSL2 environment or not.
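A rough sketch of such a detection, with two assumptions worth flagging: WSL_DISTRO_NAME is typically not passed through sudo (which resets the environment by default), and the variable does not by itself distinguish WSL1 from WSL2. The sketch therefore falls back to the kernel release string, which on the stock WSL2 kernels shown in this thread contains "microsoft-standard-WSL2". The function name and its optional argument are hypothetical.

```shell
#!/bin/sh
# Sketch only. looks_like_wsl [KERNEL_RELEASE]: returns 0 when the environment
# appears to be WSL. KERNEL_RELEASE is a hypothetical testing hook; by default
# the function uses the output of `uname -r`.
looks_like_wsl() {
    release="${1:-$(uname -r)}"

    # Set by WSL for the user's session; may be missing under sudo.
    if [ -n "${WSL_DISTRO_NAME:-}" ]; then
        return 0
    fi

    # Fallback: stock WSL2 kernels report a release such as
    # 5.15.150.1-microsoft-standard-WSL2. Custom kernels may not match.
    case "$release" in
        *-microsoft-standard-WSL2*) return 0 ;;
        *) return 1 ;;
    esac
}

if looks_like_wsl; then
    echo "WSL detected"
else
    echo "WSL not detected"
fi
```

The kernel-release fallback matters for package post-install scripts, which usually run as root without the user's environment.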

rodnymolina commented 2 years ago

Got it, will check it out to see if this simplifies the WSL2 support effort.

rodnymolina commented 2 years ago

I just looked into this to assess the level of effort required to support WSL2. I started by fixing the unprivileged_userns_clone error described above. All good up to that point.

However, then I discovered a few other issues that need resolution:

arukiidou commented 2 years ago

@rodnymolina

These issues are resolved.

How did I do it?

Here is an example of a kernel module build. See also: https://github.com/cilium/cilium/issues/17745#issuecomment-1004299480

CONFIG_CONFIGFS_FS=m

# variables:
KCONFIG_CONFIG=Microsoft/config-wsl
KERNELRELEASE=5.15.57.1-microsoft-standard-WSL2

apt-get update && apt-get install -y build-essential flex bison dwarves libssl-dev libelf-dev python-is-python3 bc
make -j 7
make modules_install -j 7
make install -j 7
depmod
lsmod
modprobe configfs

What is my current state?

wsl.conf

root@myconputer~# cat  /etc/wsl.conf
#[boot]
#command="/usr/libexec/wsl-systemd"

my kernel module

root@myconputer:~# lsmod
Module                  Size  Used by
configfs               45056  1

root@myconputer:/lib/modules# tree
.
└── 5.15.57.1-microsoft-standard-WSL2
    ├── build -> /builds/publics/WSL2-Linux-Kernel
    ├── kernel
    │   └── fs
    │       └── configfs
    │           └── configfs.ko
    ├── modules.alias
    ├── modules.alias.bin
    ├── modules.builtin
    ├── modules.builtin.alias.bin
    ├── modules.builtin.bin
    ├── modules.builtin.modinfo
    ├── modules.dep
    ├── modules.dep.bin
    ├── modules.devname
    ├── modules.order
    ├── modules.softdep
    ├── modules.symbols
    ├── modules.symbols.bin
    └── source -> /builds/publics/WSL2-Linux-Kernel

4 directories, 16 files

sysbox-mgr is enabled

root@myconputer~# systemctl list-units -t service --all | grep sysbox
#  sysbox-fs.service                      loaded    active   running sysbox-fs (part of the Sysbox container runtime)
#  sysbox-mgr.service                     loaded    active   running sysbox-mgr (part of the Sysbox container runtime)
#  sysbox.service                         loaded    active   running Sysbox container runtime
arukiidou commented 2 years ago

see: https://github.com/nestybox/sysbox/issues/439#issuecomment-1233313753

rodnymolina commented 2 years ago

@arukiidou, thanks for taking the time to make Sysbox work within WSL2 and for sharing the outcome of your effort.

The good news is that most of these issues are fixed now, so they will all be part of our upcoming release. In the meantime, you can build Sysbox from source and try WSL2 once again.

These are the WSL2 limitations that I previously alluded to and the actions that we have carried out to mitigate them:

Please let us know if you run into any other issue.

Thanks!

felipecrs commented 2 years ago

BTW Systemd is now available for WSL2:

https://devblogs.microsoft.com/commandline/systemd-support-is-now-available-in-wsl/

audrenbdb commented 2 years ago

configfs kernel module requirement: This is not a must-have requirement anymore

Running sudo ./scr/sysbox (from source installation instruction):

Could not load configfs kernel module. Exiting ...

And I can't proceed with the installation.

rodnymolina commented 2 years ago

@audrenbdb, I forgot to remove the configfs requirement from the scr/sysbox script -- typically we only use this one for testing, but it comes in handy in scenarios like yours where you're building from source, so you're right, this has to be fixed.

I'll submit a fix for this one tomorrow. In the meantime, just remove the configfs line from this file, and try again.

arukiidou commented 1 year ago

What about the iptables module? WSL2 doesn't have any loadable modules; everything is built in.
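Whether a given option is built in or a loadable module can be read from the kernel config, which uses the same `CONFIG_X=y` / `CONFIG_X=m` / `# CONFIG_X is not set` format as Microsoft/config-wsl. The helper below is a hypothetical sketch that classifies one option from a plain-text config file. It assumes you have the config available as text; on kernels that expose /proc/config.gz it can be produced with `zcat /proc/config.gz`.

```shell
#!/bin/sh
# Sketch only. kconfig_state OPTION CONFIG_FILE
# Prints "builtin" for OPTION=y, "module" for OPTION=m, and "unset" otherwise
# (including when the file is missing or the option is commented out).
kconfig_state() {
    option="$1"
    config="$2"
    if grep -q "^${option}=y\$" "$config" 2>/dev/null; then
        echo builtin
    elif grep -q "^${option}=m\$" "$config" 2>/dev/null; then
        echo module
    else
        echo unset
    fi
}

# Example (paths are assumptions):
#   zcat /proc/config.gz > /tmp/kernel.config
#   kconfig_state CONFIG_NETFILTER_XT_MATCH_RECENT /tmp/kernel.config
```

An installer that gets "builtin" back could skip the corresponding modprobe call instead of failing on it.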

ctalledo commented 1 year ago

FYI, we should have this soon (before end-of-year), as we are currently working on adding support for Sysbox in WSL2 for Docker Desktop.

matifali commented 1 year ago

any updates?

ctalledo commented 1 year ago

Hi @matifali, we are actively working on it but WSL brings some limitations that make it a bit hard (see here for example). It won't be ready for the upcoming Sysbox release unfortunately, but rather the one after (~2-3 months).

meicale commented 1 year ago

Hi @matifali, we are actively working on it but WSL brings some limitations that make it a bit hard (see here for example). It won't be ready for the upcoming Sysbox release unfortunately, but rather the one after (~2-3 months).

It will be really cool to have this on WSL2, as I am looking to use it for my personal dev setup.

arukiidou commented 1 year ago

Thanks for releasing the new sysbox-runc.

What did you do?

environment

root@~:/usr/bin# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.4
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.17.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 23.0.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc sysbox-runc
 Default Runtime: sysbox-runc
 Init Binary: docker-init
 containerd version: 2806fc1057397dbaeefbea0e4e17bddfbd388f38
 runc version:
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.21-060121-generic
 Operating System: Ubuntu 22.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.26GiB
arukiidou commented 1 year ago

The custom kernel patch is here: [e107edb1ccea900a0708719508a4e2c15a5cfd3a.patch](https://github.com/nestybox/sysbox/files/11183337/e107edb1ccea900a0708719508a4e2c15a5cfd3a.patch)

From e107edb1ccea900a0708719508a4e2c15a5cfd3a Mon Sep 17 00:00:00 2001
From: Administrator <admin@example.com>
Date: Sat, 8 Apr 2023 22:06:58 +0900
Subject: [PATCH] linux-msft-wsl-6.1.y

---
 .gitlab-ci.yml                 | 17 +++++++++++++++++
 Makefile                       |  2 +-
 Microsoft/config-wsl           | 23 ++++++++++++++++-------
 include/linux/user_namespace.h |  4 ++++
 init/Kconfig                   | 16 ++++++++++++++++
 kernel/fork.c                  | 13 +++++++++++++
 kernel/sysctl.c                | 12 ++++++++++++
 kernel/user_namespace.c        |  7 +++++++
 8 files changed, 86 insertions(+), 8 deletions(-)
 create mode 100644 .gitlab-ci.yml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
new file mode 100644
index 0000000000000..8129cef17001a
--- /dev/null
+++ b/.gitlab-ci.yml
@@ -0,0 +1,17 @@
+#https://github.com/microsoft/WSL2-Linux-Kernel#build-instructions
+#apt install build-essential flex bison dwarves libssl-dev libelf-dev
+# +some libs required
+#/usr/bin/env: 'python3': No such file or directory => [python-is-python3]
+#/bin/sh: 1: bc: not found => [bc]
+# image: ubuntu:22.04
+# before_script:
+#   - apt-get update && apt-get install -y build-essential flex bison dwarves libssl-dev libelf-dev python-is-python3 bc
+
+build:
+  needs: []
+  image: ${CI_REGISTRY_IMAGE}/builder:22.04-ubuntu
+  script:
+    - make -j 7 KCONFIG_CONFIG=Microsoft/config-wsl
+  artifacts:
+    paths:
+      - arch/x86/boot/bzImage*
\ No newline at end of file
diff --git a/Makefile b/Makefile
index 200583e18442a..a0c66b41fcccc 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
 VERSION = 6
 PATCHLEVEL = 1
 SUBLEVEL = 21
-EXTRAVERSION = .1
+EXTRAVERSION = -060121
 NAME = Hurr durr I'ma ninja sloth

 # *DOCUMENTATION*
diff --git a/Microsoft/config-wsl b/Microsoft/config-wsl
index e02f2657978e8..c657fc3533e49 100644
--- a/Microsoft/config-wsl
+++ b/Microsoft/config-wsl
@@ -21,13 +21,27 @@ CONFIG_IRQ_WORK=y
 CONFIG_BUILDTIME_TABLE_SORT=y
 CONFIG_THREAD_INFO_IN_TASK=y

+# adds support for clientIP-based session affinity
+# https://github.com/microsoft/WSL/issues/7124
+CONFIG_NETFILTER_XT_MATCH_RECENT=y
+# Requirements for L7 and FQDN Policies
+# https://docs.cilium.io/en/v1.13/operations/system_requirements/#requirements-for-l7-and-fqdn-policies
+CONFIG_NETFILTER_XT_TARGET_TPROXY=y
+CONFIG_NETFILTER_XT_TARGET_CT=y
+CONFIG_NETFILTER_XT_MATCH_MARK=y
+CONFIG_NETFILTER_XT_MATCH_SOCKET=y
+# Requirements XDP
+# https://cateee.net/lkddb/web-lkddb/XDP_SOCKETS.html
+CONFIG_XDP_SOCKETS=y
+CONFIG_XDP_SOCKETS_DIAG=y
+
 #
 # General setup
 #
 CONFIG_INIT_ENV_ARG_LIMIT=32
 # CONFIG_COMPILE_TEST is not set
 # CONFIG_WERROR is not set
-CONFIG_LOCALVERSION="-microsoft-standard-WSL2"
+CONFIG_LOCALVERSION="-generic"
 # CONFIG_LOCALVERSION_AUTO is not set
 CONFIG_BUILD_SALT=""
 CONFIG_HAVE_KERNEL_GZIP=y
@@ -210,6 +224,7 @@ CONFIG_UTS_NS=y
 CONFIG_TIME_NS=y
 CONFIG_IPC_NS=y
 CONFIG_USER_NS=y
+CONFIG_USER_NS_UNPRIVILEGED=y
 CONFIG_PID_NS=y
 CONFIG_NET_NS=y
 CONFIG_CHECKPOINT_RESTORE=y
@@ -1005,7 +1020,6 @@ CONFIG_XFRM_USER=y
 # CONFIG_XFRM_STATISTICS is not set
 CONFIG_XFRM_ESP=y
 # CONFIG_NET_KEY is not set
-# CONFIG_XDP_SOCKETS is not set
 CONFIG_INET=y
 # CONFIG_IP_MULTICAST is not set
 CONFIG_IP_ADVANCED_ROUTER=y
@@ -1161,7 +1175,6 @@ CONFIG_NETFILTER_XT_SET=y
 CONFIG_NETFILTER_XT_TARGET_CHECKSUM=y
 # CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
 # CONFIG_NETFILTER_XT_TARGET_CONNMARK is not set
-# CONFIG_NETFILTER_XT_TARGET_CT is not set
 # CONFIG_NETFILTER_XT_TARGET_DSCP is not set
 CONFIG_NETFILTER_XT_TARGET_HL=y
 # CONFIG_NETFILTER_XT_TARGET_HMARK is not set
@@ -1177,7 +1190,6 @@ CONFIG_NETFILTER_XT_TARGET_NFLOG=y
 CONFIG_NETFILTER_XT_TARGET_REDIRECT=y
 CONFIG_NETFILTER_XT_TARGET_MASQUERADE=y
 # CONFIG_NETFILTER_XT_TARGET_TEE is not set
-# CONFIG_NETFILTER_XT_TARGET_TPROXY is not set
 # CONFIG_NETFILTER_XT_TARGET_TRACE is not set
 CONFIG_NETFILTER_XT_TARGET_SECMARK=y
 CONFIG_NETFILTER_XT_TARGET_TCPMSS=y
@@ -1212,7 +1224,6 @@ CONFIG_NETFILTER_XT_MATCH_IPVS=y
 # CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
 CONFIG_NETFILTER_XT_MATCH_LIMIT=y
 # CONFIG_NETFILTER_XT_MATCH_MAC is not set
-# CONFIG_NETFILTER_XT_MATCH_MARK is not set
 CONFIG_NETFILTER_XT_MATCH_MULTIPORT=y
 # CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
 # CONFIG_NETFILTER_XT_MATCH_OSF is not set
@@ -1223,9 +1234,7 @@ CONFIG_NETFILTER_XT_MATCH_PHYSDEV=y
 # CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
 # CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
 # CONFIG_NETFILTER_XT_MATCH_REALM is not set
-# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
 # CONFIG_NETFILTER_XT_MATCH_SCTP is not set
-# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
 # CONFIG_NETFILTER_XT_MATCH_STATE is not set
 CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
 # CONFIG_NETFILTER_XT_MATCH_STRING is not set
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 45f09bec02c48..87b20e2ee2744 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -148,6 +148,8 @@ static inline void set_userns_rlimit_max(struct user_namespace *ns,

 #ifdef CONFIG_USER_NS

+extern int unprivileged_userns_clone;
+
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 {
    if (ns)
@@ -181,6 +183,8 @@ extern bool current_in_userns(const struct user_namespace *target_ns);
 struct ns_common *ns_get_owner(struct ns_common *ns);
 #else

+#define unprivileged_userns_clone 0
+
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 {
    return &init_user_ns;
diff --git a/init/Kconfig b/init/Kconfig
index 0c214af99085d..21028b50fc2d1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1251,6 +1251,22 @@ config USER_NS

      If unsure, say N.

+config USER_NS_UNPRIVILEGED
+   bool "Allow unprivileged users to create namespaces"
+   depends on USER_NS
+   default n
+   help
+     When disabled, unprivileged users will not be able to create
+     new namespaces. Allowing users to create their own namespaces
+     has been part of several recent local privilege escalation
+     exploits, so if you need user namespaces but are
+     paranoid^Wsecurity-conscious you want to disable this.
+
+     This setting can be overridden at runtime via the
+     kernel.unprivileged_userns_clone sysctl.
+
+     If unsure, say N.
+
 config PID_NS
    bool "PID Namespaces"
    default y
diff --git a/kernel/fork.c b/kernel/fork.c
index a6d243a50be3e..857cf7c1517fe 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -82,6 +82,9 @@
 #include <linux/perf_event.h>
 #include <linux/posix-timers.h>
 #include <linux/user-return-notifier.h>
+#ifdef CONFIG_USER_NS
+#include <linux/user_namespace.h>
+#endif
 #include <linux/oom.h>
 #include <linux/khugepaged.h>
 #include <linux/signalfd.h>
@@ -2011,6 +2014,10 @@ static __latent_entropy struct task_struct *copy_process(
    if ((clone_flags & (CLONE_NEWUSER|CLONE_FS)) == (CLONE_NEWUSER|CLONE_FS))
        return ERR_PTR(-EINVAL);

+   if ((clone_flags & CLONE_NEWUSER) && !unprivileged_userns_clone)
+       if (!capable(CAP_SYS_ADMIN))
+           return ERR_PTR(-EPERM);
+
    /*
     * Thread groups must share signals as well, and detached threads
     * can only be started up within the thread group.
@@ -3171,6 +3178,12 @@ int ksys_unshare(unsigned long unshare_flags)
    if (unshare_flags & CLONE_NEWNS)
        unshare_flags |= CLONE_FS;

+   if ((unshare_flags & CLONE_NEWUSER) && !unprivileged_userns_clone) {
+       err = -EPERM;
+       if (!capable(CAP_SYS_ADMIN))
+           goto bad_unshare_out;
+   }
+
    err = check_unshare_flags(unshare_flags);
    if (err)
        goto bad_unshare_out;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c6d9dec11b749..9a4514ad481b2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -81,6 +81,9 @@
 #ifdef CONFIG_RT_MUTEXES
 #include <linux/rtmutex.h>
 #endif
+#ifdef CONFIG_USER_NS
+#include <linux/user_namespace.h>
+#endif

 /* shared constants to be used in various sysctls */
 const int sysctl_vals[] = { 0, 1, 2, 3, 4, 100, 200, 1000, 3000, INT_MAX, 65535, -1 };
@@ -1659,6 +1662,15 @@ static struct ctl_table kern_table[] = {
        .mode       = 0644,
        .proc_handler   = proc_dointvec,
    },
+#ifdef CONFIG_USER_NS
+   {
+       .procname   = "unprivileged_userns_clone",
+       .data       = &unprivileged_userns_clone,
+       .maxlen     = sizeof(int),
+       .mode       = 0644,
+       .proc_handler   = proc_dointvec,
+   },
+#endif
 #ifdef CONFIG_PROC_SYSCTL
    {
        .procname   = "tainted",
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 54211dbd516c5..16ca0c1516298 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -22,6 +22,13 @@
 #include <linux/bsearch.h>
 #include <linux/sort.h>

+/* sysctl */
+#ifdef CONFIG_USER_NS_UNPRIVILEGED
+int unprivileged_userns_clone = 1;
+#else
+int unprivileged_userns_clone;
+#endif
+
 static struct kmem_cache *user_ns_cachep __read_mostly;
 static DEFINE_MUTEX(userns_state_mutex);

-- 
GitLab
arukiidou commented 8 months ago

@rodnymolina @ctalledo Sysbox on WSL2 is now available. Please check it out.

ctalledo commented 8 months ago

@rodnymolina @ctalledo Sysbox on WSL2 is now available. Please check it out.

Thanks @arukiidou for the contribution!

felipecrs commented 5 months ago

I tried 0.6.4 on WSL just now, and unfortunately it doesn't seem to work:

❯ wget https://downloads.nestybox.com/sysbox/releases/v0.6.4/sysbox-ce_0.6.4-0.linux_amd64.deb

❯ sudo apt install ./sysbox-ce_0.6.4-0.linux_amd64.deb
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Note, selecting 'sysbox-ce' instead of './sysbox-ce_0.6.4-0.linux_amd64.deb'
The following NEW packages will be installed:
  sysbox-ce
0 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 0 B/11.8 MB of archives.
After this operation, 39.9 MB of additional disk space will be used.
Get:1 /home/felipecrs/sysbox-ce_0.6.4-0.linux_amd64.deb sysbox-ce amd64 0.6.4.linux [11.8 MB]
Selecting previously unselected package sysbox-ce.
(Reading database ... 60326 files and directories currently installed.)
Preparing to unpack .../sysbox-ce_0.6.4-0.linux_amd64.deb ...
Unpacking sysbox-ce (0.6.4.linux) ...
Setting up sysbox-ce (0.6.4.linux) ...
WSL2 detected, enable_unprivileged_userns skipped.
WSL2 detected, check_kernel_headers skipped.
Created symlink /etc/systemd/system/sysbox.service.wants/sysbox-fs.service → /lib/systemd/system/sysbox-fs.service.
Created symlink /etc/systemd/system/sysbox.service.wants/sysbox-mgr.service → /lib/systemd/system/sysbox-mgr.service.
Created symlink /etc/systemd/system/multi-user.target.wants/sysbox.service → /lib/systemd/system/sysbox.service.

❯ cat /etc/docker/daemon.json
{
    "runtimes": {
        "sysbox-runc": {
            "path": "/usr/bin/sysbox-runc"
        }
    },
    "bip": "172.20.0.1/16",
    "default-address-pools": [
        {
            "base": "172.25.0.0/16",
            "size": 24
        }
    ]
}

❯ docker run --rm ubuntu printenv
HOME=/root
HOSTNAME=74e8bfb9f468
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

❯ docker run --rm --runtime=sysbox-runc ubuntu printenv
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: container_linux.go:439: starting container process caused: init_linux.go:663: loading seccomp notification rules caused: error loading seccomp filter into kernel: error loading seccomp filter: device or resource busy: unknown.
arukiidou commented 5 months ago

@felipecrs can you run this? I want to see whether it fails for every image or only for that one.

# ✅OK
services:
  dind-sysbox:
    image: docker.io/library/docker:24.0.7-alpine3.19
    container_name: dind
    runtime: sysbox-runc
    privileged: false
    tty: true
felipecrs commented 5 months ago
❯ docker run --rm --runtime=sysbox-runc --tty docker:24.0.7-alpine3.19 true
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: container_linux.go:439: starting container process caused: init_linux.go:663: loading seccomp notification rules caused: error loading seccomp filter into kernel: error loading seccomp filter: device or resource busy: unknown.
arukiidou commented 5 months ago

Okay, I think this is due to the difference between my environment and yours.

please run

wsl --version and docker info

And please let me know your distro.
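For the Linux side of that report, a small sketch that prints the kernel release and the distro name. It assumes the distro ships an /etc/os-release file with a PRETTY_NAME entry; the function name and its optional file argument are hypothetical. The `wsl --version` part still has to be run from Windows (or as `wsl.exe --version` inside the distro).

```shell
#!/bin/sh
# Sketch only. distro_name [OS_RELEASE_FILE]: prints the PRETTY_NAME value
# from an os-release style file (default /etc/os-release).
distro_name() {
    file="${1:-/etc/os-release}"
    # PRETTY_NAME is usually quoted, e.g. PRETTY_NAME="Ubuntu 22.04.4 LTS".
    sed -n 's/^PRETTY_NAME=//p' "$file" | tr -d '"'
}

echo "kernel: $(uname -r)"
echo "distro: $(distro_name 2>/dev/null)"
```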

felipecrs commented 5 months ago
❯ wsl.exe --version
WSL version: 2.2.2.0
Kernel version: 5.15.150.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3447

❯ docker info
Client: Docker Engine - Community
 Version:    26.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.26.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 13
 Server Version: 26.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc sysbox-runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.150.1-microsoft-standard-WSL2
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 15.62GiB
 Name: FELIPE-MSI
 ID: fe0d4b33-6e01-4e84-819c-624eacf4eb44
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.25.0.0/16, Size: 24
felipecrs commented 5 months ago

PS: I'm not running Docker Desktop. I'm running the normal Docker CE within Ubuntu 22.04 with Systemd enabled.

arukiidou commented 5 months ago

Definitely seems to work with a clean install.

sudo docker run --rm --runtime=sysbox-runc --tty docker:26.0.1-alpine3.19 'whoami'
Unable to find image 'docker:26.0.1-alpine3.19' locally
26.0.1-alpine3.19: Pulling from library/docker
4abcf2066143: Pull complete
c4e2d7dc11fc: Pull complete
4f4fb700ef54: Pull complete
6f626593b7fc: Pull complete
35989916a4c8: Pull complete
91720930f6b4: Pull complete
4cb5f5ca57cf: Pull complete
ffceb7a3bb44: Pull complete
d958f8a5b7f6: Pull complete
b87df04eca41: Pull complete
23a939a5ff63: Pull complete
2e0dc7f61ba8: Pull complete
2fefa731ee4e: Pull complete
6e69fab3d884: Pull complete
ae9b60e39af4: Pull complete
a5d3abca5eb3: Pull complete
Digest: sha256:a2d55c6061a342e42db62654b7b7cdf16113828a80b3827cbd9453806c08549c
Status: Downloaded newer image for docker:26.0.1-alpine3.19
root

sysbox@MYPC:~$

arukiidou commented 5 months ago
wsl --version
WSL version: 2.1.5.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3447
sudo docker info
Client: Docker Engine - Community
 Version:    26.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.26.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 26.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc sysbox-runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.146.1-microsoft-standard-WSL2
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.26GiB
 Name: MYPC
 ID: bcf6e411-a60b-48fc-bbe3-8f9220a5e0ff
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

@felipecrs

felipecrs commented 5 months ago

You can also test it on another distro with a clean install?

Will try.

felipecrs commented 5 months ago

please don't use a pre-release version of WSL.

You can also try after wsl.exe --update --pre-release.

felipecrs commented 5 months ago

You can also test it on another distro with a clean install?

Exact same issue. I just spun up a new Ubuntu 22.04 distro:

$ wsl.exe --install Ubuntu-22.04

$ printf '%s\n' '[boot]' 'systemd = true' | sudo tee /etc/wsl.conf

$ wsl.exe --shutdown

$ sh -c "$(curl -fsSL get.docker.com)"

$ sudo usermod -aG docker $USER

$ wget https://downloads.nestybox.com/sysbox/releases/v0.6.4/sysbox-ce_0.6.4-0.linux_amd64.deb

$ sudo apt install ./sysbox-ce_0.6.4-0.linux_amd64.deb

$ wsl.exe --shutdown

$ docker run --rm --runtime=sysbox-runc ubuntu printenv
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: container_linux.go:439: starting container process caused: init_linux.go:663: loading seccomp notification rules caused: error loading seccomp filter into kernel: error loading seccomp filter: device or resource busy: unknown.
ctalledo commented 5 months ago

Hi @felipecrs,

I just tried sysbox-ce v0.6.4 on my WSL Ubuntu 22.04 distro and it worked perfectly:

$ wget https://downloads.nestybox.com/sysbox/releases/v0.6.4/sysbox-ce_0.6.4-0.linux_amd64.deb

$ sudo apt install ./sysbox-ce_0.6.4-0.linux_amd64.deb

$ docker run --runtime=sysbox-runc -it --rm nestybox/ubuntu-jammy-systemd-docker
Welcome to Ubuntu 22.04.1 LTS!

[  OK  ] Created slice Slice /system/getty.
[  OK  ] Created slice Slice /system/modprobe.
...
admin@54fb90f06e63:~$

I wonder why you are getting the error loading seccomp filter into kernel. Make sure you have a recent kernel (5.12+).
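The kernel requirement above (5.12+) can be checked with a short script. This is only a sketch: `kernel_at_least` is a hypothetical helper, not part of Sysbox or WSL, and it compares just the major and minor numbers reported by `uname -r`.

```shell
# Hypothetical helper: succeeds if kernel release $1 is at least version $2.$3.
kernel_at_least() {
  release=$1
  want_major=$2
  want_minor=$3
  major=${release%%.*}      # "5.15.146.1-microsoft-standard-WSL2" -> "5"
  rest=${release#*.}        # -> "15.146.1-microsoft-standard-WSL2"
  minor=${rest%%[!0-9]*}    # -> "15"
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Compare the running kernel against the 5.12 minimum mentioned above.
if kernel_at_least "$(uname -r)" 5 12; then
  echo "kernel $(uname -r) is 5.12 or newer"
else
  echo "kernel $(uname -r) is older than 5.12"
fi
```

Passing this check does not mean Sysbox will work; it only rules out an old kernel as the cause.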

Question: Does Docker work without Sysbox?

Here's some more info on my WSL setup:

C:\Users\Cesar> wsl --version
WSL version: 2.1.5.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22635.2921

C:\Users\Cesar> cat C:\Users\Cesar\.wslconfig
[wsl2]
# races with Desktop's port forwarding
localhostForwarding=false
[experimental]
autoMemoryReclaim=gradual

And inside the Ubuntu-22.04 WSL distro:

$ uname -a
Linux xps15 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy

$ cat /etc/wsl.conf
[boot]
systemd=true
[interop]
appendWindowsPath=false

And I am using docker engine v26.0.1 inside the Ubuntu-22.04 distro.

felipecrs commented 5 months ago

@ctalledo thanks a lot. Yes, Docker works normally. I will try to isolate the problem; it could be due to the experimental features I enabled in WSL, like auto memory reclaim (.wslconfig).

ctalledo commented 5 months ago

@ctalledo thanks a lot. Yes, Docker works normally. I will try to isolate the problem; it could be due to the experimental features I enabled in WSL, like auto memory reclaim (.wslconfig).

It's definitely not autoMemoryReclaim, since I have that too (see my previous comment).

What's your WSL distro's kernel version (uname -a)?

felipecrs commented 5 months ago

Sorry, not home now to run the command, but my previous wsl --version said 5.15.150.1-2.

felipecrs commented 5 months ago

I was able to isolate the problem. Removing networkingMode=mirrored from my %USERPROFILE%\.wslconfig makes the problem go away.

@ctalledo, since you reported https://github.com/microsoft/WSL/issues/9548, I wonder whether it's the same problem, or whether it's worth letting them know about this quirk, which happens only with networkingMode=mirrored.
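For anyone hitting the same seccomp error, a quick way to see whether a `.wslconfig` enables mirrored networking is sketched below. The helper name `uses_mirrored_networking`, the `WSLCONFIG` variable, and the default `/mnt/c/Users/...` path are assumptions for illustration; point it at wherever your `%USERPROFILE%\.wslconfig` is mounted inside the distro.

```shell
# Hypothetical helper: succeeds if the given .wslconfig sets networkingMode=mirrored.
# Commented-out lines are ignored; the match is case-insensitive and tolerates
# surrounding whitespace and Windows (CRLF) line endings.
uses_mirrored_networking() {
  grep -Eiq '^[[:space:]]*networkingMode[[:space:]]*=[[:space:]]*mirrored[[:space:]]*$' "$1"
}

# Assumed default location; the Windows user name may differ from $USER.
cfg=${WSLCONFIG:-/mnt/c/Users/${USER:-}/.wslconfig}

if [ -f "$cfg" ] && uses_mirrored_networking "$cfg"; then
  echo "networkingMode=mirrored is set in $cfg"
else
  echo "networkingMode=mirrored not found in $cfg"
fi
```

If the setting is present, removing it and restarting WSL with `wsl.exe --shutdown` (as done after the other config changes in this thread) is the workaround described above.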