Closed woutervb closed 2 years ago
I suspect the issue is that your container runs Ubuntu 20.04 and likely expects a cgroup1 layout, while your host system is running impish, which comes with cgroup2. That mismatch results in a fair bit of confusion for snapd and leads to this issue.
Can you try booting your host system with systemd.unified_cgroup_hierarchy=false passed on the kernel command line?
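Before and after changing the kernel command line, it can help to confirm which cgroup layout a system (host or container) actually sees. The following is a minimal sketch, assuming the usual mount point /sys/fs/cgroup and relying on the fact that a cgroup2 mount exposes a cgroup.controllers file at its root. The function name cgroup_layout is made up for illustration.

```python
from pathlib import Path


def cgroup_layout(root: str = "/sys/fs/cgroup") -> str:
    """Return "v2" if `root` looks like a unified cgroup2 mount.

    A cgroup2 filesystem has a `cgroup.controllers` file in its root
    directory. On a cgroup1 or hybrid layout, `root` is a tmpfs holding
    one directory per controller, so that file is absent there.
    """
    if (Path(root) / "cgroup.controllers").is_file():
        return "v2"
    return "v1-or-hybrid"


if __name__ == "__main__":
    # Run this on the host and again inside the container to compare.
    print(cgroup_layout())
```

With systemd.unified_cgroup_hierarchy=false in effect, the host would be expected to report "v1-or-hybrid" rather than "v2".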
Ah, the related forum thread mentions something about network sockets. It'd be good to have the dmesg output for those.
Hi, I cannot give you the dmesg output, but I have a pastebin that might prove useful: https://pastebin.canonical.com/p/pDzRRpwVT6/
Currently back on Focal, as this problem was blocking me.
The other info I can provide is that running the command snap-proxy status inside the container gave the following output:
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
Store ID: not registered
Internal Service Status:
memcached: running
nginx: running
snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
snapdevicegw: not running: [Errno 111] Connection refused
snapdevicegw-local: not running: [Errno 111] Connection refused
snapproxy: not running: [Errno 111] Connection refused
snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check
That does indeed suggest that something cgroup-related is going on.
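To compare status output across host and container setups, the service list can be reduced to the set of services that are not running. This is a minimal sketch that only assumes the "name: state" line format shown in the output above; parse_status and failing are hypothetical helper names, not part of snap-store-proxy.

```python
def parse_status(text: str) -> dict:
    """Map each service name to its reported state string."""
    services = {}
    in_services = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Internal Service Status:":
            # Everything after this header is a "name: state" line.
            in_services = True
            continue
        if in_services and ": " in line:
            # Split only once: the state itself may contain ": ".
            name, state = line.split(": ", 1)
            services[name] = state
    return services


def failing(services: dict) -> list:
    """Return the sorted names of services whose state is not 'running'."""
    return sorted(name for name, state in services.items() if state != "running")
```

Feeding it the container output above would list five failing services, whereas a working but unregistered proxy is expected to list only two.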
In the attached pastebin, there are lines like:
2021-12-02T12:07:49Z snap-store-proxy.snapdevicegw[18446]: 2021-12-02 12:07:49.935Z ERROR gunicorn.error "Can't connect to /var/snap/snap-store-proxy/78/snapdevicegw/snapdevicegw.sock"
That points to socket files that don't work, which, as far as I can tell, is the basic reason things are failing.
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
That is what I suspected would happen when running snaps in a 20.04 container on a 21.10 host. If that's the source of the issue, then there's nothing we can do, as that's a snapd deficiency (which hopefully can be fixed in their 20.04 build).
I've reproduced the issue here and looking at the kernel log, I'm seeing things like:
[ 264.905113] audit: type=1400 audit(1638928972.864:240): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4877 comm="python3" capability=0 capname="chown"
[ 265.419787] audit: type=1400 audit(1638928973.380:241): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4903 comm="python3" capability=0 capname="chown"
[ 267.642834] audit: type=1400 audit(1638928975.600:242): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4924 comm="python3" capability=0 capname="chown"
[ 269.820697] audit: type=1400 audit(1638928977.780:243): apparmor="DENIED" operation="mknod" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" name="/dev/shm/RMtCRJ" pid=4948 comm="python3" requested_mask="c" denied_mask="c" fsuid=1000000 ouid=1000000
[ 269.917270] audit: type=1400 audit(1638928977.876:244): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4948 comm="python3" capability=0 capname="chown"
[ 270.647676] audit: type=1400 audit(1638928978.608:245): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4924 comm="python3" capability=0 capname="chown"
[ 270.919389] audit: type=1400 audit(1638928978.880:246): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4948 comm="python3" capability=0 capname="chown"
[ 271.649725] audit: type=1400 audit(1638928979.612:247): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4924 comm="python3" capability=0 capname="chown"
[ 271.921854] audit: type=1400 audit(1638928979.884:248): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4948 comm="python3" capability=0 capname="chown"
[ 274.386962] audit: type=1400 audit(1638928982.344:249): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4968 comm="python3" capability=0 capname="chown"
[ 276.320810] audit: type=1400 audit(1638928984.280:250): apparmor="DENIED" operation="mknod" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" name="/dev/shm/HYIIf1" pid=4992 comm="python3" requested_mask="c" denied_mask="c" fsuid=1000000 ouid=1000000
[ 276.417149] audit: type=1400 audit(1638928984.376:251): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4992 comm="python3" capability=0 capname="chown"
[ 277.391819] audit: type=1400 audit(1638928985.352:252): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4968 comm="python3" capability=0 capname="chown"
[ 277.418795] audit: type=1400 audit(1638928985.380:253): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4992 comm="python3" capability=0 capname="chown"
[ 278.393627] audit: type=1400 audit(1638928986.356:254): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=4968 comm="python3" capability=0 capname="chown"
[ 278.421197] audit: type=1400 audit(1638928986.380:255): apparmor="DENIED" operation="capable" namespace="root//lxd-test_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=4992 comm="python3" capability=0 capname="chown"
All of those are for snapd-generated AppArmor profiles, and we're indeed seeing a lot of failures in there.
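A log like the one above is easier to read once the denials are grouped. The following is a rough sketch that assumes the key=value audit format shown in these lines and counts DENIED entries per profile, operation, and capability name. The helper names parse_audit and summarize are illustrative only.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is either quoted or a bare token.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit(line: str) -> dict:
    """Extract the key=value fields of one audit line into a dict."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


def summarize(lines) -> Counter:
    """Count AppArmor denials by (profile, operation, capname)."""
    counts = Counter()
    for line in lines:
        fields = parse_audit(line)
        if fields.get("apparmor") != "DENIED":
            continue
        # mknod denials carry no capname, so fall back to an empty string.
        key = (fields.get("profile"), fields.get("operation"), fields.get("capname", ""))
        counts[key] += 1
    return counts
```

Applied to the excerpt above, this would show that the entries fall into just three groups: chown capability denials for the snapproxy and snapdevicegw profiles, and mknod denials under /dev/shm for snapdevicegw.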
Just for completeness, I've also installed snap-store-proxy directly on the 21.10 system. It's looking moderately happier but still won't work:
root@impish:~# snap-store-proxy status
Store ID: not registered
Internal Service Status:
memcached: running
nginx: running
snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
snapdevicegw: running
snapdevicegw-local: running
snapproxy: running
snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check
So as this issue persists with LXD completely removed from the equation, I'd strongly recommend you file a bug against snap-store-proxy and/or snapd to have this looked at and resolved.
It doesn't work there because it is not configured / registered, but you got to the state that is expected. I will open a case with snapd and see what they can do.
Hmm, it was still spewing a lot of DENIED in dmesg even when run outside of a container, so there's something a bit odd going on with that snap. I also suspect that the snapd team hasn't been very actively testing snapd inside of a pre-cgroup2 container on a cgroup2 host, as that's quite a rare setup at this stage.
I left a note in LP https://bugs.launchpad.net/lxd/+bug/1953563/comments/1 but it does not appear to be related to cgroups v2. I tried some smaller snaps, and all behaved correctly. I have launched a couple of configurations (21.10 on 21.10, 20.04 on 21.10, 20.04 on 20.04). Indeed, there appears to be a problem with 21.10 as a host, which is observed with a nested instance of either 21.10 or 20.04. However, disabling AppArmor in LXD makes the problems go away (lxc config set ... raw.lxc 'lxc.apparmor.profile=unconfined'; the container has to be made privileged at this point too). This applies to both setups with 21.10 as the host. I have a hunch that the problem is with AppArmor 3, which is new compared to 20.04 (although it was introduced in 21.04).
@bboozzoo, @stgraber can either of you contact the snapd team directly? The ticket I opened on LP now bounces me back to you.
@woutervb I am on the snapd team. Anyway, I see that @stgraber has already identified a potential problem in VFS idmapping, and the bug has been reassigned to the kernel for further investigation.
This can be closed, as it is a kernel problem.
Required information
Issue description
Installing the snap-store-proxy snap inside the container results in only 2 services running, whereas the expected state is that only 2 services are not running. The container can be either Bionic or Focal.
Steps to reproduce
Information to attach