canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

"Can't read from stdin" in nested container #7473

Closed: kuangniu closed this issue 4 years ago

kuangniu commented 4 years ago

While trying to set up a nested container, the lxc command refuses to read from stdin.

(Tried on an Ubuntu 20.04 live-server KVM virtual machine)

kuangniu@focal:~$ lxc init ubuntu:20.04 c1
Creating c1
kuangniu@focal:~$ lxc config set c1 security.nesting true
kuangniu@focal:~$ lxc config set c1 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
kuangniu@focal:~$ lxc start c1
kuangniu@focal:~$ lxc exec c1 -- bash
root@c1:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: 
What should the new bridge be called? [default=lxdbr0]: 
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
Would you like LXD to be available over the network? (yes/no) [default=no]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@c1:~# lxc init ubuntu:20.04 c2
Creating c2
root@c1:~# lxc config set c2 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
Error: Can't read from stdin: %s: read /dev/stdin: permission denied
root@c1:~# 

At the error, host machine syslog shows these messages from apparmor:

Jun  1 12:36:52 focal kernel: [ 2896.153022] audit: type=1400 audit(1591015012.043:124): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-c1_<var-snap-lxd-common-lxd>" profile="/usr/lib/snapd/snap-confine" name="/tmp/sh-thd.teGu3H" pid=3900 comm="snap-confine" requested_mask="r" denied_mask="r" fsuid=1000000 ouid=1000000
Jun  1 12:36:52 focal kernel: [ 2896.156658] audit: type=1400 audit(1591015012.047:125): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-c1_<var-snap-lxd-common-lxd>" profile="snap.lxd.lxc" name="/apparmor/.null" pid=3900 comm="snap-exec" requested_mask="wr" denied_mask="wr" fsuid=1000000 ouid=0
Jun  1 12:36:52 focal kernel: [ 2896.170439] audit: type=1400 audit(1591015012.059:126): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-c1_<var-snap-lxd-common-lxd>" profile="/usr/lib/snapd/snap-confine" name="/apparmor/.null" pid=3900 comm="aa-exec" requested_mask="wr" denied_mask="wr" fsuid=1000000 ouid=0
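
(Aside for readers: the fields that matter in these denials are the confining profile, the file, and the denied mask. They can be pulled out of an audit line with a small shell snippet; the sample line below is abridged from the first denial above.)

```shell
# Extract the interesting fields from an AppArmor audit line.
# The sample is abridged from the first denial logged above.
line='apparmor="DENIED" operation="file_inherit" profile="/usr/lib/snapd/snap-confine" name="/tmp/sh-thd.teGu3H" requested_mask="r" denied_mask="r"'
for field in profile name denied_mask; do
  printf '%s=%s\n' "$field" \
    "$(printf '%s\n' "$line" | grep -o "${field}=\"[^\"]*\"" | head -n1 | cut -d'"' -f2)"
done
```

The name="/tmp/sh-thd.teGu3H" here appears to be the shell's temporary file backing the heredoc, which would explain why the denial fires exactly when `lxc ... -` tries to read stdin.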

I have a bunch of scripts that do these things. They are all broken on Focal :(

stgraber commented 4 years ago

In the container, can you try:

script /dev/null -c /bin/bash and then in that session, run your command again?

stgraber commented 4 years ago

The above suggests either a policy issue in what's generated by snapd or an apparmor/kernel issue.

We've seen a number of such cases where AppArmor gets confused by PTS devices originating from outside the container, incorrectly refusing access. If the script trick above fixes it, it would also suggest that SSH-ing into the container would similarly work, and that the issue is unfortunately in AppArmor.
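
(A quick way to see which kind of stdin a session ended up with, which is exactly what the script trick changes, is tty; a minimal sketch, assuming a POSIX shell:)

```shell
# Report whether stdin is a terminal and, if so, which device backs it.
# In a plain `lxc exec c1 -- bash` session this prints "stdin is not a tty";
# after `script /dev/null -c /bin/bash` it should name a container-allocated
# /dev/pts/N instead.
if tty -s; then
  echo "stdin is a tty: $(tty)"
else
  echo "stdin is not a tty"
fi
```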

kuangniu commented 4 years ago

I don't know what to tell you, but neither script nor ssh helped.

kuangniu@focal:~$ lxc exec c1 -- bash
root@c1:~# tty
not a tty
root@c1:~# script /dev/null -c /bin/bash
Script started, file is /dev/null
root@c1:~# tty
/dev/pts/0
root@c1:~# lxc config set c2 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
Error: Can't read from stdin: %s: read /dev/stdin: permission denied
root@c1:~# 

(After setting up SSH login)

kuangniu@focal:~$ ssh ubuntu@10.55.191.179
ubuntu@10.55.191.179's password: 
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-33-generic x86_64)
...(login message)
ubuntu@c1:~$ tty
/dev/pts/0
ubuntu@c1:~$ lxc config set c2 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
Error: Can't read from stdin: %s: read /dev/stdin: permission denied
ubuntu@c1:~$ 

Both generated the same AppArmor messages as above.
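
(One extra data point that could help isolate this: feed the same heredoc to a plain cat, which reads stdin the same way but is not snap-confined. If this prints the document while `lxc config set ... -` still fails, the denial is specific to the snap confinement rather than to the heredoc or the PTY. A sketch:)

```shell
# Sanity check: can this session hand a heredoc to a child process at all?
# cat stands in for lxc here; it simply copies stdin to stdout.
cat <<EOF
#cloud-config
locale: en_US.UTF-8
timezone: US/Pacific
EOF
```

If cat echoes the three lines back, the heredoc plumbing itself is fine.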

stgraber commented 4 years ago

Can you post:

There is a general issue that AppArmor doesn't nest: you only get a full AppArmor namespace for the first-level container; the nested container only gets to share its profile with its parent. This effectively prevents snaps from working in such a nested container.

So we just need to confirm exactly how deep your setup goes because you may have hit that limit :)

The info requested above should confirm the exact setup and the exact depth we're dealing with.
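
(The AppArmor namespace string in the audit lines also encodes this depth directly: root is the host, and each //lxd-NAME_<...> segment is one container level. As a rough check, counting the // separators gives the depth; a sketch using the namespace from the logs above:)

```shell
# Count container nesting depth from an AppArmor policy namespace:
# "root" is the host (depth 0); each "//lxd-..." segment adds one level.
ns='root//lxd-c1_<var-snap-lxd-common-lxd>'
printf '%s\n' "$ns" | awk -F'//' '{print "container depth:", NF - 1}'
# prints: container depth: 1
```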

kuangniu commented 4 years ago

I use Ubuntu almost exclusively. I have used these configurations on 16.04 and 18.04 for a long time without a problem. I was planning to migrate to the new LTS, was testing it, and bumped into this issue. The HOST machine is actually a qemu-kvm virtual guest on an Ubuntu 18.04 host, in case you're wondering.

== Host (focal)

== 1st level container (c1)

== 2nd level container (c2)

== ps fauxww from the host

kuangniu@focal:~$ ps fauxww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           2  0.0  0.0      0     0 ?        S    12:29   0:00 [kthreadd]
root           3  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [rcu_gp]
root           4  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [rcu_par_gp]
root           5  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/0:0-events]
root           6  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/0:0H-kblockd]
root           7  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/0:1-events]
root           8  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/u8:0-events_unbound]
root           9  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [mm_percpu_wq]
root          10  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [ksoftirqd/0]
root          11  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [rcu_sched]
root          12  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [migration/0]
root          13  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [idle_inject/0]
root          14  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [cpuhp/0]
root          15  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [cpuhp/1]
root          16  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [idle_inject/1]
root          17  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [migration/1]
root          18  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [ksoftirqd/1]
root          19  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/1:0-mm_percpu_wq]
root          20  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/1:0H-kblockd]
root          21  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [cpuhp/2]
root          22  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [idle_inject/2]
root          23  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [migration/2]
root          24  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [ksoftirqd/2]
root          25  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:0-events]
root          26  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/2:0H-kblockd]
root          27  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [cpuhp/3]
root          28  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [idle_inject/3]
root          29  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [migration/3]
root          30  0.1  0.0      0     0 ?        S    12:29   0:00  \_ [ksoftirqd/3]
root          31  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/3:0-events]
root          32  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/3:0H-kblockd]
root          33  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [kdevtmpfs]
root          34  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [netns]
root          35  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [rcu_tasks_kthre]
root          36  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [kauditd]
root          37  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [khungtaskd]
root          38  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [oom_reaper]
root          39  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [writeback]
root          40  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [kcompactd0]
root          41  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [ksmd]
root          42  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [khugepaged]
root          44  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/u8:1-events_unbound]
root          48  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/3:1-memcg_kmem_cache]
root         135  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kintegrityd]
root         136  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kblockd]
root         137  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [blkcg_punt_bio]
root         138  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [tpm_dev_wq]
root         139  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [ata_sff]
root         140  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [md]
root         141  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [edac-poller]
root         142  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [devfreq_wq]
root         143  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [watchdogd]
root         144  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:1-events]
root         145  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/1:1-cgroup_destroy]
root         148  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [kswapd0]
root         149  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [ecryptfs-kthrea]
root         152  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kthrotld]
root         153  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [acpi_thermal_pm]
root         154  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [scsi_eh_0]
root         155  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [scsi_tmf_0]
root         156  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [scsi_eh_1]
root         157  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [scsi_tmf_1]
root         158  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/u8:2-events_power_efficient]
root         159  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [vfio-irqfd-clea]
root         160  0.1  0.0      0     0 ?        I    12:29   0:01  \_ [kworker/u8:3-events_power_efficient]
root         161  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [ipv6_addrconf]
root         172  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kstrp]
root         176  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/u9:0]
root         193  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [charger_manager]
root         194  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/0:1H-kblockd]
root         206  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/0:2-events]
root         243  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:2-events]
root         248  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [cryptd]
root         253  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:3-events]
root         255  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/2:1H-kblockd]
root         293  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [ttm_swap]
root         295  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/1:2-events]
root         296  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/1:1H-kblockd]
root         317  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [raid5wq]
root         357  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [jbd2/vda2-8]
root         358  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [ext4-rsv-conver]
root         385  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kworker/3:1H-kblockd]
root         389  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [hwrng]
root         458  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/0:3-mm_percpu_wq]
root         591  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kaluad]
root         592  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kmpath_rdacd]
root         593  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kmpathd]
root         594  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [kmpath_handlerd]
root         604  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [loop0]
root         608  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [loop1]
root         609  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [loop2]
root         610  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [loop3]
root         611  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [loop4]
root         634  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/1:3-events]
root         738  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:4-events]
root         754  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/2:5-events]
root         864  0.0  0.0      0     0 ?        I    12:29   0:00  \_ [kworker/1:4-events]
root        1167  0.0  0.0      0     0 ?        I<   12:29   0:00  \_ [dio/vda2]
root        1182  0.0  0.0   2488   588 ?        S    12:29   0:00  \_ bpfilter_umh
root        1209  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [spl_system_task]
root        1210  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [spl_delay_taskq]
root        1211  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [spl_dynamic_tas]
root        1212  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [spl_kmem_cache]
root        1213  0.0  0.0      0     0 ?        S<   12:29   0:00  \_ [zvol]
root        1214  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [arc_prune]
root        1215  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [zthr_procedure]
root        1216  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [zthr_procedure]
root        1217  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [dbu_evict]
root        1218  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [dbuf_evict]
root        1220  0.0  0.0      0     0 ?        SN   12:29   0:00  \_ [z_vdev_file]
root        1221  0.0  0.0      0     0 ?        S    12:29   0:00  \_ [l2arc_feed]
root        1532  0.0  0.0      0     0 ?        I    12:30   0:00  \_ [kworker/3:4-events]
root        2715  0.0  0.0      0     0 ?        I    12:33   0:00  \_ [kworker/1:5-events]
root        3227  0.0  0.0      0     0 ?        I    12:34   0:00  \_ [kworker/2:6-events]
root        3456  0.0  0.0      0     0 ?        I    12:34   0:00  \_ [kworker/0:4]
root           1  0.2  0.2 101828 11372 ?        Ss   12:29   0:01 /sbin/init maybe-ubiquity
root         428  0.1  0.6  84428 25336 ?        S<s  12:29   0:00 /lib/systemd/systemd-journald
root         457  0.1  0.1  21236  5408 ?        Ss   12:29   0:00 /lib/systemd/systemd-udevd
root         595  0.0  0.4 280140 17948 ?        SLsl 12:29   0:00 /sbin/multipathd -d -s
systemd+     630  0.0  0.1  90388  6232 ?        Ssl  12:29   0:00 /lib/systemd/systemd-timesyncd
systemd+     665  0.0  0.2  26868  8200 ?        Ss   12:29   0:00 /lib/systemd/systemd-networkd
systemd+     667  0.0  0.3  24044 12464 ?        Ss   12:29   0:00 /lib/systemd/systemd-resolved
root         681  0.0  0.1 235548  7488 ?        Ssl  12:29   0:00 /usr/lib/accountsservice/accounts-daemon
root         688  0.0  0.0   6812  2896 ?        Ss   12:29   0:00 /usr/sbin/cron -f
message+     689  0.0  0.1   7440  4552 ?        Ss   12:29   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root         697  0.0  0.0  81880  3660 ?        Ssl  12:29   0:00 /usr/sbin/irqbalance --foreground
root         698  0.0  0.4  29016 18264 ?        Ss   12:29   0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
syslog       699  0.0  0.1 224324  4644 ?        Ssl  12:29   0:00 /usr/sbin/rsyslogd -n -iNONE
root         701  0.2  0.7 1014504 29980 ?       Ssl  12:29   0:01 /usr/lib/snapd/snapd
root         703  0.1  0.1  17000  8012 ?        Ss   12:29   0:00 /lib/systemd/systemd-logind
daemon       715  0.0  0.0   3792  2272 ?        Ss   12:29   0:00 /usr/sbin/atd -f
root         728  0.0  0.1  12160  7572 ?        Ss   12:29   0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root         865  0.0  0.2  13768  8936 ?        Ss   12:29   0:00  \_ sshd: kuangniu [priv]
kuangniu     962  0.0  0.1  13900  5880 ?        S    12:29   0:00  |   \_ sshd: kuangniu@pts/0
kuangniu     963  0.0  0.1   8464  4344 pts/0    Ss   12:29   0:00  |       \_ -bash
kuangniu    4192  0.0  0.4 1390516 16616 pts/0   SLl+ 12:34   0:00  |           \_ lxc exec c1 -- bash
root        1992  0.0  0.2  13768  8712 ?        Ss   12:30   0:00  \_ sshd: kuangniu [priv]
kuangniu    2069  0.0  0.1  13900  5916 ?        S    12:30   0:00      \_ sshd: kuangniu@pts/1
kuangniu    2070  0.0  0.1   8464  5060 pts/1    Ss   12:30   0:00          \_ -bash
kuangniu    4733  0.0  0.0   9528  3960 pts/1    R+   12:38   0:00              \_ ps fauxww
root         758  0.0  0.0   5828  1884 tty1     Ss+  12:29   0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root         775  0.0  0.1 232700  6940 ?        Ssl  12:29   0:00 /usr/lib/policykit-1/polkitd --no-debug
root         827  0.0  0.4 107836 19652 ?        Ssl  12:29   0:00 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
kuangniu     868  0.0  0.2  18564  9940 ?        Ss   12:29   0:00 /lib/systemd/systemd --user
kuangniu     869  0.0  0.0 103164  3360 ?        S    12:29   0:00  \_ (sd-pam)
root        1006  0.0  0.0   4636  1732 ?        Ss   12:29   0:00 /bin/sh /snap/lxd/15223/commands/daemon.start
root        1140  1.3  1.6 1936480 66656 ?       SLl  12:29   0:07  \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
lxd         1261  0.0  0.0  43628  3548 ?        Ss   12:29   0:00      \_ dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --no-ping --interface=lxdbr0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.3.242.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.3.242.2,10.3.242.254,1h --listen-address=fd42:5088:4c58:554b::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd
root        4228  0.0  0.1  85732  4964 ?        S    12:34   0:00      \_ /snap/lxd/current/bin/lxd forkexec c1 /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/c1/lxc.conf  0 0 -- env USER=root LANG=C.UTF-8 TERM=xterm-256color PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin HOME=/root -- cmd bash
1000000     4236  0.0  0.1   8960  4092 pts/1    Ss+  12:34   0:00          \_ bash
root        1129  0.0  0.0 237208  1772 ?        Sl   12:29   0:00 lxcfs /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
root        2956  0.0  0.4 1307620 16824 ?       Ss   12:34   0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers c1
1000000     2971  0.2  0.3 170228 12624 ?        Ss   12:34   0:00  \_ /sbin/init
1000000     3046  0.2  0.2  35236 12012 ?        Ss   12:34   0:00      \_ /lib/systemd/systemd-journald
1000000     3083  0.0  0.1  21588  5064 ?        Ss   12:34   0:00      \_ /lib/systemd/systemd-udevd
1000100     3139  0.1  0.1  26740  6960 ?        Ss   12:34   0:00      \_ /lib/systemd/systemd-networkd
1000101     3142  0.1  0.3  24112 13004 ?        Ss   12:34   0:00      \_ /lib/systemd/systemd-resolved
1000000     3180  0.0  0.2 241016  9608 ?        Ssl  12:34   0:00      \_ /usr/lib/accountsservice/accounts-daemon
1000000     3183  0.0  0.0   8536  2932 ?        Ss   12:34   0:00      \_ /usr/sbin/cron -f
1000103     3184  0.0  0.1   7376  4548 ?        Ss   12:34   0:00      \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
1000000     3191  0.0  0.4  29216 17968 ?        Ss   12:34   0:00      \_ /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
1000104     3192  0.0  0.1 154840  4508 ?        Ssl  12:34   0:00      \_ /usr/sbin/rsyslogd -n -iNONE
1000000     3195  0.1  0.1  16632  6644 ?        Ss   12:34   0:00      \_ /lib/systemd/systemd-logind
1000001     3197  0.0  0.0   3792  2504 ?        Ss   12:34   0:00      \_ /usr/sbin/atd -f
1000000     3215  0.0  0.1  12160  7216 ?        Ss   12:34   0:00      \_ sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
1000000     3219  0.0  0.0   7352  2252 pts/0    Ss+  12:34   0:00      \_ /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 linux
1000000     3228  0.0  0.2 236404  9272 ?        Ssl  12:34   0:00      \_ /usr/lib/policykit-1/polkitd --no-debug
1000000     3272  0.0  0.5 108036 21044 ?        Ssl  12:34   0:00      \_ /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
1000000     3438  0.1  0.0   3396  1824 ?        Ss   12:34   0:00      \_ snapfuse /var/lib/snapd/snaps/snapd_7264.snap /snap/snapd/7264 -o ro,nodev,allow_other,suid
1000000     3459  0.5  0.7 1309660 30544 ?       Ssl  12:34   0:01      \_ /usr/lib/snapd/snapd
1000000     3577  0.7  0.0   3764  2056 ?        Ss   12:34   0:01      \_ snapfuse /var/lib/snapd/snaps/core18_1754.snap /snap/core18/1754 -o ro,nodev,allow_other,suid
1000000     3660  1.4  0.0   3640  2056 ?        Ss   12:34   0:03      \_ snapfuse /var/lib/snapd/snaps/lxd_15223.snap /snap/lxd/15223 -o ro,nodev,allow_other,suid
1000000     4281  0.0  0.0   4636  1904 ?        Ss   12:34   0:00      \_ /bin/sh /snap/lxd/15223/commands/daemon.start
1000000     4416 20.0  2.2 1791340 91464 ?       SLl  12:34   0:42      |   \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
1000998     4511  0.2  0.0  45348  3680 ?        Ss   12:35   0:00      |       \_ dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --no-ping --interface=lxdbr0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.23.167.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.23.167.2,10.23.167.254,1h --listen-address=fd42:bcfd:15e4:8407::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd
1000000     4405  0.0  0.0  97808  1668 ?        Sl   12:34   0:00      \_ lxcfs /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
kuangniu commented 4 years ago

So I tried to boot a 2nd-level container (Ubuntu 20.04) for the first time, to see how I could work around this. It turns out the problem is a lot more serious than I thought: the 2nd-level container won't even boot up properly (HOST: Ubuntu 20.04 -- 1st container: Ubuntu 20.04 -- 2nd container: Ubuntu 20.04). Everything works perfectly with the combination HOST: Ubuntu 20.04 -- 1st container: Ubuntu 18.04 -- 2nd container: Ubuntu 18.04.

NG Case: HOST: Ubuntu 20.04 -- 1st Container: Ubuntu 20.04 -- 2nd Container: Ubuntu 20.04

kuangniu@focal:~$ lxc init ubuntu:20.04 f1
Creating f1
kuangniu@focal:~$ lxc config set f1 security.nesting true
kuangniu@focal:~$ lxc config set f1 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
kuangniu@focal:~$ lxc start f1
kuangniu@focal:~$ lxc exec f1 -- bash
root@c1:~# lxd init
...(just press Enter for everything)
root@c1:~# lxc init ubuntu:20.04 f2
...(don't try to set user.user-data because it fails)
root@c1:~# lxc start f2
...(10 minutes later)
root@c1:~# lxc exec f2 -- systemctl is-system-running
starting
...(systemd never finishes startup)

syslog of the 2nd container attached: focal-focal-focal-syslog.txt

OK Case: HOST: Ubuntu 20.04 -- 1st Container: Ubuntu 18.04 -- 2nd Container: Ubuntu 18.04

NOTE: In this case, lxc ACCEPTS stdin

kuangniu@focal:~$ lxc init ubuntu:18.04 b1
Creating b1
kuangniu@focal:~$ lxc config set b1 security.nesting true
kuangniu@focal:~$ lxc config set b1 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
kuangniu@focal:~$ lxc start b1
kuangniu@focal:~$ lxc exec b1 -- bash
root@b1:~# lxd init
...(just press Enter for everything)
root@b1:~# lxc init ubuntu:18.04 f2
root@b1:~# lxc config set b1 user.user-data - <<EOF
> #cloud-config
> locale: en_US.UTF-8
> timezone: US/Pacific
> EOF
root@b1:~# lxc start f2
...(10 seconds later)
root@c1:~# lxc exec f2 -- systemctl is-system-running
degraded
...(degraded because some units fail during container boot; startup finished)

I'm not talking about a one-in-a-million scenario. I'm using a dedicated test (virtual) machine built from scratch just for this issue, with the standard Ubuntu installer, default options, and the standard Ubuntu container image from the official repository. It must affect anybody who tries to run nested Focal containers on a Focal host. But I also wonder why it hasn't made much fuss already if that's the case. Am I missing something?

stgraber commented 4 years ago

Ok, so I tried to reproduce your problem again:

All seems to be working fine here and matches what we're seeing in our CI too (where we test such nesting daily).

stgraber@castiana:~$ lxc launch images:ubuntu/20.04/cloud my-vm --vm
Creating my-vm
Starting my-vm
stgraber@castiana:~$ lxc exec my-vm bash
root@my-vm:~# apt install snapd
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  apparmor liblzo2-2 squashfs-tools
Suggested packages:
  apparmor-profiles-extra apparmor-utils zenity | kdialog
The following NEW packages will be installed:
  apparmor liblzo2-2 snapd squashfs-tools
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 23.9 MB of archives.
After this operation, 107 MB of additional disk space will be used.
Do you want to continue? [Y/n] 
Get:1 http://us.archive.ubuntu.com/ubuntu focal-updates/main amd64 apparmor amd64 2.13.3-7ubuntu5.1 [494 kB]
Get:2 http://us.archive.ubuntu.com/ubuntu focal/main amd64 liblzo2-2 amd64 2.10-2 [50.8 kB]
Get:3 http://us.archive.ubuntu.com/ubuntu focal/main amd64 squashfs-tools amd64 1:4.4-1 [121 kB]
Get:4 http://us.archive.ubuntu.com/ubuntu focal/main amd64 snapd amd64 2.44.3+20.04 [23.2 MB]
Fetched 23.9 MB in 1s (38.8 MB/s)
Preconfiguring packages ...
Selecting previously unselected package apparmor.
(Reading database ... 49774 files and directories currently installed.)
Preparing to unpack .../apparmor_2.13.3-7ubuntu5.1_amd64.deb ...
Unpacking apparmor (2.13.3-7ubuntu5.1) ...
Selecting previously unselected package liblzo2-2:amd64.
Preparing to unpack .../liblzo2-2_2.10-2_amd64.deb ...
Unpacking liblzo2-2:amd64 (2.10-2) ...
Selecting previously unselected package squashfs-tools.
Preparing to unpack .../squashfs-tools_1%3a4.4-1_amd64.deb ...
Unpacking squashfs-tools (1:4.4-1) ...
Selecting previously unselected package snapd.
Preparing to unpack .../snapd_2.44.3+20.04_amd64.deb ...
Unpacking snapd (2.44.3+20.04) ...
Setting up liblzo2-2:amd64 (2.10-2) ...
Setting up apparmor (2.13.3-7ubuntu5.1) ...
Created symlink /etc/systemd/system/sysinit.target.wants/apparmor.service → /lib/systemd/system/apparmor.service.
Reloading AppArmor profiles 
Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
Setting up squashfs-tools (1:4.4-1) ...
Setting up snapd (2.44.3+20.04) ...
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.apparmor.service → /lib/systemd/system/snapd.apparmor.service.
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.autoimport.service → /lib/systemd/system/snapd.autoimport.service.
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.core-fixup.service → /lib/systemd/system/snapd.core-fixup.service.
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.recovery-chooser-trigger.service → /lib/systemd/system/snapd.recovery-chooser-trigger.service.
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.seeded.service → /lib/systemd/system/snapd.seeded.service.
Created symlink /etc/systemd/system/cloud-final.service.wants/snapd.seeded.service → /lib/systemd/system/snapd.seeded.service.
Created symlink /etc/systemd/system/multi-user.target.wants/snapd.service → /lib/systemd/system/snapd.service.
Created symlink /etc/systemd/system/timers.target.wants/snapd.snap-repair.timer → /lib/systemd/system/snapd.snap-repair.timer.
Created symlink /etc/systemd/system/sockets.target.wants/snapd.socket → /lib/systemd/system/snapd.socket.
Created symlink /etc/systemd/system/final.target.wants/snapd.system-shutdown.service → /lib/systemd/system/snapd.system-shutdown.service.
snapd.failure.service is a disabled or a static unit, not starting it.
snapd.snap-repair.service is a disabled or a static unit, not starting it.
Processing triggers for libc-bin (2.31-0ubuntu9) ...
Processing triggers for systemd (245.4-4ubuntu3.1) ...
Processing triggers for mime-support (3.64ubuntu1) ...
root@my-vm:~# snap install lxd
2020-07-01T20:51:04Z INFO Waiting for restart...
Warning: /snap/bin was not found in your $PATH. If you've not restarted your session since you
         installed snapd, try doing that. Please see https://forum.snapcraft.io/t/9469 for more
         details.

lxd 4.2 from Canonical✓ installed
root@my-vm:~# exit
stgraber@castiana:~$ lxc exec my-vm bash
root@my-vm:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: 
Name of the storage backend to use (dir, lvm, zfs, ceph, btrfs) [default=zfs]: 
Create a new ZFS pool? (yes/no) [default=yes]: 
Would you like to use an existing empty disk or partition? (yes/no) [default=no]: 
Size in GB of the new loop device (1GB minimum) [default=5GB]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: 
What should the new bridge be called? [default=lxdbr0]: 
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
Would you like LXD to be available over the network? (yes/no) [default=no]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@my-vm:~# lxc init ubuntu:20.04 f1
Creating f1
root@my-vm:~# lxc config set f1 security.nesting true
root@my-vm:~# lxc config edit f1
root@my-vm:~# lxc start f1
root@my-vm:~# lxc exec f1 bash
root@f1:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: 
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: 
What should the new bridge be called? [default=lxdbr0]: 
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
Would you like LXD to be available over the network? (yes/no) [default=no]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@f1:~# 
root@f1:~# 
root@f1:~# lxc init ubuntu:20.04 f2
Creating f2
root@f1:~# lxc config edit f2                 
root@f1:~# time lxc start f2

real    0m5.904s
user    0m0.037s
sys 0m0.059s
root@f1:~# lxc exec f2 -- systemctl is-system-running
starting
root@f1:~# lxc exec f2 -- systemctl --failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION                           
● apparmor.service            loaded failed failed Load AppArmor profiles                
● networkd-dispatcher.service loaded failed failed Dispatcher daemon for systemd-networkd
● systemd-remount-fs.service  loaded failed failed Remount Root and Kernel File Systems  

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

3 loaded units listed.
root@f1:~# 
stgraber commented 4 years ago

Testing reading from stdin specifically:

root@my-vm:~# echo bar | lxc config set f1 user.foo -
root@my-vm:~# lxc config get f1 user.foo
bar

root@my-vm:~# lxc exec f1 bash
root@f1:~# echo bar | lxc config set f2 user.foo -
root@f1:~# lxc config get f2 user.foo
bar

root@f1:~# 
stgraber commented 4 years ago

Note that f2 will not be able to run the LXD snap itself, that part is normal and workarounds for it are described here: https://discuss.linuxcontainers.org/t/nested-containers-issues-permissions-zfs-possibly-something-else/8240/9