canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Can't launch any container #10426

Closed · bazuchan closed this 2 years ago

bazuchan commented 2 years ago

Required information

% lxc info --show-log local:ge
Name: ge
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/05/17 10:50 MSK
Last Used: 2022/05/17 10:50 MSK

Log:

lxc ge 20220517075017.468 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ge 20220517075017.468 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc ge 20220517075017.469 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ge 20220517075017.469 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc ge 20220517075017.636 ERROR    start - start.c:start:2164 - No such file or directory - Failed to exec "/sbin/init"
lxc ge 20220517075017.636 ERROR    sync - sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 7)
lxc ge 20220517075017.647 WARN     network - network.c:lxc_delete_network_priv:3617 - Failed to rename interface with index 0 from "eth0" to its initial name "veth8f78f55c"
lxc ge 20220517075017.647 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc ge 20220517075017.647 ERROR    start - start.c:__lxc_start:2074 - Failed to spawn container "ge"
lxc ge 20220517075017.647 WARN     start - start.c:lxc_abort:1039 - No such process - Failed to send SIGKILL via pidfd 17 for process 1466589
lxc ge 20220517075022.736 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc ge 20220517075022.736 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc 20220517075022.774 ERROR    af_unix - af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220517075022.775 ERROR    commands - commands.c:lxc_cmd_rsp_recv_fds:127 - Failed to receive file descriptors for command "get_state"

Installed from the snap stable channel.

# snap list lxd
Name  Version      Rev    Tracking       Publisher   Notes
lxd   5.1-1f6f485  23037  latest/stable  canonical✓  -
stgraber commented 2 years ago

Can you try another image?
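For example, a quick check along these lines (the image aliases and instance names below are only illustrative, assuming the default ubuntu: and images: remotes are configured):

% lxc launch ubuntu:20.04 test-focal         # example image and instance name
% lxc launch images:alpine/3.15 test-alpine  # example image from the community remote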

bazuchan commented 2 years ago

I have tried jammy, focal, bionic, and Debian buster and got the same error.

stgraber commented 2 years ago

Can you show:

bazuchan commented 2 years ago
% lxc launch images:ubuntu/jammy/amd64 ge1
Creating ge1
Starting ge1                              
Error: Failed to run: /snap/lxd/current/bin/lxd forkstart ge1 /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/ge1/lxc.conf: 
Try `lxc info --show-log local:ge1` for more info

# ls -lh /var/snap/lxd/common/lxd/storage-pools/default/containers/ge1
total 8,0K
-r-------- 1 root root 2,8K May 18 10:31 backup.yaml
drwxr-xr-x 5 root root 4,0K May 18 10:31 rootfs

# ls -lh /var/snap/lxd/common/lxd/storage-pools/default/containers/ge1/rootfs
total 12K
drw-r--r-- 2 root root 4,0K May 18 10:31 dev
drwxr-xr-x 2 root root 4,0K May 18 10:31 proc
drwxr-xr-x 2 root root 4,0K May 18 10:31 sys

# ls -lh /var/snap/lxd/common/lxd/storage-pools/default/containers/ge1/rootfs/sbin
ls: cannot access '/var/snap/lxd/common/lxd/storage-pools/default/containers/ge1/rootfs/sbin': No such file or directory
tomponline commented 2 years ago

OK, so it looks like your images are not being unpacked correctly.

Please can you rule out any issues with problematic cached images by doing `lxc image list`, finding the entry for the image you are trying to launch, and then doing `lxc image delete <fingerprint>` for that image.

Then try launching the instance again; this should trigger a fresh image download (sketched below).
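Putting those steps together, the flush would look roughly like this (`<fingerprint>` is a placeholder for whatever `lxc image list` reports for your cached jammy image):

% lxc image list
% lxc image delete <fingerprint>            # fingerprint taken from the list above
% lxc launch images:ubuntu/jammy/amd64 ge1  # re-download and launch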

If you still have issues, please can you run `sudo du -h /var/snap/lxd/common/lxd/storage-pools/default/containers/ge1/` so we can see how much data is inside it.

And please also look for any AppArmor denials in your log, using `journalctl -b | grep DENIED`.
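Both checks in one place, using the paths from this thread:

% sudo du -h /var/snap/lxd/common/lxd/storage-pools/default/containers/ge1/
% journalctl -b | grep DENIED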

bazuchan commented 2 years ago

Looks like I found the root of the problem. I have

# ls -la /var/snap/lxd/common/lxd/storage-pools
lrwxrwxrwx 1 root root 9 Mar 13  2018 /var/snap/lxd/common/lxd/storage-pools -> /home/lxd

and it breaks image unpacking now. Copying a container from another LXC host works fine, btw.

tomponline commented 2 years ago

Hrm, that is highly unusual, agreed. What makes you think this is the cause?

I believe @stgraber doesn't recommend using that approach (see this recent post https://discuss.linuxcontainers.org/t/lxd-broken-after-setting-custom-backups-storage/14102/6?u=tomp).

Have you considered using a bind mount instead of a symlink?
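For reference, a bind mount for this layout could look like the sketch below (this assumes /home/lxd is the intended backing directory, as in the symlink above, and that the snap is stopped while switching):

% sudo snap stop lxd
% sudo rm /var/snap/lxd/common/lxd/storage-pools       # drop the old symlink
% sudo mkdir /var/snap/lxd/common/lxd/storage-pools
% sudo mount --bind /home/lxd /var/snap/lxd/common/lxd/storage-pools
% sudo snap start lxd

and a matching /etc/fstab entry to make it persistent across reboots:

/home/lxd  /var/snap/lxd/common/lxd/storage-pools  none  bind  0  0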

bazuchan commented 2 years ago

> Hrm, that is highly unusual, agreed. What makes you think this is the cause?

I have created a new storage pool in the default location and have no problem launching with the new pool.

> I believe @stgraber doesn't recommend using that approach (see this recent post https://discuss.linuxcontainers.org/t/lxd-broken-after-setting-custom-backups-storage/14102/6?u=tomp).

> Have you considered using a bind mount instead of a symlink?

I'm moving my instances to the new pool, so my problem is solved. You can close the issue if this is expected behavior.
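For anyone hitting the same thing, the per-instance move can be sketched like this (the pool name and dir driver are examples; `lxc move --storage` is available in the LXD versions discussed in this thread, and the instance must be stopped):

% lxc storage create newpool dir    # example pool name and driver
% lxc stop ge1
% lxc move ge1 --storage newpool    # move the stopped instance to the new pool
% lxc start ge1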