canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

containers aren't getting an IPv4 address on startup (but IPv6 works) #9356

Closed · Mohamedemad4 closed this issue 3 years ago

Mohamedemad4 commented 3 years ago

Issue description

The LXD instance has been working normally for the past 3 months or so, but today it is not allocating IPv4 addresses to containers when they start (IPv6 works fine!). I have verified that dnsmasq is working on the host, and I am out of ways to debug this issue.
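
For anyone hitting the same symptom, the host-side dnsmasq check can be done with something like the following (a sketch; the same commands come up later in this thread):

# Is LXD's dnsmasq running for lxdbr0? ([d] excludes the grep itself)
ps aux | grep [d]nsmasq
# Is it listening for DHCP (UDP 67) and DNS (UDP 53) on the bridge?
sudo ss -ulpn | grep dnsmasq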

Steps to reproduce

Here are some more outputs that might be relevant to the issue:

Required information

The output of lxc network info lxdbr0:

Name: lxdbr0
MAC address: 00:16:3e:26:44:53
MTU: 1500
State: up

Ips:
  inet  10.222.79.1
  inet6 fd42:c50e:d848:412c::1
  inet6 fe80::216:3eff:fe26:4453

Network usage:
  Bytes received: 3.07kB
  Bytes sent: 4.20kB
  Packets received: 44
  Packets sent: 33

These messages in the dmesg log relate to lxdbr0:

[  117.647068] lxdbr0: port 1(vethde3b95aa) entered blocking state
[  117.647071] lxdbr0: port 1(vethde3b95aa) entered disabled state
[  117.647232] lxdbr0: port 1(vethde3b95aa) entered blocking state
[  117.647233] lxdbr0: port 1(vethde3b95aa) entered forwarding state
[  117.648498] lxdbr0: port 1(vethde3b95aa) entered disabled state
[  118.001315] lxdbr0: port 1(vethde3b95aa) entered blocking state
[  118.001317] lxdbr0: port 1(vethde3b95aa) entered forwarding state
[  118.001364] IPv6: ADDRCONF(NETDEV_CHANGE): lxdbr0: link becomes ready
[  294.674822] lxdbr0: port 2(veth25caf75a) entered blocking state
[  294.674825] lxdbr0: port 2(veth25caf75a) entered disabled state
[  297.165233] lxdbr0: port 2(veth25caf75a) entered blocking state
[  297.165236] lxdbr0: port 2(veth25caf75a) entered forwarding state
[  316.779283] lxdbr0: port 2(veth25caf75a) entered disabled state
[  316.927174] lxdbr0: port 2(veth25caf75a) entered disabled state
[ 2689.426371] lxdbr0: port 2(veth85d20ac8) entered blocking state
[ 2689.426373] lxdbr0: port 2(veth85d20ac8) entered disabled state
[ 2689.426568] lxdbr0: port 2(veth85d20ac8) entered blocking state
[ 2689.426570] lxdbr0: port 2(veth85d20ac8) entered forwarding state
[ 2690.214541] lxdbr0: port 2(veth85d20ac8) entered disabled state
[ 2865.209422] lxdbr0: port 2(veth85d20ac8) entered disabled state
[ 2872.382213] lxdbr0: port 2(vethf7312a81) entered blocking state
[ 2872.382216] lxdbr0: port 2(vethf7312a81) entered disabled state
[ 2878.031809] lxdbr0: port 2(vethf7312a81) entered blocking state
[ 2878.031811] lxdbr0: port 2(vethf7312a81) entered forwarding state
[ 3237.453632] lxdbr0: port 1(vethde3b95aa) entered disabled state
[ 3237.560021] lxdbr0: port 1(vethde3b95aa) entered disabled state

The main LXD daemon log:

t=2021-10-06T19:01:05+0000 lvl=info msg="LXD 4.0.7 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2021-10-06T19:01:05+0000 lvl=info msg="Kernel uid/gid map:" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - u 0 0 4294967295" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - g 0 0 4294967295" 
t=2021-10-06T19:01:05+0000 lvl=info msg="Configured LXD uid/gid map:" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - u 0 1000000 1000000000" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - g 0 1000000 1000000000" 
t=2021-10-06T19:01:05+0000 lvl=info msg="Kernel features:" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - closing multiple file descriptors efficiently: no" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - netnsid-based network retrieval: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - pidfds: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - uevent injection: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - seccomp listener: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - seccomp listener continue syscalls: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - seccomp listener add file descriptors: no" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - attach to namespaces via pidfds: no" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - safe native terminal allocation : yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - unprivileged file capabilities: yes" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - cgroup layout: hybrid" 
t=2021-10-06T19:01:05+0000 lvl=warn msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored" 
t=2021-10-06T19:01:05+0000 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored" 
t=2021-10-06T19:01:05+0000 lvl=info msg=" - shiftfs support: disabled" 
t=2021-10-06T19:01:05+0000 lvl=warn msg="Instance type not operational" driver=qemu err="KVM support is missing" type=virtual-machine
t=2021-10-06T19:01:05+0000 lvl=info msg="Initializing local database" 
t=2021-10-06T19:01:05+0000 lvl=info msg="Set client certificate to server certificate 20109e488a8a3f06de230880d2b851d1d2bb314d995ec826e9ce96a3d3755d35" 
t=2021-10-06T19:01:06+0000 lvl=info msg="Starting /dev/lxd handler:" 
t=2021-10-06T19:01:06+0000 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2021-10-06T19:01:06+0000 lvl=info msg="REST API daemon:" 
t=2021-10-06T19:01:06+0000 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2021-10-06T19:01:06+0000 lvl=info msg=" - binding TCP socket" socket=[::]:8443
t=2021-10-06T19:01:06+0000 lvl=info msg="Initializing global database" 
t=2021-10-06T19:01:06+0000 lvl=info msg="Firewall loaded driver \"xtables\"" 
t=2021-10-06T19:01:06+0000 lvl=info msg="Initializing storage pools" 
t=2021-10-06T19:01:08+0000 lvl=info msg="Initializing daemon storage mounts" 
t=2021-10-06T19:01:08+0000 lvl=info msg="Initializing networks" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Pruning leftover image files" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done pruning leftover image files" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Loading daemon configuration" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Started seccomp handler" path=/var/snap/lxd/common/lxd/seccomp.socket
t=2021-10-06T19:01:09+0000 lvl=info msg="Pruning expired images" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done pruning expired images" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Pruning expired instance backups" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done pruning expired instance backups" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Updating images" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done updating images" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Expiring log files" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done expiring log files" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Updating instance types" 
t=2021-10-06T19:01:09+0000 lvl=info msg="Done updating instance types" 
t=2021-10-06T19:01:10+0000 lvl=info msg="Starting container" action=start created=2021-09-21T17:54:35+0000 ephemeral=false instance=container-h9mb-rtpo-nsdr instanceType=container project=default stateful=false used=2021-10-06T18:56:58+0000
t=2021-10-06T19:01:10+0000 lvl=info msg="Started container" action=start created=2021-09-21T17:54:35+0000 ephemeral=false instance=container-h9mb-rtpo-nsdr instanceType=container project=default stateful=false used=2021-10-06T18:56:58+0000
t=2021-10-06T19:04:09+0000 lvl=info msg="Starting container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T18:57:34+0000
t=2021-10-06T19:04:13+0000 lvl=info msg="Started container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T18:57:34+0000
t=2021-10-06T19:04:28+0000 lvl=info msg="Stopping container" action=stop created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:04:09+0000
t=2021-10-06T19:04:31+0000 lvl=info msg="Stopped container" action=stop created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:04:09+0000
t=2021-10-06T19:18:44+0000 lvl=info msg="Shutting down container" action=shutdown created=2021-09-21T17:54:35+0000 ephemeral=false instance=container-h9mb-rtpo-nsdr instanceType=container project=default timeout=-1s used=2021-10-06T19:01:10+0000
t=2021-10-06T19:20:42+0000 lvl=info msg="Starting container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:04:09+0000
t=2021-10-06T19:20:49+0000 lvl=info msg="Started container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:04:09+0000
t=2021-10-06T19:44:19+0000 lvl=info msg="Restarting container" action=shutdown created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default timeout=-1ns used=2021-10-06T19:20:43+0000
t=2021-10-06T19:46:57+0000 lvl=info msg="Stopping container" action=stop created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:20:43+0000
t=2021-10-06T19:47:00+0000 lvl=info msg="Stopped container" action=stop created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:20:43+0000
t=2021-10-06T19:47:10+0000 lvl=info msg="Starting container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:20:43+0000
t=2021-10-06T19:47:16+0000 lvl=info msg="Started container" action=start created=2021-10-01T16:02:07+0000 ephemeral=false instance=container-a8c0-es25-jpck instanceType=container project=default stateful=false used=2021-10-06T19:20:43+0000
t=2021-10-06T19:51:53+0000 lvl=eror msg="Failed to update the image" err="Failed to create image \"b858d693a55fcf9b1a656fd13d577104fd1dc6f2f1481f5ca35a08ec25d28a6c\" on storage pool \"z_container_pool\": Unable to unpack image, run out of disk space" fingerprint=56296ba81a6fb502c634697a840d7957c3d2aa1a1805820e605ed21475058851
t=2021-10-06T19:51:53+0000 lvl=info msg="Creating container" ephemeral=false instance=s instanceType=container project=default
t=2021-10-06T19:51:53+0000 lvl=info msg="Created container" ephemeral=false instance=s instanceType=container project=default
t=2021-10-06T19:53:10+0000 lvl=info msg="Shut down container" action=shutdown created=2021-09-21T17:54:35+0000 ephemeral=false instance=container-h9mb-rtpo-nsdr instanceType=container project=default timeout=-1s used=2021-10-06T19:01:10+0000
t=2021-10-06T19:55:06+0000 lvl=eror msg="Error getting disk usage" err="Failed to run: zfs get -H -p -o value used z_container_pool/containers/s: cannot open 'z_container_pool/containers/s': dataset does not exist" instance=s instanceType=container project=default
t=2021-10-06T20:01:09+0000 lvl=info msg="Pruning expired instance backups" 
t=2021-10-06T20:01:09+0000 lvl=info msg="Done pruning expired instance backups" 

The lxc info --show-log output for one of the affected containers:

Name: container-a8c0-es25-jpck
Location: none
Remote: unix://
Architecture: x86_64
Created: 2021/10/01 16:02 UTC
Status: Running
Type: container
Profiles: cleanslate_profile
Pid: 114985
Ips:
  eth0: inet6   fd42:c50e:d848:412c:216:3eff:feca:9bf6  vethf7312a81
  eth0: inet6   fe80::216:3eff:feca:9bf6    vethf7312a81
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
Resources:
  Processes: 25
  Disk usage:
    root: 1.15GB
  CPU usage:
    CPU usage (in seconds): 4
  Memory usage:
    Memory (current): 93.95MB
    Memory (peak): 105.33MB
  Network usage:
    eth0:
      Bytes received: 826B
      Bytes sent: 1.09kB
      Packets received: 7
      Packets sent: 13
    lo:
      Bytes received: 0B
      Bytes sent: 0B
      Packets received: 0
      Packets sent: 0

Log:

lxc container-a8c0-es25-jpck 20211006194710.553 WARN     conf - conf.c:lxc_map_ids:3389 - newuidmap binary is missing
lxc container-a8c0-es25-jpck 20211006194710.553 WARN     conf - conf.c:lxc_map_ids:3395 - newgidmap binary is missing
lxc container-a8c0-es25-jpck 20211006194710.557 WARN     conf - conf.c:lxc_map_ids:3389 - newuidmap binary is missing
lxc container-a8c0-es25-jpck 20211006194710.557 WARN     conf - conf.c:lxc_map_ids:3395 - newgidmap binary is missing
lxc container-a8c0-es25-jpck 20211006194710.558 WARN     cgfsng - cgroups/cgfsng.c:fchowmodat:1293 - No such file or directory - Failed to fchownat(43, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )

The expanded config of one of the affected containers:

architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 20.04 LTS amd64 (release) (20210825)
  image.label: release
  image.os: ubuntu
  image.release: focal
  image.serial: "20210825"
  image.type: squashfs
  image.version: "20.04"
  security.nesting: "false"
  security.privileged: "false"
  volatile.base_image: 3aa23c132adc8ba62983bef741935971ddaafe520a8f549967476f09b06fd840
  volatile.eth0.host_name: vethf7312a81
  volatile.eth0.hwaddr: 00:16:3e:ca:9b:f6
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: abda6479-28b1-4dd0-a1a1-bacc08cc176b
devices:
  eth0:
    network: lxdbr0
    security.ipv4_filtering: "true"
    security.ipv6_filtering: "true"
    security.mac_filtering: "true"
    type: nic
    user.network_mode: dhcp
  root:
    path: /
    pool: z_container_pool
    size: 100GB
    type: disk
ephemeral: false
profiles:
- cleanslate_profile
stateful: false
description: ""
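
For reference, this merged view of the instance's own config plus its profiles is what lxc config show prints with the --expanded flag:

lxc config show container-a8c0-es25-jpck --expanded
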
stgraber commented 3 years ago

Closing as it's almost certainly the usual issue with Docker on the host system firewalling off all traffic.

stgraber commented 3 years ago

If you have Docker running alongside LXD, it messes with iptables and blocks all traffic from other container and VM managers (well, all IPv4 traffic, that is). To fix that, remove Docker or reconfigure it so it doesn't mess with the entire system's firewalling.
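
As an illustration (not something prescribed in this thread), the two usual mitigations are telling the Docker daemon not to manage iptables at all, or explicitly allowing bridge traffic through Docker's DOCKER-USER chain; <uplink> below is a placeholder for the host's external interface:

# Option 1: stop Docker from touching iptables (breaks Docker's own NAT)
# /etc/docker/daemon.json
#   { "iptables": false }

# Option 2: let lxdbr0 traffic through Docker's DOCKER-USER chain
iptables -I DOCKER-USER -i lxdbr0 -o <uplink> -j ACCEPT
iptables -I DOCKER-USER -o lxdbr0 -i <uplink> -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT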

If there's no Docker on the host system, please provide:

Mohamedemad4 commented 3 years ago

So I disabled the Docker snap and rebooted the instance, but the issue still persists. The output of iptables -L -n -v is:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps /* generated for LXD network lxdbr0 */
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:openvpn
ACCEPT     all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             /* generated for LXD network lxdbr0 */
ACCEPT     all  --  anywhere             anywhere             /* generated for LXD network lxdbr0 */
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             anywhere             tcp spt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp spt:domain /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  anywhere             anywhere             udp spt:bootps /* generated for LXD network lxdbr0 */

The output of ip6tables -L -n -v is:

Chain INPUT (policy ACCEPT 2 packets, 184 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    8   808 lxd_nic_lxdbr0  all      lxdbr0 *       ::/0                 ::/0                 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     tcp      lxdbr0 *       ::/0                 ::/0                 tcp dpt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      lxdbr0 *       ::/0                 ::/0                 udp dpt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      lxdbr0 *       ::/0                 ::/0                 udp dpt:547 /* generated for LXD network lxdbr0 */

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 lxd_nic_lxdbr0  all      lxdbr0 *       ::/0                 ::/0                 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     all      *      lxdbr0  ::/0                 ::/0                 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     all      lxdbr0 *       ::/0                 ::/0                 /* generated for LXD network lxdbr0 */

Chain OUTPUT (policy ACCEPT 18 packets, 1832 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp      *      lxdbr0  ::/0                 ::/0                 tcp spt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      *      lxdbr0  ::/0                 ::/0                 udp spt:53 /* generated for LXD network lxdbr0 */
    0     0 ACCEPT     udp      *      lxdbr0  ::/0                 ::/0                 udp spt:547 /* generated for LXD network lxdbr0 */

Chain lxd_nic_lxdbr0 (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       icmpv6    lxdbr0 *       ::/0                 ::/0                 PHYSDEV match --physdev-in vethf6d4fc60 ipv6-icmptype 136 STRING match ! "|00163eca9bf6|" ALGO name bm FROM 66 TO 72 /* generated for LXD container container-a8c0-es25-jpck (eth0) */
    0     0 DROP       icmpv6    lxdbr0 *       ::/0                 ::/0                 PHYSDEV match --physdev-in vethf6d4fc60 ipv6-icmptype 136 STRING match ! "|fd42c50ed848412c02163efffeca9bf6|" ALGO name bm FROM 48 TO 64 /* generated for LXD container container-a8c0-es25-jpck (eth0) */

Also worth mentioning that I had to manually load the br_netfilter kernel module before I could start any of the containers, because I was getting this error:

Error: Failed preparing container for start: Failed to start device "eth0": security.ipv6_filtering requires br_netfilter be loaded: open /proc/sys/net/bridge/bridge-nf-call-ip6tables: no such file or directory

Which makes me assume that Docker was probably loading it for me before. @stgraber
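
A sketch of making that manual step persistent across reboots, using the standard modules-load.d mechanism (not something specified in this thread):

# Load br_netfilter now
sudo modprobe br_netfilter
# Load it automatically on every boot
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf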

stgraber commented 3 years ago

Can you show ebtables -Lv too?

@tomponline any ideas? Firewall looks reasonable now.

Mohamedemad4 commented 3 years ago

-> ebtables -Lv

Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

@stgraber

tomponline commented 3 years ago

@Mohamedemad4 please show the output of ip a and ip r on both the host and the container.

Also please provide the output of ps aux | grep dnsmasq and sudo ss -ulpn on the host.

Mohamedemad4 commented 3 years ago

host

-> ps aux | grep dnsmasq

lxd         2414  0.0  0.0   8928  3912 ?        Ss   Oct06   0:00 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.222.79.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.222.79.2,10.222.79.254,1h --listen-address=fd42:c50e:d848:412c::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd
root      250820  0.0  0.0   6404   736 pts/0    S+   11:55   0:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox dnsmasq

-> ss -ulpn

State        Recv-Q       Send-Q                        Local Address:Port              Peer Address:Port      Process                                         
UNCONN       0            0                                   0.0.0.0:42400                  0.0.0.0:*          users:(("rsyslogd",pid=1058,fd=7))             
UNCONN       0            0                               10.222.79.1:53                     0.0.0.0:*          users:(("dnsmasq",pid=2414,fd=8))              
UNCONN       0            0                             127.0.0.53%lo:53                     0.0.0.0:*          users:(("systemd-resolve",pid=972,fd=12))      
UNCONN       0            0                            0.0.0.0%lxdbr0:67                     0.0.0.0:*          users:(("dnsmasq",pid=2414,fd=4))              
UNCONN       0            0                           10.172.0.6%ens4:68                     0.0.0.0:*          users:(("systemd-network",pid=969,fd=19))      
UNCONN       0            0                                 127.0.0.1:323                    0.0.0.0:*          users:(("chronyd",pid=1037,fd=5))              
UNCONN       0            0                  [fd42:c50e:d848:412c::1]:53                        [::]:*          users:(("dnsmasq",pid=2414,fd=10))             
UNCONN       0            0                                     [::1]:323                       [::]:*          users:(("chronyd",pid=1037,fd=6))              
UNCONN       0            0                               [::]%lxdbr0:547                       [::]:*          users:(("dnsmasq",pid=2414,fd=6)) 

-> ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
    link/ether 42:01:0a:ac:00:06 brd ff:ff:ff:ff:ff:ff
    inet 10.172.0.6/32 scope global dynamic ens4
       valid_lft 2962sec preferred_lft 2962sec
    inet6 fe80::4001:aff:feac:6/64 scope link 
       valid_lft forever preferred_lft forever
4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 100
    link/none 
    inet 10.8.0.1/24 brd 10.8.0.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::9f4:8d7f:ba5f:540c/64 scope link stable-privacy 
       valid_lft forever preferred_lft forever
5: lxdbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:16:3e:26:44:53 brd ff:ff:ff:ff:ff:ff
    inet 10.222.79.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:c50e:d848:412c::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe26:4453/64 scope link 
       valid_lft forever preferred_lft forever

-> ip r

default via 10.172.0.1 dev ens4 proto dhcp src 10.172.0.6 metric 100
10.8.0.0/24 dev tun0 proto kernel scope link src 10.8.0.1 
10.172.0.1 dev ens4 proto dhcp scope link src 10.172.0.6 metric 100 
10.222.79.0/24 dev lxdbr0 proto kernel scope link src 10.222.79.1 linkdown 

container

-> ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
16: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:ca:9b:f6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd42:c50e:d848:412c:216:3eff:feca:9bf6/64 scope global dynamic mngtmpaddr 
       valid_lft 3599sec preferred_lft 3599sec
    inet6 fe80::216:3eff:feca:9bf6/64 scope link 
       valid_lft forever preferred_lft forever

-> ip r

[no output]

@tomponline

tomponline commented 3 years ago

Please can you show sudo nft list ruleset.

tomponline commented 3 years ago

Also can you confirm that running lxc exec <container> -- dhclient doesn't result in an IPv4 address being allocated?

Mohamedemad4 commented 3 years ago

Small update: one of the containers got an IPv4 address automatically (I'm still measuring how long it takes, but it's measured in minutes, not milliseconds).

Here is the ip r output from it:

default via 10.222.79.1 dev eth0 proto dhcp src 10.222.79.10 metric 100 
10.222.79.0/24 dev eth0 proto kernel scope link src 10.222.79.10 
10.222.79.1 dev eth0 proto dhcp scope link src 10.222.79.10 metric 100 
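
One way to confirm that lease actually came from LXD's dnsmasq is to inspect the lease file, whose path appears in the dnsmasq command line shown earlier:

cat /var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases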

The output of nft list ruleset (by the way, I had to install nftables first):

table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
    }

    chain forward {
        type filter hook forward priority filter; policy accept;
    }

    chain output {
        type filter hook output priority filter; policy accept;
    }
}

lxc exec <container> -- dhclient

lxc exec <container> -- dhclient actually results in an IPv4 address being allocated, but I noticed the whole exec flow is sluggish. According to lxc monitor, it hangs at "Started mirroring websocket" for approximately 2 minutes before the flow continues and the command is executed.

This is true for all commands run on this instance (even after the container gets an IPv4 address).

@tomponline
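
For reference, the event stream mentioned above can be watched from the host with something like:

lxc monitor --pretty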

tomponline commented 3 years ago

OK, so there is something wrong inside your container; it's not requesting a DHCP allocation.

I would double-check what is running inside the container and ensure the network configuration is correct; also check journalctl inside the container for any errors.
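
For an Ubuntu 20.04 container like the ones above, that check might look like this (a sketch; assumes the stock cloud image, which manages eth0 via systemd-networkd):

# Inspect network state and logs from inside the container
lxc exec container-a8c0-es25-jpck -- networkctl status eth0
lxc exec container-a8c0-es25-jpck -- journalctl -u systemd-networkd --no-pager
lxc exec container-a8c0-es25-jpck -- journalctl -p err -b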

Mohamedemad4 commented 3 years ago

So upon checking the logs, I discovered the issue: the main ZFS pool was out of space. 😅 Closing.
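
For reference, free space on the pool named in the daemon log above ("z_container_pool", which also logged "run out of disk space") can be checked with:

zpool list z_container_pool
zfs list -o name,used,avail z_container_pool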