canonical / lxd

lxd hangs after reboot, need to restart manually #1811

Closed wsw70 closed 8 years ago

wsw70 commented 8 years ago

Required information

There are two containers bridged to my own bridges (br1 and br2). lxdbr0 is not used (though it appears in one log; see the bottom of this report).

root@srv:~# lxc info
apicompat: 0
auth: trusted
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFvDCCA6SgAwIBAgIQO+DuG065MVSD8xO/Lg3eeTANBgkqhkiG9w0BAQsFADAx
    (...)
    C2o8N+CjpYzO3MiO3wviLra/O7zBpayfBSylBn93BNk=
    -----END CERTIFICATE-----
  driver: lxc
  driverversion: 2.0.0.rc13
  kernel: Linux
  kernelarchitecture: x86_64
  kernelversion: 4.2.0-34-generic
  server: lxd
  serverpid: 15670
  serverversion: 2.0.0.rc6
  storage: dir
  storageversion: ""
config: {}
public: false

Issue description

After an update of lxd and lxc yesterday (from the standard Ubuntu repository), lxd fails to start after a reboot. It has to be restarted manually via service lxd stop ; service lxd start, after which everything returns to normal. Before the update, the exact same configuration worked perfectly.

Steps to reproduce

  1. (update from the repositories to the current version)
  2. reboot
  3. lxd does not work, no containers are mounted, syslog info attached below
  4. # service lxd stop ; service lxd start
  5. lxd is functional again, all containers are automatically mounted

Information to attach

• [ ] any relevant kernel output (dmesg)

dmesg boot portion:

[   18.608640] audit: type=1400 audit(1459068120.720:12): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/bin/lxc-start" pid=1390 comm="apparmor_parser"
[   18.620382] audit: type=1400 audit(1459068120.732:13): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default" pid=1397 comm="apparmor_parser"
[   18.620387] audit: type=1400 audit(1459068120.732:14): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-cgns" pid=1397 comm="apparmor_parser"
[   18.620390] audit: type=1400 audit(1459068120.732:15): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-mounting" pid=1397 comm="apparmor_parser"
[   18.620392] audit: type=1400 audit(1459068120.732:16): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-nesting" pid=1397 comm="apparmor_parser"
[   18.667053] systemd[1]: lxd-bridge.service: Failed with result 'start-limit'.

dmesg portion upon manual restart of lxd

[  331.215308] audit: type=1400 audit(1459068433.716:17): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/bin/lxc-start" pid=10683 comm="apparmor_parser"
[  331.218891] audit: type=1400 audit(1459068433.720:18): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default" pid=10687 comm="apparmor_parser"
[  331.218897] audit: type=1400 audit(1459068433.720:19): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-cgns" pid=10687 comm="apparmor_parser"
[  331.218900] audit: type=1400 audit(1459068433.720:20): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-mounting" pid=10687 comm="apparmor_parser"
[  331.218903] audit: type=1400 audit(1459068433.720:21): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-nesting" pid=10687 comm="apparmor_parser"
[  353.485863] audit: type=1400 audit(1459068455.960:22): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/bin/lxc-start" pid=11107 comm="apparmor_parser"
[  353.489531] audit: type=1400 audit(1459068455.964:23): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default" pid=11111 comm="apparmor_parser"
[  353.489536] audit: type=1400 audit(1459068455.964:24): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-cgns" pid=11111 comm="apparmor_parser"
[  353.489539] audit: type=1400 audit(1459068455.964:25): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-mounting" pid=11111 comm="apparmor_parser"
[  353.489542] audit: type=1400 audit(1459068455.964:26): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-nesting" pid=11111 comm="apparmor_parser"
[  353.530775] audit: type=1400 audit(1459068456.008:27): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-minecraft_</var/lib/lxd>" pid=11152 comm="apparmor_parser"
[  353.540898] device vethC1GSGX entered promiscuous mode
[  353.540941] br2: port 1(vethC1GSGX) entered forwarding state
[  353.540945] br2: port 1(vethC1GSGX) entered forwarding state
[  353.542180] br2: port 1(vethC1GSGX) entered disabled state
[  353.603038] eth0: renamed from vethWWXDHT
[  353.662772] br2: port 1(vethC1GSGX) entered forwarding state
[  353.662777] br2: port 1(vethC1GSGX) entered forwarding state
[  354.096626] audit: type=1400 audit(1459068456.572:28): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-vpn_</var/lib/lxd>" pid=11424 comm="apparmor_parser"
[  354.104871] device veth5SAO4W entered promiscuous mode
[  354.104913] br1: port 1(veth5SAO4W) entered forwarding state
[  354.104917] br1: port 1(veth5SAO4W) entered forwarding state
[  354.178211] eth0: renamed from veth7YKPW6

The containers start fine after lxd restart. I can attach the log if needed.

t=2016-03-27T10:08:23+0200 lvl=info msg="LXD is starting in normal mode" path=/var/lib/lxd
t=2016-03-27T10:08:23+0200 lvl=warn msg="Couldn't find the CGroup pids controller, process limits will be ignored."
t=2016-03-27T10:08:23+0200 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
t=2016-03-27T10:08:23+0200 lvl=info msg="Default uid/gid map:"
t=2016-03-27T10:08:23+0200 lvl=info msg=" - u 0 165536 65536"
t=2016-03-27T10:08:23+0200 lvl=info msg=" - g 0 165536 65536"
t=2016-03-27T10:08:23+0200 lvl=info msg=Init driver=storage/dir
t=2016-03-27T10:08:23+0200 lvl=info msg="Looking for existing certificates" cert=/var/lib/lxd/server.crt key=/var/lib/lxd/server.key
t=2016-03-27T10:08:23+0200 lvl=info msg="LXD is socket activated"
t=2016-03-27T10:08:23+0200 lvl=info msg="REST API daemon:"
t=2016-03-27T10:08:23+0200 lvl=info msg=" - binding Unix socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:08:23+0200 lvl=info msg=handling ip=@ method=GET url=/1.0
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:08:23+0200 lvl=info msg=handling url=/1.0 ip=@ method=GET
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/internal/ready ip=@
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/internal/containers/2/onstart ip=@
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:08:23+0200 lvl=info msg=handling method=GET url=/internal/containers/1/onstart ip=@
t=2016-03-27T10:08:48+0200 lvl=info msg="Received 'terminated signal', exiting."
t=2016-03-27T10:08:48+0200 lvl=info msg="Stopping REST API handler:"
t=2016-03-27T10:08:48+0200 lvl=info msg=" - skipping socket-activated socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:26:38+0200 lvl=info msg="LXD is starting in normal mode" path=/var/lib/lxd
t=2016-03-27T10:26:38+0200 lvl=warn msg="Couldn't find the CGroup pids controller, process limits will be ignored."
t=2016-03-27T10:26:38+0200 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
t=2016-03-27T10:26:38+0200 lvl=info msg="Default uid/gid map:"
t=2016-03-27T10:26:38+0200 lvl=info msg=" - u 0 165536 65536"
t=2016-03-27T10:26:38+0200 lvl=info msg=" - g 0 165536 65536"
t=2016-03-27T10:26:38+0200 lvl=info msg=Init driver=storage/dir
t=2016-03-27T10:26:38+0200 lvl=info msg="Looking for existing certificates" cert=/var/lib/lxd/server.crt key=/var/lib/lxd/server.key
t=2016-03-27T10:26:38+0200 lvl=info msg="LXD is socket activated"
t=2016-03-27T10:26:38+0200 lvl=info msg="REST API daemon:"
t=2016-03-27T10:26:38+0200 lvl=info msg=" - binding Unix socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:26:38+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:26:38+0200 lvl=info msg=handling method=GET url=/internal/ready ip=@
t=2016-03-27T10:26:38+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:26:38+0200 lvl=info msg=handling method=GET url=/internal/containers/2/onstart ip=@
t=2016-03-27T10:26:39+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:26:39+0200 lvl=info msg=handling method=GET url=/internal/containers/1/onstart ip=@
t=2016-03-27T10:27:41+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:27:41+0200 lvl=info msg=handling method=GET url="/1.0/containers?recursion=1" ip=@
t=2016-03-27T10:27:41+0200 lvl=info msg=handling url="/1.0/containers/minecraft/snapshots?recursion=1" ip=@ method=GET
t=2016-03-27T10:27:41+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/state ip=@
t=2016-03-27T10:27:41+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T10:27:41+0200 lvl=info msg=handling method=GET url=/1.0/containers/minecraft/state ip=@
t=2016-03-27T10:28:10+0200 lvl=info msg=handling url=/1.0 ip=@ method=GET
t=2016-03-27T10:28:10+0200 lvl=info msg=handling method=POST url=/1.0/containers/vpn/exec ip=@
t=2016-03-27T10:28:10+0200 lvl=info msg=handling url="/1.0/operations/8e6bc2e9-ca49-405a-aac4-74d30c6abfec/websocket?secret=bf6de7f209506586082174fe718abd9d32b966fb9d5d9c601cff8de642b7e8cf" ip=@ method=GET
t=2016-03-27T10:28:10+0200 lvl=info msg=handling ip=@ method=GET url="/1.0/operations/8e6bc2e9-ca49-405a-aac4-74d30c6abfec/websocket?secret=f8d947d71fb4b74dbc663e9ed7b46f5f0c76bfb579cd00c0e252f5f4e7ed2bf6"
t=2016-03-27T10:33:00+0200 lvl=info msg=handling method=GET url=/1.0/operations/8e6bc2e9-ca49-405a-aac4-74d30c6abfec/wait ip=@
t=2016-03-27T10:34:14+0200 lvl=info msg="Received 'terminated signal', exiting."
t=2016-03-27T10:34:14+0200 lvl=info msg="Stopping REST API handler:"
t=2016-03-27T10:34:14+0200 lvl=info msg=" - skipping socket-activated socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:47:35+0200 lvl=info msg="LXD is starting in normal mode" path=/var/lib/lxd
t=2016-03-27T10:47:35+0200 lvl=warn msg="Couldn't find the CGroup pids controller, process limits will be ignored."
t=2016-03-27T10:47:35+0200 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
t=2016-03-27T10:47:35+0200 lvl=info msg="Default uid/gid map:"
t=2016-03-27T10:47:35+0200 lvl=info msg=" - u 0 165536 65536"
t=2016-03-27T10:47:35+0200 lvl=info msg=" - g 0 165536 65536"
t=2016-03-27T10:47:35+0200 lvl=info msg=Init driver=storage/dir
t=2016-03-27T10:47:35+0200 lvl=info msg="Looking for existing certificates" cert=/var/lib/lxd/server.crt key=/var/lib/lxd/server.key
t=2016-03-27T10:47:35+0200 lvl=info msg="LXD is socket activated"
t=2016-03-27T10:47:35+0200 lvl=info msg="REST API daemon:"
t=2016-03-27T10:47:35+0200 lvl=info msg=" - binding Unix socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:47:35+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:47:35+0200 lvl=info msg=handling method=GET url=/internal/ready ip=@
t=2016-03-27T10:47:36+0200 lvl=info msg=handling ip=@ method=GET url=/1.0
t=2016-03-27T10:47:36+0200 lvl=info msg=handling method=GET url=/internal/containers/2/onstart ip=@
t=2016-03-27T10:47:36+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:47:36+0200 lvl=info msg=handling method=GET url=/internal/containers/1/onstart ip=@
t=2016-03-27T10:47:37+0200 lvl=info msg=handling url=/1.0 ip=@ method=GET
t=2016-03-27T10:47:37+0200 lvl=info msg=handling method=GET url="/1.0/containers?recursion=1" ip=@
t=2016-03-27T10:47:37+0200 lvl=info msg=handling url=/1.0/containers/minecraft/state ip=@ method=GET
t=2016-03-27T10:47:37+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T10:47:37+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/state ip=@
t=2016-03-27T10:47:37+0200 lvl=info msg=handling ip=@ method=GET url="/1.0/containers/minecraft/snapshots?recursion=1"
t=2016-03-27T10:47:39+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:47:39+0200 lvl=info msg=handling url="/1.0/containers?recursion=1" ip=@ method=GET
t=2016-03-27T10:47:39+0200 lvl=info msg=handling method=GET url="/1.0/containers/minecraft/snapshots?recursion=1" ip=@
t=2016-03-27T10:47:39+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T10:47:39+0200 lvl=info msg=handling ip=@ method=GET url=/1.0/containers/vpn/state
t=2016-03-27T10:47:39+0200 lvl=info msg=handling method=GET url=/1.0/containers/minecraft/state ip=@
t=2016-03-27T10:47:41+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:47:41+0200 lvl=info msg=handling method=GET url="/1.0/containers?recursion=1" ip=@
t=2016-03-27T10:47:41+0200 lvl=info msg=handling url="/1.0/containers/minecraft/snapshots?recursion=1" ip=@ method=GET
t=2016-03-27T10:47:41+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T10:47:41+0200 lvl=info msg=handling method=GET url=/1.0/containers/minecraft/state ip=@
t=2016-03-27T10:47:41+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/state ip=@
t=2016-03-27T10:48:17+0200 lvl=info msg="Received 'terminated signal', exiting."
t=2016-03-27T10:48:17+0200 lvl=info msg="Stopping REST API handler:"
t=2016-03-27T10:48:17+0200 lvl=info msg=" - skipping socket-activated socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:48:32+0200 lvl=info msg="LXD is starting in normal mode" path=/var/lib/lxd
t=2016-03-27T10:48:32+0200 lvl=warn msg="Couldn't find the CGroup pids controller, process limits will be ignored."
t=2016-03-27T10:48:32+0200 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
t=2016-03-27T10:48:32+0200 lvl=info msg="Default uid/gid map:"
t=2016-03-27T10:48:32+0200 lvl=info msg=" - u 0 165536 65536"
t=2016-03-27T10:48:32+0200 lvl=info msg=" - g 0 165536 65536"
t=2016-03-27T10:48:32+0200 lvl=info msg=Init driver=storage/dir
t=2016-03-27T10:48:32+0200 lvl=info msg="Looking for existing certificates" cert=/var/lib/lxd/server.crt key=/var/lib/lxd/server.key
t=2016-03-27T10:48:32+0200 lvl=info msg="LXD is socket activated"
t=2016-03-27T10:48:32+0200 lvl=info msg="REST API daemon:"
t=2016-03-27T10:48:32+0200 lvl=info msg=" - binding Unix socket" socket=/var/lib/lxd/unix.socket
t=2016-03-27T10:48:32+0200 lvl=info msg=handling url=/1.0 ip=@ method=GET
t=2016-03-27T10:48:32+0200 lvl=info msg=handling ip=@ method=GET url=/internal/ready
t=2016-03-27T10:48:56+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T10:48:56+0200 lvl=info msg=handling ip=@ method=GET url="/1.0/containers?recursion=1"
t=2016-03-27T10:48:56+0200 lvl=info msg=handling method=GET url="/1.0/containers/minecraft/snapshots?recursion=1" ip=@
t=2016-03-27T10:48:56+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T10:48:56+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/state ip=@
t=2016-03-27T10:48:56+0200 lvl=info msg=handling url=/1.0/containers/minecraft/state ip=@ method=GET
t=2016-03-27T11:03:25+0200 lvl=info msg=handling method=GET url=/1.0 ip=@
t=2016-03-27T11:03:25+0200 lvl=info msg=handling url=/1.0 ip=@ method=GET
t=2016-03-27T11:12:47+0200 lvl=info msg=handling ip=@ method=GET url=/1.0
t=2016-03-27T11:12:47+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn ip=@
t=2016-03-27T11:12:47+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/state ip=@
t=2016-03-27T11:12:47+0200 lvl=info msg=handling ip=@ method=GET url="/1.0/containers/vpn/snapshots?recursion=1"
t=2016-03-27T11:12:47+0200 lvl=info msg=handling ip=@ method=GET url=/1.0/containers/vpn/logs/lxc.log
t=2016-03-27T11:13:18+0200 lvl=info msg=handling ip=@ method=GET url=/1.0
t=2016-03-27T11:13:18+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn ip=@
t=2016-03-27T11:13:18+0200 lvl=info msg=handling ip=@ method=GET url=/1.0/containers/vpn/state
t=2016-03-27T11:13:18+0200 lvl=info msg=handling method=GET url="/1.0/containers/vpn/snapshots?recursion=1" ip=@
t=2016-03-27T11:13:18+0200 lvl=info msg=handling method=GET url=/1.0/containers/vpn/logs/lxc.log ip=@

Not sure how to do that on a running client / server?

After reboot, the log is full of lines as follows:

Mar 27 10:47:28 srv systemd[1]: lxd.service: Job lxd.service/start failed with result 'dependency'.
Mar 27 10:47:28 srv systemd[1]: lxd-bridge.service: Failed with result 'start-limit'.
Mar 27 10:47:28 srv systemd[1]: lxd-bridge.service: Start request repeated too quickly.
Mar 27 10:47:28 srv systemd[1]: Failed to start LXD - network bridge.
Mar 27 10:47:28 srv systemd[1]: Dependency failed for LXD - main daemon.
Mar 27 10:47:28 srv systemd[1]: lxd.service: Job lxd.service/start failed with result 'dependency'.
Mar 27 10:47:28 srv systemd[1]: lxd-bridge.service: Failed with result 'start-limit'.
Mar 27 10:47:28 srv systemd[1]: lxd-bridge.service: Start request repeated too quickly.

Upon restart of lxd

Mar 27 10:47:28 srv systemd[1]: Stopped LXD - main daemon.
Mar 27 10:47:35 srv systemd[1]: Starting LXD - unix socket.
Mar 27 10:47:35 srv systemd[1]: Listening on LXD - unix socket.
Mar 27 10:47:35 srv systemd[1]: Starting LXD - main daemon...
Mar 27 10:47:35 srv kernel: [  353.485863] audit: type=1400 audit(1459068455.960:22): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/bin/lxc-start" pid=11107 comm="apparmor_parser"
Mar 27 10:47:35 srv kernel: [  353.489531] audit: type=1400 audit(1459068455.964:23): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default" pid=11111 comm="apparmor_parser"
Mar 27 10:47:35 srv kernel: [  353.489536] audit: type=1400 audit(1459068455.964:24): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-cgns" pid=11111 comm="apparmor_parser"
Mar 27 10:47:35 srv kernel: [  353.489539] audit: type=1400 audit(1459068455.964:25): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-mounting" pid=11111 comm="apparmor_parser"
Mar 27 10:47:35 srv kernel: [  353.489542] audit: type=1400 audit(1459068455.964:26): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-nesting" pid=11111 comm="apparmor_parser"
Mar 27 10:47:35 srv lxd[11113]: t=2016-03-27T10:47:35+0200 lvl=warn msg="Couldn't find the CGroup pids controller, process limits will be ignored."
Mar 27 10:47:35 srv lxd[11113]: t=2016-03-27T10:47:35+0200 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
Mar 27 10:47:35 srv systemd[1]: Started LXD - main daemon.
Mar 27 10:47:36 srv kernel: [  353.530775] audit: type=1400 audit(1459068456.008:27): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-minecraft_</var/lib/lxd>" pid=11152 comm="apparmor_parser"
Mar 27 10:47:36 srv kernel: [  353.540898] device vethC1GSGX entered promiscuous mode
Mar 27 10:47:36 srv kernel: [  353.540941] br2: port 1(vethC1GSGX) entered forwarding state
Mar 27 10:47:36 srv kernel: [  353.540945] br2: port 1(vethC1GSGX) entered forwarding state
Mar 27 10:47:36 srv systemd-udevd[11159]: Could not generate persistent MAC address for vethWWXDHT: No such file or directory
Mar 27 10:47:36 srv kernel: [  353.542180] br2: port 1(vethC1GSGX) entered disabled state
Mar 27 10:47:36 srv kernel: [  353.603038] eth0: renamed from vethWWXDHT
Mar 27 10:47:36 srv systemd[1]: proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 11162 (lxd)
Mar 27 10:47:36 srv systemd[1]: Mounting Arbitrary Executable File Formats File System...
Mar 27 10:47:36 srv kernel: [  353.662772] br2: port 1(vethC1GSGX) entered forwarding state
Mar 27 10:47:36 srv kernel: [  353.662777] br2: port 1(vethC1GSGX) entered forwarding state
Mar 27 10:47:36 srv systemd[1]: Mounted Arbitrary Executable File Formats File System.
Mar 27 10:47:36 srv kernel: [  354.096626] audit: type=1400 audit(1459068456.572:28): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-vpn_</var/lib/lxd>" pid=11424 comm="apparmor_parser"
Mar 27 10:47:36 srv systemd-udevd[11428]: Could not generate persistent MAC address for veth7YKPW6: No such file or directory
Mar 27 10:47:36 srv kernel: [  354.104871] device veth5SAO4W entered promiscuous mode
Mar 27 10:47:36 srv kernel: [  354.104913] br1: port 1(veth5SAO4W) entered forwarding state
Mar 27 10:47:36 srv kernel: [  354.104917] br1: port 1(veth5SAO4W) entered forwarding state
Mar 27 10:47:36 srv kernel: [  354.178211] eth0: renamed from veth7YKPW6

journalctl -xe shows the following errors after boot:


Mar 27 10:47:17 srv systemd[1]: lxd.service: Job lxd.service/start failed with result 'dependency'.
Mar 27 10:47:17 srv systemd[1]: lxd.service: Job lxd.service/start failed with result 'dependency'.
Mar 27 10:47:17 srv systemd[1]: lxd.service: Job lxd.service/start failed with result 'dependency'.
Mar 27 10:47:20 srv lxd-bridge.start[10701]: RTNETLINK answers: Permission denied
Mar 27 10:47:20 srv lxd-bridge.start[10701]: Failed to setup lxd-bridge.
Mar 27 10:47:20 srv systemd-udevd[10709]: Could not generate persistent MAC address for lxdbr0: No such file or directory
Mar 27 10:47:20 srv lxd-bridge.start[10701]: iptables: Bad rule (does a matching rule exist in that chain?).
Mar 27 10:47:20 srv lxd-bridge.start[10701]: iptables: Bad rule (does a matching rule exist in that chain?).
Mar 27 10:47:20 srv lxd-bridge.start[10701]: iptables: Bad rule (does a matching rule exist in that chain?).
Mar 27 10:47:20 srv lxd-bridge.start[10701]: iptables: Bad rule (does a matching rule exist in that chain?).

I do not understand how lxdbr0 comes into the picture as it is disabled in /etc/default/lxc-net

stgraber commented 8 years ago

lxdbr0 is a new bridge that's being introduced, it has nothing to do with lxcbr0 and so doesn't get controlled by /etc/default/lxc-net.

That being said, the lxd-bridge init script should have started fine on your system, so the "Permission denied" up there is a bit worrying. I wonder if some kernel modules are failing to load dynamically on your system.

I'll try a clean Ubuntu 15.10 VM with the PPA later to see if I can reproduce it.

In the meantime, you can turn off lxdbr0 in /etc/default/lxd-bridge, which should fix things for you.
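For readers who want to script that change, here is a minimal sketch. It assumes the file has a line of the form USE_LXD_BRIDGE="true", which this thread does not confirm, so check the variable name in your own copy first. The function name disable_lxd_bridge is hypothetical.

```shell
#!/bin/sh
# Sketch: switch off the LXD-managed bridge in a defaults file.
# ASSUMPTION: the file contains a line like USE_LXD_BRIDGE="true"; the
# variable name is not confirmed by this thread, so verify it first.
disable_lxd_bridge() {
    conf="$1"
    # Keep a backup next to the original before rewriting it.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite the assignment; all other lines pass through unchanged.
    sed 's/^USE_LXD_BRIDGE=.*/USE_LXD_BRIDGE="false"/' "$conf.bak" > "$conf"
}
```

Run as root, this would be called as disable_lxd_bridge /etc/default/lxd-bridge, followed by a restart of lxd.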

antifuchs commented 8 years ago

I ran into a very similar problem, complete with the "RTNETLINK answers: Permission denied" error message. Some vigorous googling revealed that this might be due to IPv6. The top hit, an OpenStack post, recommends enabling IPv6 via sysctls; for some reason, I had already added these sysctl.conf entries before getting the error message:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

I removed these sysctl entries, and while I now no longer have working outside networking (that's why I disabled IPv6 in the first place), I now have a working lxdbr0 interface and lxc list doesn't hang anymore. I suppose that might have been the thing that triggered the hang for me.
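To check whether a given sysctl configuration file carries entries like the ones above, a small helper along these lines may be useful. It is a sketch only: the function name ipv6_disabled_scopes is invented for illustration, and it only recognises the dotted key form shown in this comment.

```shell
#!/bin/sh
# Sketch: report which scopes have IPv6 disabled in a sysctl config file.
# The function name is hypothetical; pass the file to inspect as $1.
ipv6_disabled_scopes() {
    # Print the scope (all, default, lo, or an interface name) of every
    # active "net.ipv6.conf.<scope>.disable_ipv6 = 1" line. Lines that are
    # commented out or set to 0 do not match and are skipped.
    sed -n 's/^[[:space:]]*net\.ipv6\.conf\.\([^.]*\)\.disable_ipv6[[:space:]]*=[[:space:]]*1[[:space:]]*$/\1/p' "$1"
}
```

For example, ipv6_disabled_scopes /etc/sysctl.conf prints one scope per line, and prints nothing if no such entry is active.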

wsw70 commented 8 years ago

@stgraber Thank you - editing /etc/default/lxd-bridge fixed the issue. I did not realize that lxdbr0 is a new bridge (the names are somewhat similar :))

wsw70 commented 8 years ago

@antifuchs I also have IPv6 disabled (with the same commands as you do). The solution/workaround by @stgraber fixed the issue for me, without the need to re-enable IPv6.

antifuchs commented 8 years ago

Thanks for the confirmation, @wsw70! Sounds like we've found the environmental factor that causes this. I found an alternative that works for me: Just disable IPv6 on the physical network interfaces themselves, not on default, or all (or lo, no idea why I did that).

Either solution works on my machine & I can now use lxc commands again (-:
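The per-interface alternative can be sketched as a helper that prints one sysctl line per physical interface, leaving all, default and lo untouched. The function name ipv6_disable_lines and the interface names in the usage note are placeholders, not taken from this thread.

```shell
#!/bin/sh
# Sketch: emit sysctl.conf lines that disable IPv6 only on the interfaces
# named as arguments. The function name is hypothetical.
ipv6_disable_lines() {
    for iface in "$@"; do
        printf 'net.ipv6.conf.%s.disable_ipv6 = 1\n' "$iface"
    done
}
```

For instance, ipv6_disable_lines eth0 eth1 prints two lines that could replace the all/default/lo entries in sysctl.conf. Substitute the names of your own physical interfaces.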

stgraber commented 8 years ago

@antifuchs oh, that's very interesting, I will do some tests with ipv6 disabled and adapt our scripts to cope with that.

wsw70 commented 8 years ago

@stgraber sorry, I may not have been clear enough: I had IPv6 disabled as well and did not need to modify this setting (= I keep IPv6 disabled) when applying your fix.

stgraber commented 8 years ago

@wsw70 yes, that's what I would expect. But what @antifuchs said suggests that if you wanted to actually use lxdbr0, that wouldn't work with IPv6 disabled, which is a bug.