Closed ledlamp closed 1 year ago
So to deal with it (as a work-around), there has to be an interface called lxdbr0: either leave the unneeded one it creates, or name your unmanaged interface lxdbr0 if you can. Otherwise it'll create one and mess up your default profile.
Confirmed issue. This is really odd.
I've confirmed that killing the lxd process after initializing it with the existing lxdbr1 interface, and then triggering the process to be restarted by running `lxc ls`, doesn't create it. So I think we can rule this out as an actual LXD issue; it looks like an external or packaging issue instead.
Additionally, doing `snap stop lxd` and then `snap start lxd` doesn't trigger it either.
Furthermore, it's even easier to reproduce:

```
snap install lxd
lxc network ls # No lxdbr0
snap restart lxd
lxc network ls # Shows lxdbr0 managed network
```

So it's something triggered by `snap restart lxd`.
I've also confirmed that we can actually see API requests coming into LXD upon `snap restart lxd` that inspect the existing networks and create lxdbr0:
```
# Getting network list
Jun 29 06:58:13 vtest lxd.daemon[9910]: time="2023-06-29T06:58:13Z" level=debug msg="Handling API request" ip=@ method=GET protocol=unix url="/1.0/networks?recursion=1" username=root
Jun 29 06:58:13 vtest lxd.daemon[9910]: time="2023-06-29T06:58:13Z" level=debug msg="WriteJSON\n\t{\n\t\t\"type\": \"sync\",\n\t\t\"status\": \"Success\",\n\t\t\"status_code\": 200,\n\t\t\"operation\": \"\",\n\t\t\"error_code\": 0,\n\t\t\"error\": \"\",\n\t\t\"metadata\": [\n\t\t\t{\n\t\t\t\t\"config\": {},\n\t\t\t\t\"description\": \"\",\n\t\t\t\t\"name\": \"lo\",\n\t\t\t\t\"type\": \"loopback\",\n\t\t\t\t\"used_by\": [],\n\t\t\t\t\"managed\": false,\n\t\t\t\t\"status\": \"\",\n\t\t\t\t\"locations\": null\n\t\t\t},\n\t\t\t{\n\t\t\t\t\"config\": {},\n\t\t\t\t\"description\": \"\",\n\t\t\t\t\"name\": \"enp5s0\",\n\t\t\t\t\"type\": \"physical\",\n\t\t\t\t\"used_by\": null,\n\t\t\t\t\"managed\": false,\n\t\t\t\t\"status\": \"\",\n\t\t\t\t\"locations\": null\n\t\t\t},\n\t\t\t{\n\t\t\t\t\"config\": {},\n\t\t\t\t\"description\": \"\",\n\t\t\t\t\"name\": \"lxdbr1\",\n\t\t\t\t\"type\": \"bridge\",\n\t\t\t\t\"used_by\": null,\n\t\t\t\t\"managed\": false,\n\t\t\t\t\"status\": \"\",\n\t\t\t\t\"locations\": null\n\t\t\t}\n\t\t]\n\t}" http_code=200

# Create lxdbr0 request
Jun 29 06:58:13 vtest lxd.daemon[9910]: time="2023-06-29T06:58:13Z" level=debug msg="Handling API request" ip=@ method=POST protocol=unix url=/1.0/networks username=root
Jun 29 06:58:13 vtest lxd.daemon[9910]: time="2023-06-29T06:58:13Z" level=debug msg="API Request\n\t{\n\t\t\"config\": {},\n\t\t\"description\": \"\",\n\t\t\"name\": \"lxdbr0\",\n\t\t\"type\": \"bridge\"\n\t}" ip=@ method=POST protocol=unix url=/1.0/networks username=root
```
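Putting the two requests together, the caller's logic appears to be: fetch the network list, and if nothing in it is managed, create a default lxdbr0 bridge. A rough Python sketch of that decision, using the network list from the log above (the function name is mine, not anything in LXD):

```python
import json

# Network list as returned by GET /1.0/networks?recursion=1 in the log above
# (only the fields used here are kept)
networks = [
    {"name": "lo", "type": "loopback", "managed": False},
    {"name": "enp5s0", "type": "physical", "managed": False},
    {"name": "lxdbr1", "type": "bridge", "managed": False},
]

def needs_default_bridge(networks):
    """True when no managed network exists -- the condition under which
    something decides to POST a new lxdbr0 bridge."""
    return not any(net["managed"] for net in networks)

if needs_default_bridge(networks):
    # Matches the POST /1.0/networks body seen in the log above
    create_request = {"config": {}, "description": "", "name": "lxdbr0", "type": "bridge"}
    print(json.dumps(create_request))
```

Note that the unmanaged lxdbr1 bridge doesn't count here, which is exactly why initializing against an existing bridge still triggers the creation.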
FWIW we don't tend to recommend using `snap restart lxd` because it will stop any running instances. Instead we tend to use:

```
sudo systemctl reload snap.lxd.daemon
```

which just restarts the running LXD daemon and not the instances.
This doesn't appear to trigger the issue either.
@stgraber any ideas here? I'm at a bit of a loss. My only guess is that it's something to do with either lxd-user or lxd-migrate (the lxd-migrate inside the snap that migrates from the apt package), as that does create a bridge.
I also observe this in the logs (not sure if it's relevant):
```
Jun 29 07:29:44 vtest audit[3869]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="snap.lxd.migrate" pid=3869 comm="apparmor_parser"
Jun 29 07:29:49 vtest audit[3966]: AVC apparmor="STATUS" operation="profile_replace" profile="unconfined" name="snap.lxd.migrate" pid=3966 comm="apparmor_parser"
```
This suggests it may be being run.
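For anyone sifting a larger journal dump for the same hint, the relevant fields can be pulled out like this (a quick sketch; the regex is mine and only matches the attribute layout shown above):

```python
import re

# The two AVC audit lines from the journal above
log = '''Jun 29 07:29:44 vtest audit[3869]: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="snap.lxd.migrate" pid=3869 comm="apparmor_parser"
Jun 29 07:29:49 vtest audit[3966]: AVC apparmor="STATUS" operation="profile_replace" profile="unconfined" name="snap.lxd.migrate" pid=3966 comm="apparmor_parser"'''

# Extract the operation and the apparmor profile name from each line
pattern = re.compile(r'operation="(?P<op>[^"]+)".*\bname="(?P<name>[^"]+)"')

events = [m.groupdict() for m in map(pattern.search, log.splitlines()) if m]

# Profiles being loaded/replaced for snap.lxd.migrate hint that lxd-migrate ran
migrate_events = [e for e in events if e["name"] == "snap.lxd.migrate"]
print(migrate_events)
```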
Something is creating a lxdbr0 bridge on `snap restart lxd` if no managed networks exist. It then goes on to add/replace an eth0 NIC device connected to that network in the default profile.
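The profile side of this amounts to overwriting any existing eth0 entry in the default profile's devices. A hedged sketch of that effect (the function is illustrative, not LXD's actual code; the dict shapes follow `lxc profile show default` output):

```python
def attach_default_bridge(profile, network="lxdbr0"):
    """Add or replace an eth0 NIC pointing at the given network,
    mirroring what happens to the default profile on restart."""
    devices = dict(profile.get("devices", {}))
    devices["eth0"] = {"name": "eth0", "network": network, "type": "nic"}
    return {**profile, "devices": devices}

# A default profile that was deliberately wired to lxdbr1 during lxd init...
profile = {
    "name": "default",
    "devices": {"eth0": {"name": "eth0", "network": "lxdbr1", "type": "nic"}},
}

# ...silently ends up on lxdbr0 instead
print(attach_default_bridge(profile)["devices"]["eth0"]["network"])  # prints: lxdbr0
```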
Speaking with @stgraber, he confirmed this is a bug in the `snap restart` command, as it starts sub-units (like the lxd-user process) even if they weren't previously running (normally they start via socket activation).
@gabrielmougard please can you open a bug with the snapd team for this https://bugs.launchpad.net/snapd/+filebug ?
Thanks
@ru-fu @gabrielmougard we should change the reference in the docs from `snap restart --reload lxd` to `snap restart --reload lxd.daemon`, so we don't steer users into discovering this external bug in snapd.
@ru-fu I don't see any mentions of `snap restart --reload lxd` in our docs, but there is the "Install LXD from a package" section here. Should we add a "Restart a snap LXD deployment" sub-section below it?
I believe @ru-fu fixed it already
I fixed the `snap restart` occurrences, yes.
It might be a good idea to add a section about how to restart LXD. But I'm not sure if the installing page is the best place for it ... Is there a common scenario where you need to restart LXD? Maybe after server config changes?
Required information
Issue description

LXD is creating an `lxdbr0` interface and reconfiguring the default profile on restart if it does not exist, even if it is not wanted and was not asked for during `lxd init`.

Steps to reproduce

1. `snap install lxd`
2. `lxd init`, but use an existing bridge (i.e. lxdbr1) instead of creating a new one
3. `lxc network ls` and `lxc profile show default`. There is no `lxdbr0` and the profile is configured as desired.
4. `snap restart lxd`
5. `lxc network ls` and `lxc profile show default`. A new `lxdbr0` now exists and the default profile was changed to use it. The user is now very angry.

Information to attach
- Any relevant kernel output (`dmesg`)
- Container log (`lxc info NAME --show-log`)
- Container configuration (`lxc config show NAME --expanded`)
- Output of the daemon with `--debug` (or use `lxc monitor` while reproducing the issue)