canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

error creating lxdbr0. #3195

Closed ubuntu-server-builder closed 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1776958

Launchpad details
affected_projects = ['cloud-init (Ubuntu)']
assignee = paride
assignee_name = Paride Legovini
date_closed = 2020-11-24T17:58:32.663570+00:00
date_created = 2018-06-14T18:37:02.808491+00:00
date_fix_committed = 2020-08-28T02:59:13.772204+00:00
date_fix_released = 2020-11-24T17:58:32.663570+00:00
id = 1776958
importance = medium
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1776958
milestone = None
owner = smoser
owner_name = Scott Moser
private = False
status = fix_released
submitter = smoser
submitter_name = Scott Moser
tags = ['amd64', 'apport-bug', 'cosmic', 'uec-images']
duplicates = []

Launchpad user Scott Moser(smoser) wrote on 2018-06-14T18:37:02.808491+00:00

$ cat > my.yaml <<EOF
#cloud-config
lxd:
  init:
    storage_backend: dir
  bridge:
    mode: new
    name: lxdbr0
    ipv4_address: 10.100.100.1
    ipv4_netmask: 24
    ipv4_dhcp_first: 10.100.100.100
    ipv4_dhcp_last: 10.100.100.200
    ipv4_nat: true
    domain: lxd
EOF

$ name=c1
$ lxc launch ubuntu-daily:cosmic $name "--config=user.user-data=$(cat my.yaml)"
$ sleep 10
$ lxc exec $name cat /run/cloud-init/result.json
{
 "v1": {
  "datasource": "DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]",
  "errors": [
   "('lxd', ProcessExecutionError(\"Unexpected error while running command.\nCommand: ['lxc', 'network', 'create', 'lxdbr0', 'ipv4.address=10.100.100.1/24', 'ipv4.dhcp.ranges=10.100.100.100-10.100.100.200', 'ipv6.address=none', 'dns.domain=lxd', '--force-local']\nExit code: 1\nReason: -\nStdout: \nStderr: Error: The network already exists\",))"
  ]
 }
}

The integration test case tests/cloud_tests/testcases/modules/lxd_bridge.py demonstrates this failure on cosmic. It currently occurs only on cosmic, but it will occur anywhere with lxd 3.1.0.
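A quick way to check for this failure from a script is to read the `errors` list in result.json. The following is a minimal sketch, not part of cloud-init: `module_errors` is a hypothetical helper, and it assumes each entry is a stringified `(module, exception)` tuple like the one in the output above. The sample string is abbreviated.

```python
import json


def module_errors(result_text, module="lxd"):
    """Return the error strings in a result.json document that name `module`.

    Hypothetical helper: assumes each entry in v1.errors starts with
    "('<module>'," as in the output shown in this report.
    """
    result = json.loads(result_text)
    errors = result.get("v1", {}).get("errors", [])
    prefix = "('%s'," % module
    return [err for err in errors if err.startswith(prefix)]


# Abbreviated sample shaped like the result.json in this report.
sample = json.dumps({
    "v1": {
        "datasource": "DataSourceNoCloud",
        "errors": [
            "('lxd', ProcessExecutionError(\"Unexpected error while running "
            "command. Stderr: Error: The network already exists\",))"
        ],
    }
})

failed = module_errors(sample)
for err in failed:
    print(err)
```

In a real container the input would be the text of /run/cloud-init/result.json rather than the inline sample.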

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: cloud-init 18.2-64-gbbcc5e82-0ubuntu1
ProcVersionSignature: Ubuntu 4.15.0-22.24-generic 4.15.17
Uname: Linux 4.15.0-22-generic x86_64
ApportVersion: 2.20.10-0ubuntu3
Architecture: amd64
CloudName: LXD
Date: Thu Jun 14 18:34:15 2018
PackageArchitecture: all
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
SourcePackage: cloud-init
UpgradeStatus: No upgrade log present (probably fresh install)
cloud-init-log-warnings:
 2018-06-14 18:34:08,417 - util.py[WARNING]: Running module lxd (<module 'cloudinit.config.cc_lxd' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_lxd.py'>) failed
 cloudinit.util.ProcessExecutionError: Unexpected error while running command.
 Stderr: Error: The network already exists

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2018-06-14T18:37:02.808491+00:00

Launchpad attachments: NonfreeKernelModules.txt,Dependencies.txt,ProcCpuinfoMinimal.txt,cloud-init-output.log.txt.txt,logs.tgz.gz,lshw.txt.txt,user_data.txt.txt

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2018-06-14T18:38:16.601059+00:00

The change in behavior of lxd init was filed as an upstream issue at https://github.com/lxc/lxd/issues/4649 .

ubuntu-server-builder commented 1 year ago

Launchpad user Stéphane Graber(stgraber) wrote on 2018-06-14T19:47:06.906209+00:00

The cloud-init integration would benefit from using the --preseed feature of "lxd init" moving forward (on anything that's >= 3.0) as that should let you do pretty much straight yaml passthrough to LXD and avoid having to update the cloud-init schema every time a new feature is added.
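To illustrate the passthrough idea, here is a rough sketch of how the `bridge` settings from the reproducer could be expressed as a preseed-style document. It is not cloud-init's implementation: `bridge_to_preseed` is a hypothetical helper, it handles only the IPv4 keys used in this report, and the `networks` / `name` / `type` / `config` layout is an assumption about the preseed format that should be checked against the LXD documentation. The resulting dict would still have to be serialized to YAML and passed to `lxd init --preseed` on stdin.

```python
def bridge_to_preseed(bridge):
    """Map a cloud-config `lxd: bridge:` dict to a preseed-style dict.

    Hypothetical helper, IPv4 only. The config key names mirror the
    `lxc network create` arguments that appear in this report.
    """
    config = {}
    if bridge.get("ipv4_address") and bridge.get("ipv4_netmask"):
        config["ipv4.address"] = "%s/%s" % (
            bridge["ipv4_address"], bridge["ipv4_netmask"])
        if bridge.get("ipv4_dhcp_first") and bridge.get("ipv4_dhcp_last"):
            config["ipv4.dhcp.ranges"] = "%s-%s" % (
                bridge["ipv4_dhcp_first"], bridge["ipv4_dhcp_last"])
        if bridge.get("ipv4_nat"):
            # LXD config values are strings (assumption for the preseed form).
            config["ipv4.nat"] = "true"
    else:
        config["ipv4.address"] = "none"
    # IPv6 is not covered by this sketch, so it is disabled explicitly.
    config["ipv6.address"] = "none"
    if bridge.get("domain"):
        config["dns.domain"] = bridge["domain"]
    return {
        "networks": [
            {
                "name": bridge.get("name", "lxdbr0"),
                "type": "bridge",
                "config": config,
            }
        ]
    }


# The bridge section from the reproducer in this report.
bridge_cfg = {
    "mode": "new",
    "name": "lxdbr0",
    "ipv4_address": "10.100.100.1",
    "ipv4_netmask": 24,
    "ipv4_dhcp_first": "10.100.100.100",
    "ipv4_dhcp_last": "10.100.100.200",
    "ipv4_nat": True,
    "domain": "lxd",
}
preseed = bridge_to_preseed(bridge_cfg)
```

With real passthrough, as suggested in the comment above, no such mapping would be needed: the user would write the preseed YAML directly and cloud-init would hand it to LXD unchanged.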

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2018-06-16T02:03:19.775129+00:00

An upstream commit landed for this bug.

To view that commit see the following URL: https://git.launchpad.net/cloud-init/commit/?id=4ce67201

ubuntu-server-builder commented 1 year ago

Launchpad user Launchpad Janitor(janitor) wrote on 2018-06-16T17:34:40.340347+00:00

This bug was fixed in the package cloud-init - 18.2-77-g4ce67201-0ubuntu1


cloud-init (18.2-77-g4ce67201-0ubuntu1) cosmic; urgency=medium

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2018-06-20T18:05:31.717628+00:00

This bug is believed to be fixed in cloud-init in version 18.3. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2020-08-21T16:08:17.780987+00:00

This is happening again with Focal and Groovy:

tox -e citest -- run --os-name=focal --platform=lxd --preserve-data --data-dir=results --verbose --deb=cloud-init_20.2-134-g747723a4-1~bddeb_all.deb --test-config=tests/cloud_tests/testcases/modules/lxd_bridge.yaml

=============================================

2020-08-21 15:56:11,115 - subp.py[DEBUG]: Running command ['lxd', 'waitready', '--timeout=300'] with allowed return codes [0] (shell=False, capture=True)
2020-08-21 15:56:20,165 - subp.py[DEBUG]: Running command ['lxd', 'init', '--auto', '--storage-backend=dir'] with allowed return codes [0] (shell=False, capture=True)
2020-08-21 15:56:23,684 - subp.py[DEBUG]: Running command ['lxc', 'network', 'delete', 'lxdbr0', '--force-local'] with allowed return codes [0] (shell=False, capture=True)
2020-08-21 15:56:24,413 - cc_lxd.py[DEBUG]: Deletion of lxd network 'lxdbr0' failed. Assuming it did not exist.
2020-08-21 15:56:24,413 - subp.py[DEBUG]: Running command ['lxc', 'profile', 'device', 'remove', 'default', 'eth0', '--force-local'] with allowed return codes [0] (shell=False, capture=True)
2020-08-21 15:56:24,491 - cc_lxd.py[DEBUG]: Removal of device 'eth0' from profile 'default' succeeded.
2020-08-21 15:56:24,491 - cc_lxd.py[DEBUG]: Creating lxd bridge: network create lxdbr0 ipv4.address=10.100.100.1/24 ipv4.dhcp.ranges=10.100.100.100-10.100.100.200 ipv6.address=none dns.domain=lxd
2020-08-21 15:56:24,492 - subp.py[DEBUG]: Running command ['lxc', 'network', 'create', 'lxdbr0', 'ipv4.address=10.100.100.1/24', 'ipv4.dhcp.ranges=10.100.100.100-10.100.100.200', 'ipv6.address=none', 'dns.domain=lxd', '--force-local'] with allowed return codes [0] (shell=False, capture=True)
2020-08-21 15:56:24,569 - handlers.py[DEBUG]: finish: modules-final/config-lxd: FAIL: running config-lxd with frequency once-per-instance
2020-08-21 15:56:24,569 - util.py[WARNING]: Running module lxd (<module 'cloudinit.config.cc_lxd' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_lxd.py'>) failed
2020-08-21 15:56:24,569 - util.py[DEBUG]: Running module lxd (<module 'cloudinit.config.cc_lxd' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_lxd.py'>) failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 848, in _run_modules
    ran, _r = cc.run(run_name, mod.handle, func_args,
  File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
    return self._runners.run(name, functor, args, freq, clear_on_fail)
  File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 185, in run
    results = functor(*args)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_lxd.py", line 152, in handle
    _lxc(cmd_create)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_lxd.py", line 268, in _lxc
    subp.subp(['lxc'] + list(cmd) + ["--force-local"], update_env=env)
  File "/usr/lib/python3/dist-packages/cloudinit/subp.py", line 290, in subp
    raise ProcessExecutionError(stdout=out, stderr=err,
cloudinit.subp.ProcessExecutionError: Unexpected error while running command.
Command: ['lxc', 'network', 'create', 'lxdbr0', 'ipv4.address=10.100.100.1/24', 'ipv4.dhcp.ranges=10.100.100.100-10.100.100.200', 'ipv6.address=none', 'dns.domain=lxd', '--force-local']
Exit code: 1
Reason: -
Stdout: -
Stderr: Error: The network already exists

=============================================

So the cleanup code runs, but for some reason fails to delete the bridge. I tried running:

lxd init --auto --storage-backend=dir ; lxc network delete lxdbr0 --force-local

in a fresh Focal LXD container and it works (deletes lxdbr0), so I can't tell where the problem is yet.

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2020-08-21T16:12:06.452150+00:00

This was previously fixed by: https://github.com/canonical/cloud-init/commit/4ce67201

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2020-08-24T12:58:11.576997+00:00

It seems we require an extra step:

lxc network detach-profile lxdbr0 default

otherwise the deletion fails with:

Error: The network is currently in use.
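The ordering described in this comment can be sketched as follows. This is an illustration only, not the code from the eventual fix: `cleanup_commands` and `run_cleanup` are hypothetical names, and appending `--force-local` to the detach step is an assumption based on the other `lxc` invocations in the log above.

```python
import subprocess


def cleanup_commands(net_name="lxdbr0", profile="default"):
    """Return the lxc commands in the order they need to run.

    Hypothetical helper: the network is detached from the profile first,
    because deleting a network that is still attached fails with
    "The network is currently in use."
    """
    return [
        ["lxc", "network", "detach-profile", net_name, profile,
         "--force-local"],  # --force-local here is an assumption
        ["lxc", "network", "delete", net_name, "--force-local"],
    ]


def run_cleanup(net_name="lxdbr0", profile="default", runner=subprocess.run):
    """Run each cleanup command and return (command, exit code, stderr).

    Failures are reported rather than silently treated as "network did
    not exist", so the caller can tell the two cases apart.
    """
    results = []
    for cmd in cleanup_commands(net_name, profile):
        proc = runner(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode, proc.stderr))
    return results
```

Passing the runner as a parameter keeps the sketch testable without LXD installed; calling `run_cleanup()` with the default runner would execute the real commands.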

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2020-08-24T15:07:14.844799+00:00

https://github.com/canonical/cloud-init/pull/542

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2020-08-28T02:59:02.593282+00:00

An upstream commit landed for this bug. Thanks Paride! https://github.com/canonical/cloud-init/commit/1f3a225af78dbfbff75c3faad28a5dc8cad0d1e3

ubuntu-server-builder commented 1 year ago

Launchpad user Launchpad Janitor(janitor) wrote on 2020-09-16T04:35:45.003437+00:00

This bug was fixed in the package cloud-init - 20.3-15-g6d332e5c-0ubuntu1


cloud-init (20.3-15-g6d332e5c-0ubuntu1) groovy; urgency=medium

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2020-11-24T17:58:35.139684+00:00

This bug is believed to be fixed in cloud-init in version 20.4. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.