Closed sfeole closed 6 years ago
What version of Ubuntu is used inside the container?
There's really no reason for LXD 3.1 to make any difference here, so it's likely something else at play. We had cases (snapd) where udevadm would fail the first time but succeed the second time, so it may just be some kind of race.
It'd be good to know what udevadm is actually doing and failing on.
Hey Stephane, I'm using Bionic, but after further testing it's looking more like this is some sort of race, as you had suggested. I'm able to install and configure ceph-osd on LXD 3.0 / Bionic with no complaints from udevadm. Will close this out for now, as it appears to be more of an issue with the charm than with LXD.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04 LTS
Release:        18.04
Codename:       bionic
ubuntu@hotdog:~/openstack-on-lxd$ uname -a Linux hotdog 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:52 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
$ lxc --version 3.0.0
Model    Controller           Cloud/Region         Version  SLA
default  localhost-localhost  localhost/localhost  2.3.8    unsupported

App               Version  Status   Scale  Charm             Store       Rev  OS      Notes
ceilometer        10.0.0   waiting      1  ceilometer        jujucharms  253  ubuntu
ceilometer-agent           waiting      0  ceilometer-agent  jujucharms  244  ubuntu
ceph-mon          12.2.4   active       3  ceph-mon          jujucharms   25  ubuntu
ceph-osd          12.2.4   active       3  ceph-osd          jujucharms  262  ubuntu
ceph-radosgw      12.2.4   active       1  ceph-radosgw      jujucharms  258  ubuntu

Unit            Workload  Agent  Machine  Public address  Ports   Message
ceilometer/0    waiting   idle   0        10.111.158.119          Incomplete relations: messaging
ceph-mon/0      active    idle   1        10.111.158.183          Unit is ready and clustered
ceph-mon/1      active    idle   2        10.111.158.173          Unit is ready and clustered
ceph-mon/2      active    idle   3        10.111.158.188          Unit is ready and clustered
ceph-osd/0      active    idle   4        10.111.158.134          Unit is ready (1 OSD)
ceph-osd/1      active    idle   5        10.111.158.33           Unit is ready (1 OSD)
ceph-osd/2      active    idle   6        10.111.158.213          Unit is ready (1 OSD)
ceph-radosgw/0  active    idle   7        10.111.158.43   80/tcp  Unit is ready
Related Launchpad bug report.
@stgraber It's about this set of rules: https://github.com/openstack/charm-ceph-osd/blob/master/files/udev/95-charm-ceph-osd.rules
LXD 3.1 makes no difference.
Every container spawned by the OpenStack on LXD bundle has the same problem: udevadm control --reload-rules
exits with status code 2.
What is specific to the ceph-osd charm is that the install_udev_rules method introduced in 901b873 calls subprocess.check_call, which raises an exception if the subprocess returns a non-zero exit code. If subprocess.call(...)
is used instead, the deployment succeeds. Since this issue is a showstopper, I submitted the patch even though it's hackish.
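The difference between the two subprocess helpers is the crux of the patch: subprocess.check_call raises CalledProcessError on any non-zero exit, which aborts the install hook, while subprocess.call simply returns the exit code. A minimal sketch of the tolerant variant (the function name is illustrative, not the charm's actual code):

```python
import subprocess

def tolerant_call(cmd):
    """Run cmd, treating a non-zero exit as a warning rather than fatal.

    This mirrors the proposed fix: subprocess.call() returns the exit
    code instead of raising CalledProcessError the way check_call()
    does, so a failing `udevadm control --reload-rules` no longer
    aborts the charm's install hook inside a container.
    """
    rc = subprocess.call(cmd)
    if rc != 0:
        # Non-fatal: report and carry on (likely running in a container).
        print('%r exited with status %d; continuing anyway' % (cmd, rc))
    return rc
```

In the charm, the call site would then become something like tolerant_call(['udevadm', 'control', '--reload-rules']) in place of the check_call that currently raises.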
There is only one noticeable problem revealed in journalctl: mount: can't find LABEL=cloudimg-rootfs
Calling mount manually fails with the same error:
sudo mount -o remount /
mount: can't find LABEL=cloudimg-rootfs
Note: Process '/sbin/blkid -o udev -p /dev/sdb' failed with exit code 2.
sudo udevadm test /sys/class/block/sdb
calling: test
version 229
This program is for debugging only, it does not run any program
specified by a RUN key. It may show incorrect results, because
some values may be different, or not available at a simulation run.
=== trie on-disk ===
tool version: 229
file size: 6841778 bytes
header size 80 bytes
strings 1755242 bytes
nodes 5086456 bytes
Load module index
timestamp of '/etc/systemd/network' changed
timestamp of '/lib/systemd/network' changed
Parsed configuration file /lib/systemd/network/99-default.link
Created link configuration context.
timestamp of '/etc/udev/rules.d' changed
timestamp of '/lib/udev/rules.d' changed
Reading rules file: /lib/udev/rules.d/40-bridge-network-interface.rules
Reading rules file: /lib/udev/rules.d/40-vm-hotadd.rules
Reading rules file: /lib/udev/rules.d/50-apport.rules
Reading rules file: /lib/udev/rules.d/50-firmware.rules
Reading rules file: /lib/udev/rules.d/50-rbd.rules
Reading rules file: /lib/udev/rules.d/50-udev-default.rules
Reading rules file: /lib/udev/rules.d/55-dm.rules
Reading rules file: /lib/udev/rules.d/56-lvm.rules
Reading rules file: /lib/udev/rules.d/60-block.rules
Reading rules file: /lib/udev/rules.d/60-cdrom_id.rules
Reading rules file: /lib/udev/rules.d/60-ceph-by-parttypeuuid.rules
Reading rules file: /lib/udev/rules.d/60-drm.rules
Reading rules file: /lib/udev/rules.d/60-evdev.rules
Reading rules file: /lib/udev/rules.d/60-gnupg.rules
Reading rules file: /lib/udev/rules.d/60-open-vm-tools.rules
Reading rules file: /lib/udev/rules.d/60-persistent-alsa.rules
Reading rules file: /lib/udev/rules.d/60-persistent-input.rules
Reading rules file: /lib/udev/rules.d/60-persistent-storage-dm.rules
Reading rules file: /lib/udev/rules.d/60-persistent-storage-tape.rules
Reading rules file: /lib/udev/rules.d/60-persistent-storage.rules
Reading rules file: /lib/udev/rules.d/60-persistent-v4l.rules
Reading rules file: /lib/udev/rules.d/60-serial.rules
Reading rules file: /lib/udev/rules.d/60-vlan-network-interface.rules
Reading rules file: /lib/udev/rules.d/61-persistent-storage-android.rules
Reading rules file: /lib/udev/rules.d/63-md-raid-arrays.rules
Reading rules file: /lib/udev/rules.d/64-btrfs.rules
Reading rules file: /lib/udev/rules.d/64-md-raid-assembly.rules
Reading rules file: /lib/udev/rules.d/66-azure-ephemeral.rules
Reading rules file: /lib/udev/rules.d/66-snapd-autoimport.rules
Reading rules file: /lib/udev/rules.d/69-bcache.rules
Reading rules file: /lib/udev/rules.d/69-lvm-metad.rules
Reading rules file: /lib/udev/rules.d/70-debian-uaccess.rules
Reading rules file: /lib/udev/rules.d/70-iscsi-network-interface.rules
Reading rules file: /lib/udev/rules.d/70-mouse.rules
Skipping empty file: /etc/udev/rules.d/70-persistent-net.rules
Reading rules file: /lib/udev/rules.d/70-power-switch.rules
Reading rules file: /lib/udev/rules.d/70-resolvconf-initramfs-copy.rules
Reading rules file: /lib/udev/rules.d/70-uaccess.rules
Reading rules file: /lib/udev/rules.d/71-power-switch-proliant.rules
Reading rules file: /lib/udev/rules.d/71-seat.rules
Reading rules file: /lib/udev/rules.d/73-seat-late.rules
Reading rules file: /lib/udev/rules.d/73-special-net-names.rules
Reading rules file: /lib/udev/rules.d/73-usb-net-by-mac.rules
Reading rules file: /lib/udev/rules.d/75-net-description.rules
Reading rules file: /lib/udev/rules.d/75-probe_mtd.rules
Reading rules file: /lib/udev/rules.d/78-graphics-card.rules
Reading rules file: /lib/udev/rules.d/78-sound-card.rules
Reading rules file: /lib/udev/rules.d/80-debian-compat.rules
Reading rules file: /lib/udev/rules.d/80-drivers.rules
Reading rules file: /lib/udev/rules.d/80-ifupdown.rules
Reading rules file: /lib/udev/rules.d/80-net-setup-link.rules
Reading rules file: /lib/udev/rules.d/85-hdparm.rules
Reading rules file: /lib/udev/rules.d/85-keyboard-configuration.rules
Reading rules file: /lib/udev/rules.d/95-ceph-osd.rules
Reading rules file: /lib/udev/rules.d/95-charm-ceph-osd.rules
Reading rules file: /lib/udev/rules.d/99-systemd.rules
Reading rules file: /lib/udev/rules.d/99-vmware-scsi-udev.rules
rules contain 49152 bytes tokens (4096 * 12 bytes), 15701 bytes strings
2492 strings (32804 bytes), 1726 de-duplicated (17870 bytes), 767 trie nodes used
value '[dmi/id]sys_vendor' is 'Microsoft Corporation'
value '[dmi/id]product_name' is 'Virtual Machine'
GROUP 6 /lib/udev/rules.d/50-udev-default.rules:55
IMPORT '/sbin/blkid -o udev -p /dev/sdb' /lib/udev/rules.d/60-ceph-by-parttypeuuid.rules:26
starting '/sbin/blkid -o udev -p /dev/sdb'
'/sbin/blkid -o udev -p /dev/sdb'(err) 'error: /dev/sdb: No such file or directory'
Process '/sbin/blkid -o udev -p /dev/sdb' failed with exit code 2.
IMPORT 'scsi_id --export --whitelisted -d /dev/sdb' /lib/udev/rules.d/60-persistent-storage.rules:44
starting 'scsi_id --export --whitelisted -d /dev/sdb'
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_SCSI=1'
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_VENDOR='
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_VENDOR_ENC='
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_MODEL='
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_MODEL_ENC='
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_REVISION='
'scsi_id --export --whitelisted -d /dev/sdb'(out) 'ID_TYPE='
Process 'scsi_id --export --whitelisted -d /dev/sdb' succeeded.
IMPORT builtin 'path_id' /lib/udev/rules.d/60-persistent-storage.rules:64
LINK 'disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0' /lib/udev/rules.d/60-persistent-storage.rules:65
IMPORT builtin 'blkid' /lib/udev/rules.d/60-persistent-storage.rules:76
Failure opening block device /dev/sdb: No such file or directory
IMPORT builtin 'blkid' returned non-zero
RUN '/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/%k' /lib/udev/rules.d/66-snapd-autoimport.rules:3
IMPORT 'probe-bcache -o udev /dev/sdb' /lib/udev/rules.d/69-bcache.rules:16
starting 'probe-bcache -o udev /dev/sdb'
Process 'probe-bcache -o udev /dev/sdb' succeeded.
RUN '/lib/udev/hdparm' /lib/udev/rules.d/85-hdparm.rules:1
handling device node '/dev/sdb', devnum=b8:16, mode=0660, uid=0, gid=6
can not stat() node '/dev/sdb' (No such file or directory)
created db file '/run/udev/data/b8:16' for '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/263998ea-f9c7-40df-b4e1-dfbbc74a269a/host3/target3:0:0/3:0:0:0/block/sdb'
.ID_FS_TYPE_NEW=
ACTION=add
DEVLINKS=/dev/disk/by-path/acpi-VMBUS:01-scsi-0:0:0:0
DEVNAME=/dev/sdb
DEVPATH=/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/263998ea-f9c7-40df-b4e1-dfbbc74a269a/host3/target3:0:0/3:0:0:0/block/sdb
DEVTYPE=disk
ID_BUS=scsi
ID_FS_TYPE=
ID_PATH=acpi-VMBUS:01-scsi-0:0:0:0
ID_PATH_TAG=acpi-VMBUS_01-scsi-0_0_0_0
ID_SCSI=1
MAJOR=8
MINOR=16
SUBSYSTEM=block
TAGS=:systemd:
USEC_INITIALIZED=441738610403
run: '/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/sdb'
run: '/lib/udev/hdparm'
Unload module index
Unloaded link configuration context.
@schkovich does emptying the fstab file in the container fix this issue?
No difference @stgraber
$ sudo cp /dev/null /etc/fstab
$ exit
logout
Connection to 10.x.x.x closed.
$ juju resolved ceph-osd/0
$ juju debug-log --include ceph-osd/0
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install Traceback (most recent call last):
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 630, in <module>
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install hooks.execute(sys.argv)
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 823, in execute
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install self._hooks[hook_name]()
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 79, in _harden_inner2
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install return f(*args, **kwargs)
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 242, in install
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install install_udev_rules()
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 231, in install_udev_rules
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install '--reload-rules'])
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install raise CalledProcessError(retcode, cmd)
unit-ceph-osd-0: 15:15:36 DEBUG unit.ceph-osd/0.install subprocess.CalledProcessError: Command '['udevadm', 'control', '--reload-rules']' returned non-zero exit status 2
unit-ceph-osd-0: 15:15:36 ERROR juju.worker.uniter.operation hook "install" failed: exit status 1
$ sudo udevadm control --reload-rules
$ echo $?
2
The command sudo mount -o remount /
no longer fails after removing LABEL=cloudimg-rootfs / ext4 defaults 0 0
from /etc/fstab, as expected.
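The remount failure can be probed directly: mount cannot resolve the fstab entry because no block device visible inside the container carries the cloudimg-rootfs label. A small check, assuming blkid is available (blkid -L exits non-zero when the label does not resolve; the helper name is hypothetical):

```python
import shutil
import subprocess

def label_exists(label):
    """Return True if a visible block device carries the given filesystem
    label, False if not, or None when blkid is unavailable.

    `blkid -L <label>` prints the matching device and exits 0 on success.
    Inside the container nothing carries cloudimg-rootfs, which is why the
    fstab entry (and hence `mount -o remount /`) fails.
    """
    if shutil.which('blkid') is None:
        return None  # blkid not installed (e.g. a minimal environment)
    rc = subprocess.call(['blkid', '-L', label],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
    return rc == 0
```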
James Page says the proposed one-liner is good enough.
Bionic 18.04
Linux d05-4 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:52 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
LXD version 3.0
zfs storage backend in use.
Starting with ceph-osd-262, the addition of a udevadm command during storage.real causes the install to fail when performed in an LXD container.
The return code from udevadm control --reload-rules is '2' when run in an LXD container, even when the container is set to privileged with nesting enabled.
Debug Log showing the error:
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install Traceback (most recent call last):
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 630, in <module>
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install hooks.execute(sys.argv)
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 823, in execute
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install self._hooks[hook_name]()
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 79, in _harden_inner2
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install return f(*args, **kwargs)
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 242, in install
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install install_udev_rules()
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/install.real", line 231, in install_udev_rules
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install '--reload-rules'])
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install raise CalledProcessError(retcode, cmd)
unit-ceph-osd-0: 16:33:15 DEBUG unit.ceph-osd/0.install subprocess.CalledProcessError: Command '['udevadm', 'control', '--reload-rules']' returned non-zero exit status 2
unit-ceph-osd-0: 16:33:15 ERROR juju.worker.uniter.operation hook "install" failed: exit status 1
1.) Install Bionic 18.04
2.) Using the shipping version of LXD (3.0), configure and bootstrap a Juju localhost controller
3.) Deploy openstack-on-lxd (https://docs.openstack.org/charm-guide/latest/openstack-on-lxd.html)
4.) ceph-osd will fail to complete the install.
Workaround:
Install LXD 3.1 from the snap store (--stable channel); ceph-osd 262 then installs properly.
For additional logs, please see: https://bugs.launchpad.net/charm-ceph-osd/+bug/1776713