Closed. simondeziel closed this issue 5 months ago.
In a CI script set to abort on errors, microceph disk add --wipe encountered errors but did not return a non-zero exit status:
+ sudo microceph disk add --wipe /dev/sdb
+----------+---------+
|   PATH   | STATUS  |
+----------+---------+
| /dev/sdb | Failure |
+----------+---------+
Error: failed to bootstrap OSD: Failed to run: ceph-osd --mkfs --no-mon-config -i 1: exit status 250 (2024-01-29T22:30:14.664+0000 7f1570e998c0 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input
2024-01-29T22:30:14.664+0000 7f1570e998c0 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input
2024-01-29T22:30:14.664+0000 7f1570e998c0 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input
2024-01-29T22:30:14.676+0000 7f1570e998c0 -1 bdev(0x563a03578000 /var/lib/ceph/osd/ceph-1/block) open open got: (16) Device or resource busy
2024-01-29T22:30:14.676+0000 7f1570e998c0 -1 bluestore(/var/lib/ceph/osd/ceph-1) mkfs failed, (16) Device or resource busy
2024-01-29T22:30:14.676+0000 7f1570e998c0 -1 OSD::mkfs: ObjectStore::mkfs failed with error (16) Device or resource busy
2024-01-29T22:30:14.676+0000 7f1570e998c0 -1 ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (16) Device or resource busy)
+ sudo rm -rf /etc/ceph
+ sudo ln -s /var/snap/microceph/current/conf/ /etc/ceph
...
+ sudo microceph.ceph status
  cluster:
    id:     594f8038-eb9d-4381-9707-4a622a23fd97
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            nobackfill,norebalance,norecover,noscrub,nodeep-scrub,nosnaptrim flag(s) set
            Reduced data availability: 65 pgs inactive
            3 pool(s) have no replicas configured
            OSD count 0 < osd_pool_default_size 1

  services:
    mon: 1 daemons, quorum fv-az665-985 (age 2m)
    mgr: fv-az665-985(active, since 2m)
    mds: 1/1 daemons up
    osd: 0 osds: 0 up, 0 in
         flags nobackfill,norebalance,norecover,noscrub,nodeep-scrub,nosnaptrim

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             65 unknown
...
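The status output above shows that no OSD was created (0 osds), which a script can verify independently of the exit status of disk add. The following is only a sketch: osd_count_from_status and require_osds are hypothetical helpers, not part of MicroCeph or Ceph, and they assume the "osd: N osds: ..." line format seen in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph status` text on stdin and print the number
# of OSDs taken from the "osd: N osds: ..." line (assumed format).
osd_count_from_status() {
    sed -n 's/.*osd: \([0-9][0-9]*\) osds.*/\1/p' | head -n 1
}

# Hypothetical helper: fail unless the status text on stdin reports at least
# the expected number of OSDs.
require_osds() {
    expected="$1"
    actual="$(osd_count_from_status)"
    if [ -z "${actual}" ] || [ "${actual}" -lt "${expected}" ]; then
        echo "expected at least ${expected} OSD(s), found ${actual:-none}" >&2
        return 1
    fi
}
```

In the CI script this could be called as `sudo microceph.ceph status | require_osds 1` right after adding the disk, so that a script set to abort on errors stops when the OSD is missing.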
$ sudo snap install microceph --edge
microceph (reef/edge) 18.2.0+snape56a71f5dd from Canonical** installed
https://github.com/canonical/lxd/actions/runs/7703310904/workflow?pr=12783#L270-L319 has the full workflow, but essentially:
sudo snap install microceph --edge
sudo swapoff /mnt/swapfile
sudo umount /mnt  # umount the ephemeral disk of GitHub Action runner
sudo microceph disk add --wipe "${ephemeral_disk}"  # try to give the ephemeral disk to microceph
Observed behaviour: microceph disk add --wipe returned 0 despite running into errors.
Expected behaviour: microceph disk add --wipe should return a non-zero exit status on error.
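Where the exit status cannot be trusted, a CI script could guard against this itself. The following is a minimal sketch, not part of MicroCeph: run_and_check_status is a hypothetical helper that runs a command, echoes its output, and fails if either the command exits non-zero or its output contains a "| Failure |" cell like the one in the status table of the log. That a failed disk is always reported this way is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of MicroCeph): run a command and treat a
# "Failure" cell in its output as an error, even if the exit status is 0.
run_and_check_status() {
    # Capture stdout and stderr together so the status table and any
    # "Error:" lines are both inspected. The if/else keeps this safe
    # under `set -e`.
    if output="$("$@" 2>&1)"; then
        rc=0
    else
        rc=$?
    fi
    printf '%s\n' "${output}"

    # Propagate a genuine non-zero exit status unchanged.
    if [ "${rc}" -ne 0 ]; then
        return "${rc}"
    fi

    # Assumption: a failed disk shows up as "| Failure |" in the table.
    if printf '%s\n' "${output}" | grep -qF '| Failure |'; then
        return 1
    fi
    return 0
}
```

A call such as `run_and_check_status sudo microceph disk add --wipe "${ephemeral_disk}"` would then stop a script running under `set -e` in the situation reported here.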
Relevant logs: https://github.com/canonical/lxd/actions/runs/7703310904/job/20993429312?pr=12783#step:10:328
Thanks a lot for reporting this bug, @simondeziel. This was fixed by #291. Marking this issue closed.
@UtkarshBhatthere many thanks for the quick turnaround!