jschmid1 opened 5 years ago
Ideally, DeepSea will still work with earlier versions of Nautilus even after this is fixed.
maybe @jan--f can confirm my assumption regarding the logging changes.
14.2.3 ceph-volume should only print logging messages to stderr. I guess the runner returns stderr? I'll look into it.
hmm I just saw this: I ran salt '*' cmd.run 'for d in b c d e f; do ceph-volume lvm zap --destroy /dev/vd$d; done'
from the salt master and got output like
data2-6.virt1.home.fajerski.name:
--> Zapping: /dev/vdb
--> Zapping: /dev/vdc
--> Zapping: /dev/vdd
--> Zapping: /dev/vde
--> Zapping: /dev/vdf
But a subsequent lsblk
revealed that no LV got zapped. Running the same command directly on the minion (minus the salt part of course) zapped the disks just fine. No idea what is going on here, will investigate more tomorrow.
Here is the issue:
[2019-09-17 11:01:39,104][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 148, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 205, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 205, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 355, in main
self.zap()
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 233, in zap
self.zap_lvm_member(device)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 198, in zap_lvm_member
self.zap_lv(Device(lv.lv_path))
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 144, in zap_lv
self.unmount_lv(lv)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 133, in unmount_lv
mlogger.info("Unmounting %s", lv_path)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 190, in info
info(record)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 142, in info
return _Write(prefix=blue_arrow).raw(msg)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 117, in raw
self.write(string)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 120, in write
self._writer.write(self.prefix + line + self.suffix)
ValueError: I/O operation on closed file.
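The ValueError itself is easy to reproduce: ceph-volume's _Write keeps a reference to the stream it was created with, and if that stream has already been closed by the time the write happens (which appears to be the case under salt's cmd.run), any write through it fails exactly like this. A minimal sketch of the failure mode:

import sys

# a _Write-style writer holding a reference to the terminal stream
writer = sys.stdout
writer.close()
# any later write through the stale reference blows up like the traceback above
writer.write('--> Zapping: /dev/vdb\n')  # ValueError: I/O operation on closed file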
So it would seem that in the CI tests, which are now passing with the temporary fix, the OSDs are being removed but the underlying disks are not really getting zapped (AFAIK the tests do not include any logic for verifying the zap).
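If the tests were to verify the zap, a minimal sketch of such a check (assuming lsblk is available on the minion; this is not part of the current test suite):

import subprocess

def disk_is_zapped(dev):
    # lsblk lists the device plus any children (partitions, LVs);
    # after a successful zap only the bare disk itself should remain
    out = subprocess.check_output(['lsblk', '-n', '-o', 'NAME', dev],
                                  universal_newlines=True)
    return len(out.strip().splitlines()) == 1

# e.g. disk_is_zapped('/dev/vdb') should be True after a successful zap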
that would be entirely plausible.
ok I can confirm that the ceph build fixes this issue. However, there still seems to be something up with purge, where the OSDs are not stopped; once they are stopped, zapping them works just fine.
salt 'data1*' cmd.run 'for d in b c d e f; do ceph-volume lvm zap --destroy /dev/vd$d; echo $?; done'
data1-6.virt1.home.fajerski.name:
--> Zapping: /dev/vdb
--> Unmounting /var/lib/ceph/osd/ceph-3
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-3
stderr: umount: /var/lib/ceph/osd/ceph-3 unmounted
Running command: /usr/sbin/wipefs --all /dev/ceph-58736349-a5d6-4966-9598-f7ed4082441b/osd-data-bcb8791a-954d-417b-b016-fa08d9a62885
Running command: /bin/dd if=/dev/zero of=/dev/ceph-58736349-a5d6-4966-9598-f7ed4082441b/osd-data-bcb8791a-954d-417b-b016-fa08d9a62885 bs=1M count=10
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-58736349-a5d6-4966-9598-f7ed4082441b
Running command: /usr/sbin/vgremove -v -f ceph-58736349-a5d6-4966-9598-f7ed4082441b
stderr: Removing ceph--58736349--a5d6--4966--9598--f7ed4082441b-osd--data--bcb8791a--954d--417b--b016--fa08d9a62885 (253:2)
stderr: Archiving volume group "ceph-58736349-a5d6-4966-9598-f7ed4082441b" metadata (seqno 21).
Releasing logical volume "osd-data-bcb8791a-954d-417b-b016-fa08d9a62885"
stderr: Creating volume group backup "/etc/lvm/backup/ceph-58736349-a5d6-4966-9598-f7ed4082441b" (seqno 22).
stdout: Logical volume "osd-data-bcb8791a-954d-417b-b016-fa08d9a62885" successfully removed
stderr: Removing physical volume "/dev/vdb" from volume group "ceph-58736349-a5d6-4966-9598-f7ed4082441b"
stdout: Volume group "ceph-58736349-a5d6-4966-9598-f7ed4082441b" successfully removed
Running command: /usr/sbin/wipefs --all /dev/vdb
stdout: /dev/vdb: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
Running command: /bin/dd if=/dev/zero of=/dev/vdb bs=1M count=10
--> Zapping successful for: <Raw Device: /dev/vdb>
0
--> Zapping: /dev/vdc
--> Unmounting /var/lib/ceph/osd/ceph-9
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-9
stderr: umount: /var/lib/ceph/osd/ceph-9 unmounted
Running command: /usr/sbin/wipefs --all /dev/ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39/osd-data-b8281429-e4d0-4d6e-ac64-38dae8d1a270
Running command: /bin/dd if=/dev/zero of=/dev/ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39/osd-data-b8281429-e4d0-4d6e-ac64-38dae8d1a270 bs=1M count=10
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39
Running command: /usr/sbin/vgremove -v -f ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39
stderr: Removing ceph--df617cae--42bb--43ca--97ba--0d01b8ef2d39-osd--data--b8281429--e4d0--4d6e--ac64--38dae8d1a270 (253:4)
stderr: Archiving volume group "ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39" metadata (seqno 21).
Releasing logical volume "osd-data-b8281429-e4d0-4d6e-ac64-38dae8d1a270"
stderr: Creating volume group backup "/etc/lvm/backup/ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39" (seqno 22).
stdout: Logical volume "osd-data-b8281429-e4d0-4d6e-ac64-38dae8d1a270" successfully removed
stderr: Removing physical volume "/dev/vdc" from volume group "ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39"
stdout: Volume group "ceph-df617cae-42bb-43ca-97ba-0d01b8ef2d39" successfully removed
Running command: /usr/sbin/wipefs --all /dev/vdc
stdout: /dev/vdc: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
Running command: /bin/dd if=/dev/zero of=/dev/vdc bs=1M count=10
--> Zapping successful for: <Raw Device: /dev/vdc>
0
--> Zapping: /dev/vdd
--> Unmounting /var/lib/ceph/osd/ceph-14
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-14
stderr: umount: /var/lib/ceph/osd/ceph-14 unmounted
Running command: /usr/sbin/wipefs --all /dev/ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d/osd-data-65a0439a-9543-4e7b-a94f-11a2ff373241
Running command: /bin/dd if=/dev/zero of=/dev/ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d/osd-data-65a0439a-9543-4e7b-a94f-11a2ff373241 bs=1M count=10
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d
Running command: /usr/sbin/vgremove -v -f ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d
stderr: Removing ceph--686e5dc6--00e8--46ba--b109--d93623a9f60d-osd--data--65a0439a--9543--4e7b--a94f--11a2ff373241 (253:0)
stderr: Archiving volume group "ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d" metadata (seqno 21).
Releasing logical volume "osd-data-65a0439a-9543-4e7b-a94f-11a2ff373241"
stderr: Creating volume group backup "/etc/lvm/backup/ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d" (seqno 22).
stdout: Logical volume "osd-data-65a0439a-9543-4e7b-a94f-11a2ff373241" successfully removed
stderr: Removing physical volume "/dev/vdd" from volume group "ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d"
stdout: Volume group "ceph-686e5dc6-00e8-46ba-b109-d93623a9f60d" successfully removed
Running command: /usr/sbin/wipefs --all /dev/vdd
stdout: /dev/vdd: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
Running command: /bin/dd if=/dev/zero of=/dev/vdd bs=1M count=10
--> Zapping successful for: <Raw Device: /dev/vdd>
0
--> Zapping: /dev/vde
Running command: /usr/sbin/wipefs --all /dev/vde
stdout: /dev/vde: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
Running command: /bin/dd if=/dev/zero of=/dev/vde bs=1M count=10
--> Zapping successful for: <Raw Device: /dev/vde>
0
--> Zapping: /dev/vdf
--> Unmounting /var/lib/ceph/osd/ceph-24
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-24
stderr: umount: /var/lib/ceph/osd/ceph-24 unmounted
Running command: /usr/sbin/wipefs --all /dev/ceph-860f342f-8a0c-4746-a906-0eb32c9847dc/osd-data-9c48ec80-46f9-4184-ad77-fd4ac6e94126
Running command: /bin/dd if=/dev/zero of=/dev/ceph-860f342f-8a0c-4746-a906-0eb32c9847dc/osd-data-9c48ec80-46f9-4184-ad77-fd4ac6e94126 bs=1M count=10
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-860f342f-8a0c-4746-a906-0eb32c9847dc
Running command: /usr/sbin/vgremove -v -f ceph-860f342f-8a0c-4746-a906-0eb32c9847dc
stderr: Removing ceph--860f342f--8a0c--4746--a906--0eb32c9847dc-osd--data--9c48ec80--46f9--4184--ad77--fd4ac6e94126 (253:1)
stderr: Archiving volume group "ceph-860f342f-8a0c-4746-a906-0eb32c9847dc" metadata (seqno 21).
Releasing logical volume "osd-data-9c48ec80-46f9-4184-ad77-fd4ac6e94126"
stderr: Creating volume group backup "/etc/lvm/backup/ceph-860f342f-8a0c-4746-a906-0eb32c9847dc" (seqno 22).
stdout: Logical volume "osd-data-9c48ec80-46f9-4184-ad77-fd4ac6e94126" successfully removed
stderr: Removing physical volume "/dev/vdf" from volume group "ceph-860f342f-8a0c-4746-a906-0eb32c9847dc"
stdout: Volume group "ceph-860f342f-8a0c-4746-a906-0eb32c9847dc" successfully removed
Running command: /usr/sbin/wipefs --all /dev/vdf
stdout: /dev/vdf: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
Running command: /bin/dd if=/dev/zero of=/dev/vdf bs=1M count=10
--> Zapping successful for: <Raw Device: /dev/vdf>
0
I can also confirm that this problem doesn't happen with 14.2.4, which would indicate this is just another symptom of the ceph-volume regression that found its way into 14.2.3.
. . . and since users on SUSE will not see 14.2.3, there's no reason for DeepSea to do anything special to work around that ceph-volume regression.
ceph version 14.2.3-349-g7b1552ea82 (7b1552ea827cf5167b6edbba96dd1c4a9dc16937) nautilus (stable)
salt-run osd.remove $id
uses
ceph-volume lvm zap --osd-id $id --destroy
to zap a disk remotely on the minion. In previous releases we expected the string
Zapping successful for OSD
in the return message. With this release we get:
--> Zapping: /dev/ceph-a8e4a78d-e3....
Since there are no significant changes that would indicate a deliberate change to the return string, I assume it's due to the logging changes in recent commits (mlogger vs terminal.success).
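In other words, the runner's string match is fragile. A sketch of what the check effectively amounts to (not the actual DeepSea code; the invocation and marker string are as described above):

import subprocess

proc = subprocess.run(
    ['ceph-volume', 'lvm', 'zap', '--osd-id', '1', '--destroy'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

# pre-14.2.3 the marker appeared on stdout; if mlogger now routes the message
# through logging to stderr, this quietly evaluates to False even though the
# zap itself succeeded
zap_ok = 'Zapping successful' in proc.stdout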
This raises the question of whether invoking shell commands is the right approach when we have a Python API to use. A couple of things would need to be verified, though (see the sketch after this list):

1) Is the zap command consumable via the API?
2) Does it return meaningful messages?
3) Is it more efficient, given that it needs to be wrapped in a minion module to be called from the master (via a runner)?
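For reference, a minimal sketch of point 3: a hypothetical execution module (the module and function names here are made up) that relies on ceph-volume's exit code instead of scraping its output, which would sidestep this class of regression regardless of which route we pick:

# /srv/salt/_modules/cv_zap.py -- hypothetical module name
def zap(osd_id):
    '''
    Zap the devices backing an OSD; returns True on success.
    Relies on the exit code, not on ceph-volume's (unstable) output strings.
    '''
    result = __salt__['cmd.run_all'](
        'ceph-volume lvm zap --osd-id {} --destroy'.format(osd_id),
        python_shell=False)
    return result['retcode'] == 0

which a runner could then call with something like salt 'data1*' cv_zap.zap 3.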