redhat-cip / puppet-ceph

Deploy Ceph using puppet
http://ceph.com/

Changes to a single OSD restart all OSDs on a server, and naming of /dev/sdX devices #24

Open cernceph opened 11 years ago

cernceph commented 11 years ago

Today we have been testing drive failure and replacement, and noticed a couple of shortcomings in the device.pp manifest:

  1. When a disk is replaced, ceph.conf changes, and this results in a service restart of all the OSDs in a server, because each osd service has subscribe => Concat /etc/ceph/ceph.conf. These restarts cause a noticeable disruption. Ideally we want to start only the affected service, not all of them!
  2. Using the /dev/sdX names for disks isn't ideal, since a replacement drive gets a new name when it is inserted (e.g. today we pulled sdq, then reinserted it and it got sdab). We then need to either (a) change our host manifests to add osd::device (sdab), which isn't good since the device will return to sdq after a reboot, or (b) reboot the server so that the device is called sdq once again.

Do people already have experience with better practices to prevent these two problems? Help is much appreciated!

Cheers, Dan CERN IT
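
One possible way to avoid the blanket restart described in point 1 is to keep the ordering on ceph.conf but drop the refresh relationship. The sketch below is only an illustration: the resource title and `$osd_id` are assumptions and are not copied from device.pp. It relies on the standard Puppet behaviour that `subscribe` both orders and refreshes a resource, while `require` only orders it.

```puppet
# Hypothetical sketch of one OSD service resource; the title and
# $osd_id are illustrative and not taken from device.pp.
service { "ceph-osd.${osd_id}":
  ensure  => running,
  # 'subscribe' would order this service after ceph.conf AND restart it
  # whenever the file changes. 'require' keeps only the ordering, so a
  # change made for one disk does not restart every OSD on the host.
  require => Concat['/etc/ceph/ceph.conf'],
}
```

The trade-off is that configuration changes which really do need a restart would then have to be rolled out some other way, for example by restarting OSDs one at a time by hand.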

dotwaffle commented 11 years ago

Would it be possible to reference devices by UUID? Or indeed, not to partition at all? That would be useful when you just want to try it on a spare LVM LV :)

M

cernceph commented 11 years ago

Sure, we could use /dev/disk/by-id or similar, but we would need to patch device.pp, since it seems to expect /dev/sdX with lines like:

   $devname = regsubst($name, '.*/', '')

and

    command => "mkfs.xfs -f -d agcount=${::processorcount} -l \
size=1024m -n size=64k ${name}1",
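
To make the mismatch concrete, here is a minimal sketch of how those two expressions behave with a persistent name. The by-id path is a made-up example and `$partition` is a hypothetical variable; the only facts assumed are that `regsubst` with `'.*/'` strips everything up to the last slash, and that udev names partitions under /dev/disk/by-id and /dev/disk/by-path with a `-partN` suffix rather than a bare number.

```puppet
# Hypothetical by-id path, for illustration only.
$device = '/dev/disk/by-id/wwn-0x5000c500a1b2c3d4'

# Same expression as in device.pp: this still yields the last path
# component, here 'wwn-0x5000c500a1b2c3d4'.
$devname = regsubst($device, '.*/', '')

# "${name}1" would point at a non-existent path for a by-id device,
# so pick the partition suffix based on the kind of path given.
$partition = $device ? {
  /^\/dev\/disk\/by-(id|path)\// => "${device}-part1",
  default                        => "${device}1",
}

notice("devname=${devname} partition=${partition}")
```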

I am wondering what people are using on clusters today... or is no one using this module in pseudo-production?

dotwaffle commented 11 years ago

Perhaps just strip the "1" from the ${name}1, and specify a partition directly. Then use a third-party module to partition your disk. Or, in the case of moving to btrfs, which can (IIRC) use a whole disk without a partition table being present, just use the disk "raw", as it were.

Just some thoughts, anyway!
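
As a rough illustration of the "specify a partition directly" idea, the sketch below wraps the mkfs command quoted earlier in a defined type that takes the full partition path. The type name `example::osd_filesystem`, its `$partition` parameter, the `path` list, and the `unless` guard are all assumptions made for this example and are not part of puppet-ceph.

```puppet
# Hypothetical defined type; not part of the puppet-ceph module.
define example::osd_filesystem (
  # Full path to an existing partition or LV, e.g. a by-id "-part1" link.
  $partition = $name,
) {
  exec { "mkfs_${name}":
    # Same mkfs options as quoted above, but without appending "1".
    command => "mkfs.xfs -f -d agcount=${::processorcount} -l size=1024m -n size=64k ${partition}",
    path    => ['/sbin', '/usr/sbin', '/bin', '/usr/bin'],
    # Assumed guard: skip formatting if an XFS filesystem is already there.
    unless  => "xfs_admin -l ${partition}",
  }
}
```

With something like this, partitioning would be left to a separate module or done by hand, and the same type could be pointed at a spare LVM LV for testing.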