How to handle osd disk failure?

ceph / ceph-cookbook

Chef cookbooks for Ceph

Apache License 2.0


How to handle osd disk failure? #209

Open krenakzsolt opened 8 years ago

krenakzsolt commented 8 years ago

Hi All!

I was wondering how the cookbook handles disk failures. What would the operational procedure be if an OSD disk dies on a node managed with this cookbook? Does anyone have experience with this? Thanks in advance!

mdsteveb commented 8 years ago

I'd like to know this too, both for this case and for other operational scenarios that people have run into in real life.

For this case my guess is that you would let Ceph remove the OSD from the cluster, remove its entry from the node's osd list, replace the disk, then add a new entry for the new disk in the node's osd list. (Please correct me if I missed something!)

It would be nice to know, though, whether there is an easy way to do this without going through rebalancing twice, for example.
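To make the guessed procedure concrete, here is a minimal Python sketch of its two halves: the manual Ceph CLI steps usually used to retire a dead OSD, and the swap of the failed device for its replacement in the node's osd list. This is not something the cookbook provides. The function names are hypothetical, and the list-of-hashes shape with a `device` key is an assumption about how the node's osd list is stored, so check it against your own node attributes. The sketch only builds the command strings and the new list; it does not run anything against a cluster.

```python
from typing import Dict, List


def osd_removal_commands(osd_id: int) -> List[str]:
    """Return the manual Ceph CLI steps for retiring a failed OSD, in order.

    Hypothetical helper: it only builds the strings so they can be reviewed
    before anyone runs them by hand on a monitor/admin node.
    """
    return [
        f"ceph osd out {osd_id}",               # mark the OSD out of the cluster
        f"ceph osd crush remove osd.{osd_id}",  # drop it from the CRUSH map
        f"ceph auth del osd.{osd_id}",          # delete its authentication key
        f"ceph osd rm {osd_id}",                # remove the OSD id itself
    ]


def replace_osd_device(
    osd_devices: List[Dict[str, str]], failed: str, replacement: str
) -> List[Dict[str, str]]:
    """Return a new osd list with the failed device's entry swapped for a fresh one.

    Assumes (not confirmed in this thread) that each entry is a hash with a
    "device" key. The input list is left unmodified.
    """
    if not any(entry.get("device") == failed for entry in osd_devices):
        raise ValueError(f"{failed} is not in the osd list")
    kept = [entry for entry in osd_devices if entry.get("device") != failed]
    return kept + [{"device": replacement}]


if __name__ == "__main__":
    # Example values only: OSD id 3 on /dev/sdc died, /dev/sdd is the new disk.
    for command in osd_removal_commands(3):
        print(command)
    print(replace_osd_device(
        [{"device": "/dev/sdb"}, {"device": "/dev/sdc"}], "/dev/sdc", "/dev/sdd"
    ))
```

The resulting list would then go back into the node's attributes before the next Chef run. The sketch does nothing about the double rebalance asked about above.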