xcat2 / xcat-extensions

Repos to store scripts for special user cases

raid1_rh.sh: Make md RAID creation fail proof (fixes #40) #41

Obihoernchen closed this 6 years ago

Obihoernchen commented 6 years ago

See #40 for more information.

Obihoernchen commented 6 years ago

Hey @neo954 please have a look at my latest commits (especially 0e415d6).

  1. Search /proc/mdstat for disk1 (e.g. sda) and disk2 (e.g. sdb) to find all md RAIDs on these disks
  2. Stop all md RAIDs found in step 1
  3. Finally, zero the superblocks of all partitions on disk1 and disk2 only

Works like this (I've just added an echo for each mdadm command for the output):

[rh75-compute-install.partition] get a disk: sdb
[rh75-compute-install.partition] get a disk: sda
[rh75-compute-install.partition] disk sdb has wwn:0x5000c500aadde735
[rh75-compute-install.partition] disk sda has wwn:0x5000c500a8f628fc
[rh75-compute-install.partition] get disk: (sda sdb) with sort_type:wwn
[rh75-compute-install.partition] the final disk order:
[rh75-compute-install.partition]       sda | /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ /dev/disk/by-id/wwn-0x5000c500a8f628fc /dev/disk/by-path/pci-0004:03:00.0-ata-1.0
[rh75-compute-install.partition]       sdb | /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY /dev/disk/by-id/wwn-0x5000c500aadde735 /dev/disk/by-path/pci-0004:03:00.0-ata-2.0
[rh75-compute-install.partition] the output file is: /tmp/xcat_sorted_disks
[rh75-compute-install.partition] disabling md RAID resync during installation
[rh75-compute-install.partition] stopping md device: /dev/md/0
mdadm --stop /dev/md/0
[rh75-compute-install.partition] stopping md device: /dev/md/1
mdadm --stop /dev/md/1
[rh75-compute-install.partition] stopping md device: /dev/md/2
mdadm --stop /dev/md/2
[rh75-compute-install.partition] stopping md device: /dev/md/3
mdadm --stop /dev/md/3
[rh75-compute-install.partition] stopping md device: /dev/md/4
mdadm --stop /dev/md/4
[rh75-compute-install.partition] zeroing superblocks of /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part*
mdadm --zero-superblock /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part1 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part2 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part3 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part4 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part5 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part6 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W4708JNJ-part7
[rh75-compute-install.partition] zeroing superblocks of /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part*
mdadm --zero-superblock /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part1 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part2 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part3 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part4 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part5 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part6 /dev/disk/by-id/ata-ST1000NX0313_00LY266_00LY265IBM_W470V7KY-part7
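
For readers who want to see the three steps listed above in one place, here is a minimal shell sketch. It is not the code from the commit: the function names `find_md_on_disks` and `stop_and_zero`, the `MDADM` override variable, and the use of plain `/dev/<name>` paths (instead of the `/dev/md/N` and `/dev/disk/by-id/...` paths shown in the log) are all assumptions made for illustration.

```shell
#!/bin/sh
# Step 1 (hypothetical helper): print the md arrays listed in an
# mdstat-format file that have a member on one of the given disks.
# Usage: find_md_on_disks /proc/mdstat sda sdb
find_md_on_disks() {
    mdstat_file=$1
    shift
    for disk in "$@"; do
        # Array lines look like: "md0 : active raid1 sdb1[1] sda1[0]"
        awk -v disk="$disk" '
            $2 == ":" && $1 ~ /^md/ {
                for (i = 3; i <= NF; i++) {
                    # Match "sda1[0]" or "nvme0n1p1[0]", but not "sdaa1[0]"
                    if ($i ~ ("^" disk "p?[0-9]*\\[")) { print $1; break }
                }
            }' "$mdstat_file"
    done | sort -u
}

# Steps 2 and 3 (hypothetical helper): stop the arrays found in step 1,
# then zero the md superblocks on the partitions of the given disks only.
# Set MDADM="echo mdadm" for a dry run that only prints the commands.
stop_and_zero() {
    mdstat_file=$1
    shift
    for md in $(find_md_on_disks "$mdstat_file" "$@"); do
        ${MDADM:-mdadm} --stop "/dev/$md"
    done
    for disk in "$@"; do
        for part in /dev/"$disk"[0-9]* /dev/"$disk"p[0-9]*; do
            # Skip unmatched glob patterns and anything that is not a block device
            if [ -b "$part" ]; then
                ${MDADM:-mdadm} --zero-superblock "$part"
            fi
        done
    done
}
```

A dry run on the two disks from the log would be `MDADM="echo mdadm" stop_and_zero /proc/mdstat sda sdb`. Taking the mdstat path as a parameter is a design choice that keeps the parsing step testable against a saved copy of the file.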

immarvin commented 6 years ago

Hi @neo954, thanks for your comment. This partition script is just an example/template for admins to create a RAID 1 setup on two disks, and it is far from a general solution for RAID setups. Maybe we can put some effort into enhancing it in the future, but this PR is adequate to fix the issue.

After discussing with @neo954, we would like to merge this. Thanks for your contribution, @Obihoernchen.

Obihoernchen commented 6 years ago

Yes, this script is basically only useful for stateful compute nodes or basic service nodes. RAID 5, RAID 10, etc. need some more attention ;)

Obihoernchen commented 5 years ago

/proc/mdstat is not fully populated during script execution :/ Therefore, this still fails if an incomplete md setup already exists on the disk.

Unfortunately, checking /proc/mdstat does not work.