We created this sensor to watch the status of an MDADM software RAID on Linux.
MDADM RAIDs are very common in Linux environments, especially on root servers and in hosting environments.
The sample output (cat /proc/mdstat) for a healthy RAID looks like this:
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
194368 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sdb5[1] sda5[0]
1950656 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sdb6[1] sda6[0]
1462857536 blocks super 1.2 [2/2] [UU]
Each "md*" entry is an array with one or more drives. Each working drive is marked with a "U".
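The structure described above can be illustrated with a minimal Python sketch. It is not the sensor's actual code: the function names split_arrays and drive_flags are hypothetical, and the sketch assumes that every array starts on a line beginning with "md" followed by a colon.

```python
import re

# Sample text in the format of /proc/mdstat, taken from the healthy example above.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      194368 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sdb5[1] sda5[0]
      1950656 blocks super 1.2 [2/2] [UU]
"""


def split_arrays(mdstat_text):
    """Group the lines of mdstat-style text by array name (md0, md1, ...)."""
    arrays = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\w+)\s*:", line)
        if header:
            # A new array starts here.
            current = header.group(1)
            arrays[current] = [line.strip()]
        elif not line.strip():
            # A blank line ends the current array (assumption about the layout).
            current = None
        elif current:
            arrays[current].append(line.strip())
    return arrays


def drive_flags(array_lines):
    """Return the drive marker string such as 'UU' or 'U_', or None if absent."""
    for line in array_lines:
        flags = re.search(r"\[([U_]+)\]", line)
        if flags:
            return flags.group(1)
    return None


arrays = split_arrays(SAMPLE)
for name, lines in arrays.items():
    print(name, drive_flags(lines))
```

For the sample above, each array should report one "U" per drive.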
The sensor keeps it simple: we split the output into a list with one entry per array and search each entry for keywords.
If a drive is missing or faulty, the output looks, for example, like this:
md1 : active raid1 sde1[6](F) sdg1[1] sdb1[4] sdd1[3] sdc1[2]
488383936 blocks [6/4] [_UUUU_]
(In this example, 2 of 6 drives are missing.)
When we find an underscore, we increment the counter for arrays with missing drives.
We do the same when we find the keywords "recovery" (restoring RAID integrity on the same drive), "resyncing" (a drive was replaced or an additional drive was added to expand the array), or "check" (an integrity check, sometimes started automatically by the OS).
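The counting described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the sensor's implementation: the function names, the counter names, and the use of one separate counter per keyword are hypothetical. The pattern for "resyncing" also accepts the shorter spelling "resync", because the exact spelling in the output is treated as an assumption here.

```python
import re

# Keyword patterns; the counter names are hypothetical.
KEYWORDS = {
    "recovery": r"\brecovery\b",
    "resyncing": r"\bresync",  # also matches "resyncing"
    "check": r"\bcheck\b",
}

SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[3]
      2095040 blocks super 1.2 [2/1] [U_]
      [>....................] recovery = 0.4% (9700/2094040) finish=8.6min speed=19600K/sec
md1 : active raid1 sdb5[1] sda5[0]
      1950656 blocks super 1.2 [2/2] [UU]
"""


def array_entries(mdstat_text):
    """Return a list with one joined string per array."""
    entries = []
    for line in mdstat_text.splitlines():
        if re.match(r"md\w+\s*:", line):
            entries.append(line.strip())
        elif entries and line.strip():
            entries[-1] += " " + line.strip()
    return entries


def count_states(mdstat_text):
    """Count arrays with missing drives and arrays matching each keyword."""
    counts = {"missing": 0, "recovery": 0, "resyncing": 0, "check": 0}
    for entry in array_entries(mdstat_text):
        flags = re.search(r"\[([U_]+)\]", entry)
        if flags and "_" in flags.group(1):
            counts["missing"] += 1
        for name, pattern in KEYWORDS.items():
            if re.search(pattern, entry):
                counts[name] += 1
    return counts


counts = count_states(SAMPLE)
print(counts)
```

The sketch restricts the underscore search to the bracketed drive markers, so an underscore elsewhere in an entry is not counted as a missing drive.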
For example, a recovery looks like this:
md0 : active raid1 sda1[2] sdb1[3]
2095040 blocks super 1.2 [2/1] [U_]
[>....................] recovery = 0.4% (9700/2094040) finish=8.6min speed=19600K/sec
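The progress line in this example also carries a percentage, an estimated finish time, and a speed. The text above does not say that the sensor reads these values; the following Python sketch only shows how they could be extracted if needed. The name parse_progress and the returned keys are hypothetical, and the pattern assumes the line layout shown in the example.

```python
import re

# Pattern for a progress line as shown in the example above (assumed layout).
PROGRESS = re.compile(
    r"(?P<action>recovery|resync|check)\s*=\s*(?P<percent>\d+(?:\.\d+)?)%"
    r".*?finish=(?P<finish>\d+(?:\.\d+)?)min"
    r".*?speed=(?P<speed>\d+)K/sec"
)


def parse_progress(line):
    """Return action, percent, finish time (minutes) and speed (K/sec), or None."""
    match = PROGRESS.search(line)
    if not match:
        return None
    return {
        "action": match.group("action"),
        "percent": float(match.group("percent")),
        "finish_min": float(match.group("finish")),
        "speed_k_per_sec": int(match.group("speed")),
    }


line = "[>....................] recovery = 0.4% (9700/2094040) finish=8.6min speed=19600K/sec"
progress = parse_progress(line)
print(progress)
```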