PRTG / PythonMiniProbe

MiniProbe for PRTG Network Monitor written in Python
BSD 3-Clause "New" or "Revised" License

Adding a sensor to monitor MDADM Softwareraid #36

Closed ghost closed 9 years ago

ghost commented 9 years ago

We created this sensor to monitor the status of an MDADM software RAID on Linux. MDADM RAIDs are very common in Linux environments, especially on root servers and in hosting environments.

The sample output from mdadm (cat /proc/mdstat) for a healthy RAID looks like this:

Personalities : [raid1]

md0 : active raid1 sda1[2] sdb1[1] 194368 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb5[1] sda5[0] 1950656 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sdb6[1] sda6[0] 1462857536 blocks super 1.2 [2/2] [UU]

Each "md*" entry is an array with one or more drives. Each working drive is marked with a "U".
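As a rough illustration of reading this output, the sketch below splits mdstat text into one entry per array and pulls out the drive status field. The function names `split_arrays` and `drive_status` are made up for this example and are not taken from the sensor itself; the sample text assumes the usual two-line layout of /proc/mdstat.

```python
import re

# Sample text modelled on the healthy output above (two-line layout assumed).
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
      194368 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb5[1] sda5[0]
      1950656 blocks super 1.2 [2/2] [UU]
"""


def split_arrays(mdstat_text):
    """Return a dict mapping each array name (md0, md1, ...) to its entry text."""
    arrays = {}
    current = None
    for line in mdstat_text.splitlines():
        match = re.match(r"(md\d+)\s*:", line)
        if match:
            # A line starting with "mdN :" opens a new array entry.
            current = match.group(1)
            arrays[current] = line.strip()
        elif current and line.strip():
            # Continuation lines belong to the array opened last.
            arrays[current] += " " + line.strip()
        else:
            # A blank line (or a line outside any array) ends the entry.
            current = None
    return arrays


def drive_status(entry):
    """Return the status string such as 'UU' or 'U_', or None if absent."""
    match = re.search(r"\[([U_]+)\]", entry)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name, entry in split_arrays(SAMPLE).items():
        print(name, drive_status(entry))
```

On a real system the text would come from reading /proc/mdstat instead of the `SAMPLE` string.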

The sensor keeps it simple: we split the entry for each array into a list and search it for the keywords:

If a drive is missing or faulty, the output looks like this, for example:

md1 : active raid1 sde1[6](F) sdg1[1] sdb1[4] sdd1[3] sdc1[2] 488383936 blocks [6/4] [_UUUU_]

(In this example, 2 of 6 drives are missing.) When the sensor finds an underscore, it increments the counter for arrays with missing drives.
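The underscore check could be sketched roughly as follows. This is not the sensor's actual code: `count_degraded` is a hypothetical name, and the sketch assumes each array entry has already been joined into a single string.

```python
import re


def count_degraded(entries):
    """Count array entries whose drive status field contains an underscore."""
    degraded = 0
    for entry in entries:
        # The status field is the bracket holding only "U" and "_" characters.
        status = re.search(r"\[([U_]+)\]", entry)
        if status and "_" in status.group(1):
            degraded += 1
    return degraded


if __name__ == "__main__":
    entries = [
        "md1 : active raid1 sdb5[1] sda5[0] 1950656 blocks super 1.2 [2/2] [UU]",
        "md0 : active raid1 sda1[2] sdb1[3] 2095040 blocks super 1.2 [2/1] [U_]",
    ]
    print(count_degraded(entries))
```

Matching only on the bracket made of "U" and "_" keeps device indexes such as [2] and the [2/1] counter from being mistaken for the status field.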

We do the same when we find the keywords "recovery" (recovering RAID integrity on the same drive), "resyncing" (a drive was replaced, or an additional drive was added to expand the array), and "check" (an integrity check, sometimes run automatically by the OS).
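A minimal sketch of the keyword counting, assuming one counter per keyword and one single-line string per array. The `KEYWORDS` tuple copies the three keywords named above; the exact wording printed in /proc/mdstat may differ between kernels, so treat the list as something to adjust. `count_keywords` is a hypothetical name.

```python
# Keywords as named in the description above; adjust to match your kernel's output.
KEYWORDS = ("recovery", "resyncing", "check")


def count_keywords(entries):
    """Return a dict with, per keyword, the number of arrays that mention it."""
    counts = {keyword: 0 for keyword in KEYWORDS}
    for entry in entries:
        # Split the entry into whitespace-separated tokens and look for exact matches.
        tokens = entry.split()
        for keyword in KEYWORDS:
            if keyword in tokens:
                counts[keyword] += 1
    return counts


if __name__ == "__main__":
    entry = (
        "md0 : active raid1 sda1[2] sdb1[3] 2095040 blocks super 1.2 [2/1] [U_] "
        "[>....................] recovery = 0.4% (9700/2094040) "
        "finish=8.6min speed=19600K/sec"
    )
    print(count_keywords([entry]))
```

Comparing whole tokens instead of substrings avoids counting a keyword that only appears inside a longer word.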

For example recovery looks like this:

md0 : active raid1 sda1[2] sdb1[3] 2095040 blocks super 1.2 [2/1] [U_] [>....................] recovery = 0.4% (9700/2094040) finish=8.6min speed=19600K/sec
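The description does not say whether the sensor reports the progress itself. If that were wanted, the percentage and the estimated finish time could be read from such a line roughly like this. `parse_progress` is a hypothetical helper, and the pattern only assumes the "keyword = N% ... finish=Nmin" layout shown in the example above.

```python
import re

# Pattern assumed from the example line: "<keyword> = <percent>% ... finish=<minutes>min"
PROGRESS_RE = re.compile(r"(\w+)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min")


def parse_progress(entry):
    """Return (operation, percent_done, minutes_left) or None if no progress is shown."""
    match = PROGRESS_RE.search(entry)
    if not match:
        return None
    return match.group(1), float(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    line = (
        "md0 : active raid1 sda1[2] sdb1[3] 2095040 blocks super 1.2 [2/1] [U_] "
        "[>....................] recovery = 0.4% (9700/2094040) "
        "finish=8.6min speed=19600K/sec"
    )
    print(parse_progress(line))
```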