CiscoDevNet / Hyperflex-Hypercheck

Perform pro-active self checks on your Hyperflex cluster to ensure stability and resiliency
MIT License

"ZooKeeper Disk Usage" FAIL; root filesystem over 80% #5

Closed by sttardy 4 years ago

sttardy commented 5 years ago

If the root filesystem (/) is nearly full (84%), the alert is listed under the "ZooKeeper Disk Usage" check. This is confusing, because the ZooKeeper usage was fine. I had to read the code to understand what the output was telling me. IMO, to make it obvious which directory needs review, the root filesystem (/) should be pulled out into a separate disk check, "Root Disk Usage", instead of being bundled into "ZooKeeper Disk Usage".

avshukla commented 4 years ago

Thanks again, Steven. We have implemented the requirement that each of /sda1, /var/stv and /var/zookeeper be at no more than 80% usage. If we see new alerts and blocks, they should be troubleshot appropriately, as they could indicate deviations.

    elif "/sda1" in line:
        m1 = re.search(r"(\d+)%", line)
        if m1:
            if int(m1.group(1)) > 80:
                zdiskchk = "FAIL"
                zdisk = "/sda1"
                break
    elif "/var/stv" in line:
        m2 = re.search(r"(\d+)%", line)
        if m2:
            if int(m2.group(1)) > 80:
                zdiskchk = "FAIL"
                zdisk = "/var/stv"
                break
    elif "/var/zookeeper" in line:
        m3 = re.search(r"(\d+)%", line)
        if m3:
            if int(m3.group(1)) > 80:
                zdiskchk = "FAIL"
                zdisk = "/var/zookeeper"
                break
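For readers following the thread: the three branches above differ only in the path they match, so the same rule can be written as a table-driven loop that also reports each path under its own label, which is what the original report asked for. This is a minimal sketch, not the project's code. The names `WATCHED`, `check_disk_usage`, the check labels other than "ZooKeeper Disk Usage", and the sample `df`-style lines are all assumptions made for illustration.

```python
import re

# Hypothetical mapping of path marker -> check label. Only the markers come
# from the snippet above; the labels are illustrative.
WATCHED = {
    "/sda1": "Root Disk Usage",
    "/var/stv": "STV Disk Usage",
    "/var/zookeeper": "ZooKeeper Disk Usage",
}
THRESHOLD = 80  # percent; usage strictly above this fails, as in the snippet

USE_PCT = re.compile(r"(\d+)%")


def check_disk_usage(lines, threshold=THRESHOLD):
    """Return {label: (status, marker, percent)} for each watched path found."""
    results = {}
    for line in lines:
        for marker, label in WATCHED.items():
            if marker in line:
                m = USE_PCT.search(line)
                if m:
                    pct = int(m.group(1))
                    status = "FAIL" if pct > threshold else "PASS"
                    results[label] = (status, marker, pct)
                break  # at most one watched path per line
    return results


# Made-up sample input in the shape of `df -h` output (assumed format).
sample = [
    "/dev/sda1        9.8G  8.2G  1.6G  84% /",
    "/dev/sdb1         63G   12G   49G  20% /var/stv",
    "/dev/sdc1        4.8G  300M  4.3G   7% /var/zookeeper",
]
for label, (status, marker, pct) in check_disk_usage(sample).items():
    print(f"{label}: {status} ({marker} at {pct}%)")
```

Unlike the snippet above, this sketch does not stop at the first failure, so a full root filesystem and a full ZooKeeper partition would each be reported under their own label.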