CiscoDevNet / Hyperflex-Hypercheck

Perform pro-active self checks on your Hyperflex cluster to ensure stability and resiliency
MIT License

Add a new ControllerVM check to verify the smartctl for all the drives. #38

Open hsardana09 opened 3 years ago

hsardana09 commented 3 years ago

If STATUS is non-zero in the code below, the drive is bad and must be replaced before upgrading to 4.5. This check needs to be run on every controller VM (ctlvm) in the cluster.

Note: 'bad' drives must be removed even if they are already hard-blacklisted and not in use, because pre-upgrade validation performs the same check and will fail on them.


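# List whole disks only (-d) with full paths (-p) and no header (-n),
# excluding RAM disks, floppies, loop devices and CD-ROMs (majors 1,2,7,11).
for D in $(/bin/lsblk -dpn -e 1,2,7,11 | awk '{ print $1 }'); do
    case "$D" in
        *nvme*)
            # NVMe: take the critical_warning field from the SMART log.
            STATUS=$(/usr/sbin/nvme smart-log "$D" 2> /dev/null |
                     awk -F': ' '/critical_warning/ { print $NF }')
            ;;
        *)
            # Other drives: use smartctl's exit status from the health check.
            /usr/sbin/smartctl -q silent -H -i "$D"
            STATUS=$?
            ;;
    esac
    echo "$D: $STATUS"
done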
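
To turn the per-drive output into a single pass/fail result, the "DEVICE: STATUS" lines printed by the loop could be piped through a small filter. The sketch below is only an illustration: `report_bad_drives` is a hypothetical helper name, not part of Hypercheck, and it assumes that any status other than a literal `0` (including an empty one, for example when the nvme command produced no output) should be flagged.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hypercheck): reads "DEVICE: STATUS" lines
# on stdin, prints every device whose status is not exactly 0, and returns 1
# if at least one such device was seen (0 otherwise).
report_bad_drives() {
    awk -F': ' '
        # Assumption: an empty status is treated as bad and shown as "unknown".
        $2 != "0" {
            print "BAD: " $1 " (status " ($2 == "" ? "unknown" : $2) ")"
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

With the helper defined, the `for` loop above could be followed by `| report_bad_drives` after `done`, so that a non-zero return value on any controller VM signals that a drive needs attention before the upgrade.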