fedora-iot / greenboot

Generic Health Checking Framework for systemd
GNU Lesser General Public License v2.1
101 stars 29 forks source link

greenboot: add feature to disable healthchecks #137

Closed djach7 closed 7 months ago

djach7 commented 7 months ago

Addresses #119. Allows users to specify healthchecks to be skipped via a DISABLED_HEALTHCHECKS variable in greenboot.conf. Skipped healthchecks will be reflected with an appropriate message in the logs.

miabbott commented 7 months ago

I think this change also needs some validation that the script name(s) specified in DISABLED_HEALTHCHECKS actually exist in the different directories that are searched.

Some psuedo-valid bash code:

DISABLED_HEALTHCHECKS=("01_repository_dns_check.sh 02_watchdog.sh 03_this_script_doesnt_exist.sh")

SCRIPTS_DIRS=("/usr/lib/greenboot/check/required.d"
              "/usr/lib/greenboot/check/wanted.d"
              "/usr/lib/greenboot/green.d"
              "/usr/lib/greenboot/red.d"
              "/etc/greenboot/check/required.d"
              "/etc/greenboot/check/wanted.d"
              "/etc/greenboot/green.d"
              "/etc/greenboot/red.d")

for DISABLED_HEALTH_CHECK in "${DISABLED_HEALTH_CHECKS[@]}"; do
    for SCRIPT_DIR in "${SCRIPTS_DIRS[@]}"; do
        if [ -f "${SCRIPT_DIR}/${DISABLED_HEALTH_CHECK}" ]; then
            echo "DEBUG: Found script; will disable it"
            break
        else
            echo "WARNING: Could not find script named ${DISABLED_HEALTH_CHECK} in any expected location"
        fi
    done
    <do other things with disabling the script>
done
mmartinv commented 7 months ago

I think this change also needs some validation that the script name(s) specified in DISABLED_HEALTHCHECKS actually exist in the different directories that are searched.

Some psuedo-valid bash code:

DISABLED_HEALTHCHECKS=("01_repository_dns_check.sh 02_watchdog.sh 03_this_script_doesnt_exist.sh")

SCRIPTS_DIRS=("/usr/lib/greenboot/check/required.d"
              "/usr/lib/greenboot/check/wanted.d"
              "/usr/lib/greenboot/green.d"
              "/usr/lib/greenboot/red.d"
              "/etc/greenboot/check/required.d"
              "/etc/greenboot/check/wanted.d"
              "/etc/greenboot/green.d"
              "/etc/greenboot/red.d")

for DISABLED_HEALTH_CHECK in "${DISABLED_HEALTH_CHECKS[@]}"; do
    for SCRIPT_DIR in "${SCRIPTS_DIRS[@]}"; do
        if [ -f "${SCRIPT_DIR}/${DISABLED_HEALTH_CHECK}" ]; then
            echo "DEBUG: Found script; will disable it"
            break
        else
            echo "WARNING: Could not find script named ${DISABLED_HEALTH_CHECK} in any expected location"
        fi
    done
    <do other things with disabling the script>
done

Actually I don't think it's needed because the function is called with existing scripts only

miabbott commented 7 months ago

Actually I don't think it's needed because the function is called with existing scripts only

Smart!

miabbott commented 7 months ago

Actually I don't think it's needed because the function is called with existing scripts only

I still think there is a chance of a typo when configuring DISABLED_HEALTHCHECKS that we should be able to warn users about, right?

Even if the function is called on an existing script (i.e. 01_repository_dns_check.sh), if I have defined DISABLED_HEALTHCHECKS=("01_repostory_dns_checks.sh"), that script will still run because I used the wrong script name.

djach7 commented 7 months ago

@miabbott I added a line in the config comments about needing exact spelling. If you'd like me to go beyond that and write something more please let me know

say-paul commented 7 months ago

Apart from the config format, the changes looks good to me.

miabbott commented 7 months ago

@miabbott I added a line in the config comments about needing exact spelling. If you'd like me to go beyond that and write something more please let me know

It's a good first step, but I still think we should be able to warn the user if something in DISABLED_HEALTHCHECKS can't be found/doesn't exist. I don't want to block further progress on this implementation, so maybe you can do that kind of warning in a follow-up PR.