NVIDIA / container-canary

A tool for testing and validating container requirements against versioned manifests
Apache License 2.0
237 stars 15 forks source link

Implement check retry thresholds and interval #17

Closed jacobtomlinson closed 2 years ago

jacobtomlinson commented 2 years ago

This PR factors the execution of the individual checks out into a dispatcher. This allowed cleaner implementation of the retry mechanisms. The configurable timeout, delay, thresholds and interval have now been implemented.

checks:
  - name: uid
    description: User ID is 1234
    probe:
      exec:
        command: [...]
      initialDelaySeconds: 0  # Delay after starting the container before the check should be run
      timeoutSeconds: 30  # Overall timeout for the check
      successThreshold: 1  # Number of times the check must pass before moving on
      failureThreshold: 1  # Number of times the check is allowed to fail before giving up
      periodSeconds: 1  # Interval between runs if threasholds are >1

One benefit is that now instead of having a flaky 10 second delay to wait for the container to start for the validation examples we can set the failure threshold to some high number and have the check retry every second. This resulted in the test suite going from 14 seconds to 4 seconds.