replicatedhq / troubleshoot

Preflight Checks and Support Bundles Framework for Kubernetes Applications
https://troubleshoot.sh
Apache License 2.0

feat(fio): add option to disable runtime #1601

Closed emosbaugh closed 2 months ago

emosbaugh commented 2 months ago

Description, Motivation and Context

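Adds an option to disable fio's runtime limit, which is currently hardcoded to 120s, while maintaining backwards compatibility. In the test run below, the collector is configured with runTime: "0", and the logged fio command carries no --runtime flag.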
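
For illustration only, here is a minimal sketch of how the fio argument list could be built so that the runtime flag becomes optional. It is not the code from this PR: the fioOptions type, the buildFioArgs function, and the position of --runtime in the argument list are all assumptions. It also assumes that an unset runTime keeps the existing 120s default and that "0" disables the limit.

```go
package main

import (
	"fmt"
	"strings"
)

// fioOptions is a hypothetical stand-in for the collector spec fields;
// the real troubleshoot types may be named and shaped differently.
type fioOptions struct {
	Directory     string
	FileSizeBytes uint64
	OperationSize uint64
	Datasync      bool
	RunTime       string // "" = not set in the spec, "0" = no runtime limit
}

// buildFioArgs assembles the fio arguments. The --runtime flag is added
// only when the limit has not been disabled.
func buildFioArgs(o fioOptions) []string {
	args := []string{
		"--name=fsperf",
		fmt.Sprintf("--bs=%d", o.OperationSize),
		fmt.Sprintf("--directory=%s", o.Directory),
		"--rw=write",
		"--ioengine=sync",
	}
	if o.Datasync {
		args = append(args, "--fdatasync=1")
	}
	args = append(args, fmt.Sprintf("--size=%d", o.FileSizeBytes))

	switch o.RunTime {
	case "":
		// Assumed backwards-compatible default: keep the old 120s limit.
		args = append(args, "--runtime=120")
	case "0":
		// Limit disabled: omit --runtime so fio runs until the file is written.
	default:
		args = append(args, "--runtime="+o.RunTime)
	}
	return append(args, "--output-format=json")
}

func main() {
	args := buildFioArgs(fioOptions{
		Directory:     "/var/lib/k0s/etcd",
		FileSizeBytes: 22 << 20, // 22Mi = 23068672 bytes
		OperationSize: 2300,
		Datasync:      true,
		RunTime:       "0",
	})
	fmt.Println("fio " + strings.Join(args, " "))
}
```

With the values from the spec used in the test, the sketch prints the same fio command line that appears in the collector log.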

How I tested:

$ time sudo ./preflight spec.yaml --interactive=false -v10
I0822 21:36:59.772308    2879 loader.go:260] Loaded troubleshoot specs successfully
[filesystem-write-latency-etcd] Running collector...
I0822 21:36:59.772905    2879 host_filesystem_performance.go:439] collecting fio results: fio --name=fsperf --bs=2300 --directory=/var/lib/k0s/etcd --rw=write --ioengine=sync --fdatasync=1 --size=23068672 --output-format=json
I0822 21:39:16.230640    2879 result.go:108] Added "host-collectors/filesystemPerformance/filesystem-write-latency-etcd.json" to bundle output

   --- PASS Filesystem Write Latency
      --- Write latency is ok (p99 target < 10ms, actual: 9.240576ms)
--- PASS   ec-cluster-preflight
PASS

============ Collectors summary =============
Succeeded (S), eXcluded (X), Failed (F)
=============================================
filesystem-write-latency-etcd (S) : 136,458ms

============ Redactors summary =============
No redactors executed

============= Analyzers summary =============
Succeeded (S), eXcluded (X), Failed (F)
=============================================
Filesystem Write Latency (S) : 0ms

Duration: 136,461ms

real    2m16.525s
user    0m0.003s
sys     0m0.006s
$ cat spec.yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: ec-cluster-preflight
spec:
  collectors:
    - filesystemPerformance:
        collectorName: filesystem-write-latency-etcd
        timeout: 5m
        directory: /var/lib/k0s/etcd
        fileSize: 22Mi
        operationSize: 2300
        datasync: true
        runTime: "0"
  analyzers:
    - filesystemPerformance:
        checkName: Filesystem Write Latency
        collectorName: filesystem-write-latency-etcd
        outcomes:
          - pass:
              when: "p99 < 10ms"
              message: 'Write latency is ok (p99 target < 10ms, actual: {{ .P99 }})'
          - fail:
              message: 'Write latency is high (p99 target < 10ms, actual: {{ .String }})'
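
As a rough illustration of the analyzer outcome in the spec above, the sketch below checks a measured p99 latency against a condition of the form "p99 < 10ms". It is not the troubleshoot analyzer: the passes function and the measuredP99 variable are hypothetical, and only the single "p99 < duration" form is handled.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// measuredP99 is the p99 value reported in the test run (9.240576ms).
var measuredP99 = 9240576 * time.Nanosecond

// passes reports whether p99 satisfies a condition written as
// "p99 < <duration>". Any other form is rejected with an error.
func passes(when string, p99 time.Duration) (bool, error) {
	fields := strings.Fields(when)
	if len(fields) != 3 || fields[0] != "p99" || fields[1] != "<" {
		return false, fmt.Errorf("unsupported condition %q", when)
	}
	limit, err := time.ParseDuration(fields[2])
	if err != nil {
		return false, fmt.Errorf("parse threshold: %w", err)
	}
	return p99 < limit, nil
}

func main() {
	ok, err := passes("p99 < 10ms", measuredP99)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pass=%v actual=%s\n", ok, measuredP99)
}
```

For the value from the test run, the condition holds, which is consistent with the PASS result shown in the preflight output.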

Checklist

https://github.com/replicatedhq/troubleshoot.sh/pull/575

Does this PR introduce a breaking change?