replicatedhq / troubleshoot

Preflight Checks and Support Bundles Framework for Kubernetes Applications
https://troubleshoot.sh
Apache License 2.0
543 stars, 92 forks

ClusterResource Analyzer Improvements #1523

Closed: diamonwiggins closed this issue 3 months ago

diamonwiggins commented 4 months ago

Describe the rationale for the suggested feature.

Documentation Improvements

Currently, the ClusterResource analyzer documentation has only a single example, which is based on a PVC. More examples with different types of objects would help those using the analyzer. Also, the documentation for the kind property doesn't say which resources the ClusterResource analyzer supports. Lastly, for the resources that are supported, the naming isn't consistent: some are lowercase and others are uppercase, which underscores the need for better documentation.

https://troubleshoot.sh/docs/analyze/cluster-resource/

Code Improvements

The resource names accepted by the kind property shouldn't be case-sensitive.
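
As a rough illustration of what case-insensitive matching could look like, here is a minimal Go sketch. The kindAliases table and the normalizeKind helper are hypothetical names made up for this example; they are not taken from the troubleshoot codebase, and the entries are not the analyzer's actual list of supported kinds.

```go
package main

import (
	"fmt"
	"strings"
)

// kindAliases is a hypothetical alias table used only for illustration.
// It is NOT the list of kinds the ClusterResource analyzer supports.
var kindAliases = map[string]string{
	"pvc":                   "persistentvolumeclaim",
	"persistentvolumeclaim": "persistentvolumeclaim",
	"svc":                   "service",
	"service":               "service",
	"deployment":            "deployment",
}

// normalizeKind trims and lowercases the user-supplied kind before the
// lookup, so "Service", "service", and "SERVICE" resolve to the same entry.
func normalizeKind(kind string) (string, bool) {
	canonical, ok := kindAliases[strings.ToLower(strings.TrimSpace(kind))]
	return canonical, ok
}

func main() {
	for _, k := range []string{"PVC", "Service", "deployment", "Widget"} {
		canonical, ok := normalizeKind(k)
		fmt.Printf("%q -> %q (supported: %t)\n", k, canonical, ok)
	}
}
```

Normalizing once at the lookup boundary keeps every later comparison simple, and an unknown kind returns false so the caller can report it instead of silently matching nothing.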

sj-porter-knime commented 4 months ago

More examples in the docs would be very helpful!

The only thing we're planning to use clusterResource for so far is to verify that resources exist at all - I don't think there's a default way to do that (at least not a documented one), so we're doing weird things like...

    - clusterResource:
        checkName: knime-postgres-cluster
        kind: Service
        namespace: knime
        name: knime-postgres-cluster
        yamlPath: "metadata.name"
        regex: knime-postgres-cluster
        outcomes:
          - fail:
              when: "false"
              message: knime-postgres-cluster service is missing.
          - pass:
              when: "true"
              message: knime-postgres-cluster service is present.

...which essentially just checks that the .metadata.name property matches the resource.

The main goal is simply to ensure that expected resources are present in the cluster. In some cases, particularly if a Helm chart fails to install altogether, the support bundle will not detect that a resource is missing from the cluster.
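
For illustration only, the existence check described above can be sketched in Go as a lookup over already-collected resources. The resource struct and the resourcePresent helper are hypothetical and simplified; they do not reflect how troubleshoot stores or analyzes cluster resources.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// resource is a hypothetical, simplified stand-in for a collected object.
type resource struct {
	Kind      string
	Namespace string
	Name      string
}

// resourcePresent reports whether any collected resource has the given kind
// (compared case-insensitively) and namespace, and a name matching pattern.
func resourcePresent(collected []resource, kind, namespace, pattern string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	for _, r := range collected {
		if strings.EqualFold(r.Kind, kind) && r.Namespace == namespace && re.MatchString(r.Name) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// An empty slice models the case where the chart never installed.
	var collected []resource
	ok, err := resourcePresent(collected, "Service", "knime", "^knime-postgres-cluster$")
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}
	if ok {
		fmt.Println("knime-postgres-cluster service is present.")
	} else {
		fmt.Println("knime-postgres-cluster service is missing.")
	}
}
```

The pattern in this sketch is anchored with ^ and $ because an unanchored regular expression would also match longer names that merely contain knime-postgres-cluster.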

DexterYan commented 3 months ago

Hey @diamonwiggins,

Could you help review PR #1547? Thanks!