replicatedhq / troubleshoot

Preflight Checks and Support Bundles Framework for Kubernetes Applications
https://troubleshoot.sh
Apache License 2.0
543 stars 92 forks source link

Troubleshoot logs collector requires a label value #1518

Closed bagel-dawg closed 2 months ago

bagel-dawg commented 5 months ago

Bug Description

The following support bundle spec returns no logs for a variety of pods that each have an app label with different values.

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: support-bundle
spec:
  collectors:
  - logs:
      collectorName: pod-logs
      selector:
      - app
      name: logs
      limits:
        maxAge: 24h

Expected Behavior

The support bundle generated should include any pod that has the label app.

Steps To Reproduce

Create a support bundle with the above spec and run the following command against a cluster with pods with an app label with any value:

support-bundle .github/build/support-bundle.yaml \
  --interactive=false \
  --output=artifacts/support-bundle.tar.gz \
  --request-timeout=5m

Additional Context

Additionally, sbctl should be able to interact with these logs in the same way that they can with the current implementation. I often use sbctl in CICD as a debugging tool and it can be painful to keep up with multiple services.

xavpaice commented 4 months ago

We need to verify this, and at the very least improve the docs at https://troubleshoot.sh/docs/collect/logs/#example-collector-definition which aren't clear and consistent for the examples.

xavpaice commented 4 months ago

From a Slack conversation with Ryan:

in Kubernetes you can query labels with a key=value pair or by the existence of a label (ie: a label with an arbitrary value). I want to be able to select for pods that have a particular label name, with any value. Example with kubectl:

# Select pods with a specific key=value pair
[~ (⎈|cloud-development:core)]$ kubectl get pods -l pod-template-hash=57f45f6d56
NAME                                     READY   STATUS    RESTARTS   AGE
app-webhook-agent-57f45f6d56-s5l6j   1/1     Running   0          13h

# Select pods that have this label key, but with any value
[~ (⎈|cloud-development:core)]$ kubectl get pods -l pod-template-hash
NAME                                    READY   STATUS    RESTARTS AGE
app2-6b9ddc6ff7-n4m4f              1/1     Running   0        46h
app-api-7bd689f6d8-qt6wg           1/1     Running   0        46h
app-webhook-agent-57f45f6d56-s5l6j  1/1     Running   0        13h

I'd like to get this use case written into the collector, plus improve the docs.

xavpaice commented 4 months ago

Let's also figure out if logs from this collector are available via sbctl, if not then we should raise a bug over at https://github.com/replicatedhq/sbctl/issues

nvanthao commented 4 months ago

Hi @bagel-dawg,

I can't reproduce the issue. Please refer to this record: https://asciinema.org/a/jR3xbmgGTnbTLpkw2DsXTajnQ The existing code should already cater for key-only selector

https://github.com/replicatedhq/troubleshoot/blob/af4cc8a08a369acf4328ff595f5d0148c64cc6a3/pkg/collect/logs.go#L124-L130

xavpaice commented 2 months ago

since we can't reproduce, I'll close this one.