openshift / must-gather

A client tool for gathering information about an operator managed component.
Apache License 2.0

[RFE] Collect additional information about all pods in the cluster #339

Closed oarribas closed 1 year ago

oarribas commented 1 year ago

Collect in must-gather information similar to what oc describe nodes shows: CPU/memory requests and limits per pod, allocated resources on the nodes, and real resource usage by pods (maybe also by containers).

Currently, information like the following is shown in an oc describe nodes output:

[...]
Non-terminated Pods:                         (100 in total)
  Namespace                                  Name                                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                                  ----                                                          ------------  ----------  ---------------  -------------  ---
  axyz                                        daemonset-example-d4lvf                                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         29h
  axyz                                        deployment-example-862f9x79n-z27bg                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  axyz                                        job-example-mxg74                                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  axyz                                        replication-example-vhz2v                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  axyz                                        replication-example-yxz4c                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  axyz                                        replication-example-njcv2                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  zyxw                                      backend-worker-1-b2bl4                                        150m (4%)     1 (28%)     50Mi (0%)        300Mi (4%)     7h2m
  zyxw                                      system-sphinx-1-xnvz1                                         80m (2%)      1 (28%)     250Mi (3%)       512Mi (7%)     12h
[...]
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests          Limits
  --------           --------          ------
  cpu                3364m (96%)       9780m (279%)
  memory             7118172800 (99%)  9694Mi (141%)
  ephemeral-storage  0 (0%)            0 (0%)
  hugepages-1Gi      0 (0%)            0 (0%)
  hugepages-2Mi      0 (0%)            0 (0%)

But in a must-gather, not all namespaces/pods are collected.

That information could help identify overcommitted nodes, pods without requests/limits, etc.
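
For illustration, a jq filter along these lines (an untested sketch, using only standard Pod spec fields) could flag containers that set no requests or limits at all:

oc get pods -A --field-selector="status.phase!=Succeeded" -o json \
  | jq -c '.items[] | . as $pod | .spec.containers[]
      | select((.resources.requests // {}) == {} or (.resources.limits // {}) == {})
      | {namespace: $pod.metadata.namespace, pod: $pod.metadata.name, container: .name}'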

palonsoro commented 1 year ago

A good way to do this would be to parse oc get pod --all-namespaces -o json with jq to produce a reduced JSON summary (kudos to @gmeghnag for the idea).

Not sure if jq is already included in the must-gather image, but it is included in the OCP4 repos, so it shouldn't be a problem.

gmeghnag commented 1 year ago

Something like the following (needs to be tested to check that it is valid):

$ oc get pods -A --field-selector="status.phase!=Succeeded" -o json | jq '[.items[]| {"name": .metadata.name, node: .spec.nodeName, resources: .spec.containers[].resources}]' 

An example:

oc get pods -A --field-selector="status.phase!=Succeeded" -o json | jq '[.items[]| {"name": .metadata.name, node: .spec.nodeName, resources: .spec.containers[].resources}]'
[
  {
    "name": "openshift-apiserver-operator-85bc4dfdb4-zj6xn",
    "node": "ip-10-0-215-218.eu-central-1.compute.internal",
    "resources": {
      "requests": {
        "cpu": "10m",
        "memory": "50Mi"
      }
    }
  },
  {
    "name": "apiserver-6f8b7d589f-69kt4",
    "node": "ip-10-0-131-238.eu-central-1.compute.internal",
    "resources": {
      "requests": {
        "cpu": "100m",
        "memory": "200Mi"
      }
    }
  },
  ...

Or, if we want the same output filtered by node name, something like the following:

NODE=<NODE_NAME>
oc get pods -A --field-selector="status.phase!=Succeeded" -o json | jq --arg NODE "$NODE" '[.items[]| select(.spec.nodeName==$NODE) | {name: .metadata.name, node: .spec.nodeName, resources: .spec.containers[].resources}]'

palonsoro commented 1 year ago

You would want to include the namespace in that output. It is perfectly possible to have more than one pod with the same name, especially if they come from statefulsets or are created by some custom controller (or by hand).

palonsoro commented 1 year ago

For the rest, it looks fine.

palonsoro commented 1 year ago

I'd also suggest using the -c option of jq to produce compact output, and not wrapping the results inside an array. That way, one can use both jq and grep on the results (this is what the audit logs do, for reference).
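
For example (just a hypothetical usage; the file name is made up), with one JSON object per line you can grep first and parse afterwards:

oc get pods -A --field-selector="status.phase!=Succeeded" -o json \
  | jq -c '.items[] | {namespace: .metadata.namespace, name: .metadata.name, node: .spec.nodeName, resources: .spec.containers[].resources}' \
  > pods_resources.json
grep 'ip-10-0-131-238' pods_resources.json | jq '.resources.requests'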

gmeghnag commented 1 year ago

I updated the query to also display the containerName:

oc get pods -A --field-selector="status.phase!=Succeeded" -o json | jq --arg NODE "$NODE" '.items[]| select(.spec.nodeName==$NODE) | . as $pod | .spec.containers[] | {node: $pod.spec.nodeName, namespace: $pod.metadata.namespace,  podName: $pod.metadata.name, containerName: .name, resources: .resources}' -c

An example:

oc get pods -A --field-selector="status.phase!=Succeeded" -o json | jq --arg NODE "$NODE" '.items[]| select(.spec.nodeName==$NODE) | . as $pod | .spec.containers[] | {node: $pod.spec.nodeName, namespace: $pod.metadata.namespace,  podName: $pod.metadata.name, containerName: .name, resources: .resources}' -c | head -5
{"node":"ip-10-0-131-238.eu-central-1.compute.internal","namespace":"openshift-apiserver","podName":"apiserver-6f8b7d589f-69kt4","containerName":"openshift-apiserver","resources":{"requests":{"cpu":"100m","memory":"200Mi"}}}
{"node":"ip-10-0-131-238.eu-central-1.compute.internal","namespace":"openshift-apiserver","podName":"apiserver-6f8b7d589f-69kt4","containerName":"openshift-apiserver-check-endpoints","resources":{"requests":{"cpu":"10m","memory":"50Mi"}}}
{"node":"ip-10-0-131-238.eu-central-1.compute.internal","namespace":"openshift-authentication","podName":"oauth-openshift-59795457bf-sbg4n","containerName":"oauth-openshift","resources":{"requests":{"cpu":"10m","memory":"50Mi"}}}
{"node":"ip-10-0-131-238.eu-central-1.compute.internal","namespace":"openshift-cluster-csi-drivers","podName":"aws-ebs-csi-driver-controller-676777c46f-2cqn5","containerName":"csi-driver","resources":{"requests":{"cpu":"10m","memory":"50Mi"}}}
{"node":"ip-10-0-131-238.eu-central-1.compute.internal","namespace":"openshift-cluster-csi-drivers","podName":"aws-ebs-csi-driver-controller-676777c46f-2cqn5","containerName":"driver-kube-rbac-proxy","resources":{"requests":{"cpu":"10m","memory":"20Mi"}}}

soltysh commented 1 year ago

> But in a must-gather, not all namespaces/pods are collected.

This is intentional: we focus only on control-plane related data, which is required to diagnose the cluster state and help our customers resolve the problem. Also, collecting any kind of data spanning all namespaces would risk exposing Personally Identifiable Information, which we would then be required to remove from the collected data set, and that isn't a trivial task to undertake.

Lastly, every piece of data we scrape increases the overall size of the archive. In a cluster with a few nodes that isn't a big deal, but once you reach clusters with hundreds or thousands of nodes, the extra few bytes make a significant difference. This forces us to justify every addition in terms of the balance between how much data we have to gather every time and what data we can request in follow-up engagements with our customers.

> That information could help identify overcommitted nodes, pods without requests/limits, etc.

That is a valid use case, but with the capabilities OpenShift currently has, that kind of information would be much better exposed in OpenShift Insights, which, based on cluster metrics, could suggest actions a user might take to improve the stability and availability of their cluster.

Based on the above, as well as other information presented in this issue, I'm closing this as won't fix.

/close

openshift-ci[bot] commented 1 year ago

@soltysh: Closing this issue.

In response to [this](https://github.com/openshift/must-gather/issues/339#issuecomment-1346615958).

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.