Closed sj-williams closed 1 month ago
Completed the first draft of the user guide.
Next action items:
We can use the commands below to get a usage snapshot of the cluster, sorted in descending order.
# Sort by CPU usage
kubectl top pods --all-namespaces --no-headers | sort -k3 -nr
# Sort by Memory usage
kubectl top pods --all-namespaces --no-headers | sort -k4 -nr
The monitoring namespace consumed the most memory.
On the user side, data-platform-app-prison-network-app-prod consumed the most memory.
monitoring prometheus-prometheus-operator-kube-p-prometheus-0 3015m 173512Mi
monitoring prometheus-prometheus-operator-kube-p-prometheus-2 8412m 172600Mi
monitoring prometheus-prometheus-operator-kube-p-prometheus-1 2978m 170763Mi
data-platform-app-prison-network-app-prod data-platform-app-prison-network-app-prod-54bf644b-gnjnd 9m 7660Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-ltdlq 804m 6700Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-n2gkj 1148m 6658Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-nwzv2 1174m 6273Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-vdq24 1128m 6236Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-52fdx 1211m 6134Mi
ingress-controllers nginx-ingress-default-controller-5f98cd7f5-4nkjb 891m 6058Mi
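The per-pod listing above can also be rolled up per namespace, which makes the "which namespace consumed the most memory" comparison direct. The following is a minimal sketch, not an agreed procedure: the function name sum_memory_by_namespace is hypothetical, and it assumes the four-column layout shown above (namespace, pod, CPU, memory) with memory always reported in Mi.

```shell
# Hypothetical helper: sum memory (Mi) per namespace from the output of
# `kubectl top pods --all-namespaces --no-headers` read on stdin.
# Assumes columns NAMESPACE NAME CPU MEMORY and memory values ending in "Mi".
sum_memory_by_namespace() {
  awk '{ sub(/Mi$/, "", $4); total[$1] += $4 }
       END { for (ns in total) printf "%s %dMi\n", ns, total[ns] }' |
    sort -k2,2 -nr   # largest namespace total first
}

# Usage against a live cluster (needs metrics-server):
# kubectl top pods --all-namespaces --no-headers | sum_memory_by_namespace
```

Applied to only the ten rows shown in the snapshot, this would report 516875Mi for monitoring, 38059Mi for ingress-controllers and 7660Mi for data-platform-app-prison-network-app-prod.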
Background
We need a clear and prominent entry in the User Guide that outlines general considerations and guidelines for workload requests / limits.
This gives us a reference to point to, so that we don't invite resource-intensive, non-scalable monolithic workloads.
There are also related things we should put in place or investigate to get a good picture of the current state of requests and limits in the cluster. Things to think about:
Proposed user journey
Approach
Which part of the user docs does this impact
Communicate changes
Questions / Assumptions
Definition of done