karbyshevds closed this issue 4 years ago
This data is already collected. An example from GCP is below:
container_cpu_usage_seconds_total{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_instance_type="n1-highcpu-8",beta_kubernetes_io_os="linux",cloud_google_com_gke_nodepool="training",cloud_google_com_gke_os_distribution="cos",cloud_google_com_gke_preemptible="true",container="step-setup",cpu="total",failure_domain_beta_kubernetes_io_region="us-east1",failure_domain_beta_kubernetes_io_zone="us-east1-b",id="/kubepods/burstable/pod36704787-dbb8-40bf-8c4a-778a755fb3a1/7d6ecf946cbebb4066ee2be8131fa7b60dd744fe13384cb0ed224c37be12042b",image="gcr.io/or2-msq-epmd-legn-t1iylu/odahu/odahu-flow-model-trainer@sha256:0a88f9889ac6e59fc55c837e6928fb21a4ae4eef42c7ef8f54b4bf8c387349e6",instance="gke-gke-dev02-training-0c10360b-6q4n",job="kubernetes-cadvisor",kubernetes_io_arch="amd64",kubernetes_io_hostname="gke-gke-dev02-training-0c10360b-6q4n",kubernetes_io_os="linux",mode="odahu-flow-training",name="k8s_step-setup_wine-mlflow-not-default-pod-h77pr_odahu-flow-training_36704787-dbb8-40bf-8c4a-778a755fb3a1_0",namespace="odahu-flow-training",pod="wine-mlflow-not-default-pod-h77pr",project="odahu-flow"}
- beta_kubernetes_io_instance_type - node type
- cloud_google_com_gke_nodepool - node pool name
- cloud_google_com_gke_preemptible - preemptible status
- instance - instance name
- name - full container name, which includes the pod name (the pod label holds the pod name on its own)
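To make the label mapping above concrete, here is a minimal Python sketch that splits a series such as the one in the example into its metric name and labels, then picks out the fields listed. The helper names `parse_sample` and `node_summary` are hypothetical, and the sample string is a shortened copy of the series shown above.

```python
import re
from typing import Dict, Tuple

# Matches one label pair such as key="value"; values may contain escaped quotes.
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(series: str) -> Tuple[str, Dict[str, str]]:
    """Split 'metric{k="v",...}' into the metric name and a label dict."""
    name, _, rest = series.partition("{")
    return name.strip(), dict(_LABEL_RE.findall(rest))


def node_summary(labels: Dict[str, str]) -> Dict[str, object]:
    """Pick out the labels described in the list above (hypothetical helper)."""
    return {
        "node_type": labels.get("beta_kubernetes_io_instance_type"),
        "node_pool": labels.get("cloud_google_com_gke_nodepool"),
        "preemptible": labels.get("cloud_google_com_gke_preemptible") == "true",
        "instance": labels.get("instance"),
        "pod": labels.get("pod"),
    }


# Shortened copy of the series from the example; most labels are omitted.
sample = (
    'container_cpu_usage_seconds_total{'
    'beta_kubernetes_io_instance_type="n1-highcpu-8",'
    'cloud_google_com_gke_nodepool="training",'
    'cloud_google_com_gke_preemptible="true",'
    'instance="gke-gke-dev02-training-0c10360b-6q4n",'
    'pod="wine-mlflow-not-default-pod-h77pr"}'
)

metric, labels = parse_sample(sample)
summary = node_summary(labels)
```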
As a DevOps engineer, I want to collect data on the resources consumed by ODAHU containers (training, deployment, packaging). At a minimum, this should cover CPU and RAM.
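As a rough illustration of how CPU and RAM usage could be queried per pod, the Python sketch below builds PromQL strings. It rests on assumptions: `container_cpu_usage_seconds_total` is the cAdvisor metric shown in this issue, while `container_memory_working_set_bytes` is a standard cAdvisor metric that does not appear here. The namespace `odahu-flow-training` comes from the sample series; the namespaces for deployment and packaging are not shown, so they must be passed in. The function names are hypothetical.

```python
def cpu_query(namespace: str, window: str = "5m") -> str:
    """PromQL for CPU cores used per pod and container over a rate window."""
    return (
        "sum by (pod, container) ("
        f'rate(container_cpu_usage_seconds_total{{namespace="{namespace}"}}[{window}])'
        ")"
    )


def memory_query(namespace: str) -> str:
    """PromQL for working-set memory in bytes per pod and container.

    Assumes the cAdvisor metric container_memory_working_set_bytes is scraped.
    """
    return (
        "sum by (pod, container) ("
        f'container_memory_working_set_bytes{{namespace="{namespace}"}}'
        ")"
    )


# Namespace taken from the sample series in this issue.
training_cpu = cpu_query("odahu-flow-training")
training_mem = memory_query("odahu-flow-training")
```

The resulting strings can be pasted into the Prometheus expression browser or used in a dashboard panel.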