openstack-test opened this issue 1 year ago
crane version: helm chart scheduler-0.2.2, k8s version: 1.24
A single k8s node is used, 16 cores / 32 GB. Node load annotations:
```
Annotations:  alpha.kubernetes.io/provided-node-ip: 172.30.64.34
              cpu_usage_avg_5m: 0.63012,2023-10-17T15:04:32Z
              cpu_usage_max_avg_1d: 0.63666,2023-10-17T14:03:36Z
              cpu_usage_max_avg_1h: 0.63654,2023-10-17T15:01:29Z
              mem_usage_avg_5m: 0.21519,2023-10-17T15:04:34Z
              mem_usage_max_avg_1d: 0.21614,2023-10-17T14:02:41Z
              mem_usage_max_avg_1h: 0.21700,2023-10-17T15:01:53Z
              node.alpha.kubernetes.io/ttl: 0
              node_hot_value: 0,2023-10-17T15:04:34Z
```
Node requests:
```
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests       Limits
  --------                    --------       ------
  cpu                         15367m (96%)   1770m (11%)
  memory                      25943Mi (91%)  1500Mi (5%)
  ephemeral-storage           0 (0%)         0 (0%)
  hugepages-1Gi               0 (0%)         0 (0%)
  hugepages-2Mi               0 (0%)         0 (0%)
  attachable-volumes-aws-ebs  0              0
```
A test service is created with 8 pod replicas. Each pod runs a stress workload that generates roughly 2 cores / 1 GB of actual load, with requests of 3 cores / 5 GiB:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-nginx
  namespace: demo
  labels:
    app: demo-nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: demo-nginx
  template:
    metadata:
      labels:
        app: demo-nginx
    spec:
      schedulerName: crane-scheduler
      containers:
      - name: demo-nginx
        image: xxxxxx/stress:latest
        command: ["stress", "-c", "1", "--vm", "1", "--vm-bytes", "1G", "--vm-keep"]
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 3
            memory: 5Gi
```
Only 5 pods are actually running; at least 6 should be able to run:
```
$ kgp -A -o wide | grep demo-nginx
demo   demo-nginx-69db9d45df-4j2rh   1/1   Running   0   3h20m   172.30.64.199   ip-172-30-64-34.ap-northeast-1.compute.internal   <none>   <none>
demo   demo-nginx-69db9d45df-4jc5h   0/1   Pending   0   3h20m   <none>          <none>                                            <none>   <none>
demo   demo-nginx-69db9d45df-6p4jz   0/1   Pending   0   3h20m   <none>          <none>                                            <none>   <none>
demo   demo-nginx-69db9d45df-7fdn2   1/1   Running   0   3h20m   172.30.64.111   ip-172-30-64-34.ap-northeast-1.compute.internal   <none>   <none>
demo   demo-nginx-69db9d45df-b75mz   1/1   Running   0   3h20m   172.30.64.78    ip-172-30-64-34.ap-northeast-1.compute.internal   <none>   <none>
demo   demo-nginx-69db9d45df-vsp6g   1/1   Running   0   3h20m   172.30.64.97    ip-172-30-64-34.ap-northeast-1.compute.internal   <none>   <none>
demo   demo-nginx-69db9d45df-xxrsb   1/1   Running   0   3h20m   172.30.64.10    ip-172-30-64-34.ap-northeast-1.compute.internal   <none>   <none>
demo   demo-nginx-69db9d45df-zgkjr   0/1   Pending   0   8m56s   <none>          <none>                                            <none>   <none>
```
Predicate configuration:
```
$ k get cm dynamic-scheduler-policy -n crane-system -o yaml
apiVersion: v1
data:
  policy.yaml: |
    apiVersion: scheduler.policy.crane.io/v1alpha1
    kind: DynamicSchedulerPolicy
    spec:
      syncPolicy:
        ##cpu usage
        - name: cpu_usage_avg_5m
          period: 3m
        - name: cpu_usage_max_avg_1h
          period: 15m
        - name: cpu_usage_max_avg_1d
          period: 3h
        ##memory usage
        - name: mem_usage_avg_5m
          period: 3m
        - name: mem_usage_max_avg_1h
          period: 15m
        - name: mem_usage_max_avg_1d
          period: 3h
      predicate:
        ##cpu usage
        - name: cpu_usage_avg_5m
          maxLimitPecent: 0.90
        - name: cpu_usage_max_avg_1h
          maxLimitPecent: 0.95
        ##memory usage
        - name: mem_usage_avg_5m
          maxLimitPecent: 0.90
        - name: mem_usage_max_avg_1h
          maxLimitPecent: 0.95
      priority:
        ###score = sum(() * weight) / len, 0 <= score <= 10
        ##cpu usage
        - name: cpu_usage_avg_5m
          weight: 0.2
        - name: cpu_usage_max_avg_1h
          weight: 0.3
        - name: cpu_usage_max_avg_1d
          weight: 0.5
        ##memory usage
        - name: mem_usage_avg_5m
          weight: 0.2
        - name: mem_usage_max_avg_1h
          weight: 0.3
        - name: mem_usage_max_avg_1d
          weight: 0.5
      hotValue:
        - timeRange: 5m
          count: 20
        - timeRange: 1m
          count: 10
```
crane-scheduler schedules pods based on the node's actual load. The node's memory load is 0.21 and CPU load is 0.63, and none of the predicate thresholds have been exceeded, yet only 5 pods are running. Based on the node's remaining memory of about 25 GB ((1-0.21)×32) and remaining CPU of about 5.9 cores ((1-0.63)×16), at least 6 pods should be able to run. Why are only 5 scheduled?
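For reference, a minimal sketch of how a load-based predicate like the one configured above could compare the node annotations against `maxLimitPecent` (my own illustration, not crane's actual code; the `threshold` type and helper names are made up):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// threshold mirrors one predicate entry from the policy above
// (metric name + maxLimitPecent); the type is illustrative only.
type threshold struct {
	Name           string
	MaxLimitPecent float64
}

// usageFromAnnotation parses values like "0.63012,2023-10-17T15:04:32Z".
func usageFromAnnotation(v string) (float64, error) {
	parts := strings.SplitN(v, ",", 2)
	return strconv.ParseFloat(parts[0], 64)
}

// nodeOverloaded returns true if any configured metric exceeds its threshold,
// in which case a load-aware filter would reject the node.
func nodeOverloaded(annotations map[string]string, thresholds []threshold) bool {
	for _, t := range thresholds {
		raw, ok := annotations[t.Name]
		if !ok {
			continue
		}
		usage, err := usageFromAnnotation(raw)
		if err != nil {
			continue
		}
		if usage > t.MaxLimitPecent {
			return true
		}
	}
	return false
}

func main() {
	// Values copied from the node annotations above.
	annotations := map[string]string{
		"cpu_usage_avg_5m":     "0.63012,2023-10-17T15:04:32Z",
		"cpu_usage_max_avg_1h": "0.63654,2023-10-17T15:01:29Z",
		"mem_usage_avg_5m":     "0.21519,2023-10-17T15:04:34Z",
		"mem_usage_max_avg_1h": "0.21700,2023-10-17T15:01:53Z",
	}
	thresholds := []threshold{
		{"cpu_usage_avg_5m", 0.90},
		{"cpu_usage_max_avg_1h", 0.95},
		{"mem_usage_avg_5m", 0.90},
		{"mem_usage_max_avg_1h", 0.95},
	}
	// Prints false: 0.63 < 0.90 and 0.21 < 0.90, so the load-based
	// predicate alone would not reject this node.
	fmt.Println("filtered by load thresholds:", nodeOverloaded(annotations, thresholds))
}
```

By these numbers the load-based filter passes, so the restriction must be coming from somewhere else.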
The in-tree predicates still check whether the resource requests fit: 3 (cpu request) × 6 (replicas) = 18 cores, but your node only has 16 cores.
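To make that concrete, here is a rough sketch of the request-based fit arithmetic (illustrative only, not the actual NodeResourcesFit plugin code; the ~367m of non-demo requests is inferred from the 15367m total shown in the node output above):

```go
package main

import "fmt"

func main() {
	// Figures taken from the node and deployment above.
	allocatableCPU := 16000 // millicores on the 16-core node (approx.)
	systemRequests := 367   // millicores requested by non-demo pods: 15367m - 5*3000m
	podRequest := 3000      // millicores requested per demo-nginx pod

	fit := 0
	used := systemRequests
	for used+podRequest <= allocatableCPU {
		used += podRequest
		fit++
	}
	// Prints 5: 367m + 5*3000m = 15367m fits, but a 6th pod would need
	// 18367m > 16000m, so it stays Pending regardless of measured load.
	fmt.Printf("pods that fit by CPU requests: %d (used %dm of %dm)\n", fit, used, allocatableCPU)
}
```

So the 6th replica is rejected by the request-based fit check even though the measured load leaves headroom; as the reply above notes, crane-scheduler's load-aware filtering runs in addition to, not instead of, the default request-based predicates.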