aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

chore: Add prometheus metrics for AWS client calls #6421

Closed: jonathan-innis closed this pull request 4 days ago

jonathan-innis commented 5 days ago

Fixes #N/A

Description

Add Prometheus metrics for AWS SDK Go calls. This uses https://github.com/jonathan-innis/aws-sdk-go-prometheus to hydrate client-side metrics into the AWS SDK Go calls until the upstream aws-sdk-go or aws-sdk-go-v2 supports metrics out of the box.

See: https://github.com/aws/aws-sdk-go-v2/issues/1744
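
To illustrate the counting idea only, here is a minimal stdlib-only sketch. It is not how aws-sdk-go-prometheus or the Prometheus client library is implemented; `RequestCounter`, `Observe`, and `Render` are hypothetical names introduced for this example. It assumes each completed SDK call is reported once with its service, action, and HTTP status code, which are the three labels on the metric in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// requestKey is the label set used by aws_sdk_go_request_total in this PR.
type requestKey struct {
	Service, Action, Code string
}

// RequestCounter is a hypothetical stand-in for a labeled Prometheus counter.
type RequestCounter struct {
	mu     sync.Mutex
	counts map[requestKey]uint64
}

func NewRequestCounter() *RequestCounter {
	return &RequestCounter{counts: map[requestKey]uint64{}}
}

// Observe records one completed request for the given label values.
func (c *RequestCounter) Observe(service, action, code string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[requestKey{service, action, code}]++
}

// Render returns one sample line per label set, sorted for stable output.
func (c *RequestCounter) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.counts))
	for k, n := range c.counts {
		lines = append(lines, fmt.Sprintf(
			"aws_sdk_go_request_total{action=%q,code=%q,service=%q} %d",
			k.Action, k.Code, k.Service, n))
	}
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

func main() {
	c := NewRequestCounter()
	c.Observe("EC2", "CreateFleet", "200")
	c.Observe("EC2", "CreateFleet", "200")
	c.Observe("EC2", "DescribeInstances", "400")
	fmt.Println(c.Render())
}
```

Keeping the status code as a label is what lets failed calls show up as their own series instead of being folded into a single total.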

# HELP aws_sdk_go_request_total The total number of AWS SDK Go requests
# TYPE aws_sdk_go_request_total counter
aws_sdk_go_request_total{action="CreateFleet",code="200",service="EC2"} 12
aws_sdk_go_request_total{action="CreateLaunchTemplate",code="200",service="EC2"} 48
aws_sdk_go_request_total{action="CreateTags",code="200",service="EC2"} 24
aws_sdk_go_request_total{action="DeleteLaunchTemplate",code="200",service="EC2"} 24
aws_sdk_go_request_total{action="DeleteMessage",code="200",service="SQS"} 89
aws_sdk_go_request_total{action="DescribeCluster",code="200",service="EKS"} 1
aws_sdk_go_request_total{action="DescribeImages",code="200",service="EC2"} 27
aws_sdk_go_request_total{action="DescribeInstanceTypeOfferings",code="200",service="EC2"} 3
aws_sdk_go_request_total{action="DescribeInstanceTypes",code="200",service="EC2"} 8
aws_sdk_go_request_total{action="DescribeInstanceTypes",code="412",service="EC2"} 1
aws_sdk_go_request_total{action="DescribeInstances",code="200",service="EC2"} 356
aws_sdk_go_request_total{action="DescribeInstances",code="400",service="EC2"} 13
aws_sdk_go_request_total{action="DescribeLaunchTemplates",code="200",service="EC2"} 1
aws_sdk_go_request_total{action="DescribeLaunchTemplates",code="400",service="EC2"} 48
aws_sdk_go_request_total{action="DescribeSecurityGroups",code="200",service="EC2"} 27
aws_sdk_go_request_total{action="DescribeSpotPriceHistory",code="200",service="EC2"} 3
aws_sdk_go_request_total{action="DescribeSubnets",code="200",service="EC2"} 27
aws_sdk_go_request_total{action="GetInstanceProfile",code="200",service="IAM"} 58
aws_sdk_go_request_total{action="GetParameter",code="200",service="SSM"} 108
aws_sdk_go_request_total{action="GetProducts",code="200",service="Pricing"} 9
aws_sdk_go_request_total{action="GetQueueUrl",code="200",service="SQS"} 1
aws_sdk_go_request_total{action="ReceiveMessage",code="200",service="SQS"} 133
aws_sdk_go_request_total{action="TerminateInstances",code="200",service="EC2"} 5
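
One way to read output like the above is to sum the non-2xx samples per call. The following is a rough sketch, not part of this PR: `ErrorCounts` is a hypothetical helper, and it assumes the label order (action, code, service) and integer values exactly as they appear in the sample output. A general Prometheus parser would need to handle arbitrary label order and float values.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// sampleRe matches one aws_sdk_go_request_total sample line in the form
// shown in the PR description (labels: action, code, service).
var sampleRe = regexp.MustCompile(
	`^aws_sdk_go_request_total\{action="([^"]*)",code="([^"]*)",service="([^"]*)"\} (\d+)$`)

// ErrorCounts sums requests whose status code does not start with "2",
// keyed by "service/action".
func ErrorCounts(exposition string) (map[string]int, error) {
	out := map[string]int{}
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		m := sampleRe.FindStringSubmatch(strings.TrimSpace(sc.Text()))
		if m == nil {
			continue // comment lines, other metrics, stray output
		}
		n, err := strconv.Atoi(m[4])
		if err != nil {
			return nil, err
		}
		if !strings.HasPrefix(m[2], "2") {
			out[m[3]+"/"+m[1]] += n
		}
	}
	return out, sc.Err()
}

func main() {
	sample := `# TYPE aws_sdk_go_request_total counter
aws_sdk_go_request_total{action="DescribeInstances",code="200",service="EC2"} 356
aws_sdk_go_request_total{action="DescribeInstances",code="400",service="EC2"} 13
aws_sdk_go_request_total{action="DescribeLaunchTemplates",code="400",service="EC2"} 48`
	errs, err := ErrorCounts(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(errs["EC2/DescribeInstances"], errs["EC2/DescribeLaunchTemplates"])
}
```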

How was this change tested?

I deployed Prometheus and the updated version with these Prometheus metrics to my cluster and validated that the metrics are returned on the Prometheus metrics endpoint.
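
A check along these lines could be scripted. This is a sketch under assumptions, not the procedure actually used for this PR: the URL `http://localhost:8080/metrics` is a hypothetical address (for example, after port-forwarding the controller's metrics port), and `HasMetric` and `fetch` are names introduced here for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// HasMetric reports whether the exposition text contains at least one sample
// line for the named metric. "# HELP" and "# TYPE" comment lines do not count.
func HasMetric(exposition, name string) bool {
	for _, line := range strings.Split(exposition, "\n") {
		// A sample line starts with the metric name followed by '{' or ' '.
		if strings.HasPrefix(line, name+"{") || strings.HasPrefix(line, name+" ") {
			return true
		}
	}
	return false
}

// fetch downloads the body of a metrics endpoint as a string.
func fetch(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	url := "http://localhost:8080/metrics" // hypothetical; pass the real URL as the first argument
	if len(os.Args) > 1 {
		url = os.Args[1]
	}
	body, err := fetch(url)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("aws_sdk_go_request_total present:", HasMetric(body, "aws_sdk_go_request_total"))
}
```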

Does this change impact docs?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify[bot] commented 5 days ago

Deploy Preview for karpenter-docs-prod canceled.

| Name | Link |
| --- | --- |
| Latest commit | a5e211329de44443ea10b7fe2d9ac1020befbd8d |
| Latest deploy log | https://app.netlify.com/sites/karpenter-docs-prod/deploys/667f581ef1cc140008a84b05 |

coveralls commented 5 days ago

Pull Request Test Coverage Report for Build 9703555317

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 2 | 0.0% |

Totals
Change from base Build 9691760501: 0.0%
Covered Lines: 5787
Relevant Lines: 7315

💛 - Coveralls

coveralls commented 5 days ago

Pull Request Test Coverage Report for Build 9704275993

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 2 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| pkg/webhooks/webhooks.go | 3 | 0.0% |

Totals
Change from base Build 9704271327: 0.01%
Covered Lines: 5785
Relevant Lines: 7307

💛 - Coveralls

coveralls commented 5 days ago

Pull Request Test Coverage Report for Build 9704280294

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 2 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| pkg/webhooks/webhooks.go | 3 | 0.0% |

Totals
Change from base Build 9704271327: 0.0%
Covered Lines: 5784
Relevant Lines: 7307

💛 - Coveralls

coveralls commented 5 days ago

Pull Request Test Coverage Report for Build 9704318474

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 2 | 0.0% |

Totals
Change from base Build 9704271327: 0.01%
Covered Lines: 5785
Relevant Lines: 7307

💛 - Coveralls

github-actions[bot] commented 5 days ago

Snapshot successfully published to oci://021119463062.dkr.ecr.us-east-1.amazonaws.com/karpenter/snapshot/karpenter:0-4fb047386122c42c021c401c3386b885c26def90. To install, you must log in to the ECR repo with an AWS account:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 021119463062.dkr.ecr.us-east-1.amazonaws.com

helm upgrade --install karpenter oci://021119463062.dkr.ecr.us-east-1.amazonaws.com/karpenter/snapshot/karpenter --version "0-4fb047386122c42c021c401c3386b885c26def90" --namespace "kube-system" --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
coveralls commented 4 days ago

Pull Request Test Coverage Report for Build 9720156565

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 3 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| pkg/operator/operator.go | 7 | 9.26% |

Totals
Change from base Build 9720043516: 0.0%
Covered Lines: 5786
Relevant Lines: 7386

💛 - Coveralls

coveralls commented 4 days ago

Pull Request Test Coverage Report for Build 9720163627

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pkg/operator/operator.go | 0 | 2 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| pkg/providers/amifamily/ami.go | 1 | 90.56% |

Totals
Change from base Build 9720043516: -0.01%
Covered Lines: 5785
Relevant Lines: 7386

💛 - Coveralls