Description of your changes
This PR introduces three AWS API call counters:
API calls that couldn't be completed because of a connection error are not counted. API calls that return API errors (or no errors) are counted. There is no comprehensive AWS documentation on request rate limits, but here are two resources on the topic, for reference:
This PR also removes unsafe pointer operations, as described in https://github.com/upbound/terraform-provider-aws/pull/196.
Alternatives considered
During the design phase, we investigated whether implementing an http.RoundTripper would be a good solution. Ideally, we would have a common implementation for AWS SDK v1 and v2, since both SDKs use an http.Client under the hood. A RoundTripper implementation proved infeasible for the following reasons:
Plugging in a RoundTripper to the client returned by AWSClient.HTTPClient() worked for AWS SDK v1 calls, but not for AWS SDK v2 calls.
Unlike AWS SDK v2, AWS SDK v1 doesn't store the service ID (EC2, IAM, etc.) and operation name (DescribeVpcs, etc.) in the request context. Therefore, we wouldn't be able to label v1 calls by service ID and operation name.
Checklist
I have:
[ ] Run make reviewable to ensure this PR is ready for review.
[ ] Added backport release-x.y labels to auto-backport this PR if necessary.
I couldn't run make reviewable, because my local Terraform setup is broken.
How has this code been tested
I've tested the code manually using the resource configuration below, which contains resources that use AWS SDK v1 and v2, as of this writing. Because Upjet ships with a Prometheus client, Upjet-based providers serve their metrics at :8080/metrics by default. Here's a sample excerpt after applying the resource configuration:
# HELP upjet_resource_external_api_calls The number of external API calls.
# TYPE upjet_resource_external_api_calls counter
upjet_resource_external_api_calls{service="EC2",service_operation="AuthorizeSecurityGroupIngress"} 1
upjet_resource_external_api_calls{service="EC2",service_operation="CreateSecurityGroup"} 1
upjet_resource_external_api_calls{service="EC2",service_operation="CreateTags"} 1
upjet_resource_external_api_calls{service="EC2",service_operation="CreateVpc"} 1
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeNetworkAcls"} 3
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeRouteTables"} 3
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeSecurityGroupRules"} 5
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeSecurityGroups"} 11
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeVpcAttribute"} 9
upjet_resource_external_api_calls{service="EC2",service_operation="DescribeVpcs"} 4
upjet_resource_external_api_calls{service="EC2",service_operation="RevokeSecurityGroupEgress"} 2
upjet_resource_external_api_calls{service="STS",service_operation="GetCallerIdentity"} 1
I manually cross-checked the reported counts with the calls reported by CloudTrail Event History. Note that CloudTrail Event History may take up to a few minutes to show the latest calls.
To test connection errors, I put breakpoints in the code, shut down my Internet connection upon hitting the breakpoint, and then resumed execution. To test API errors, I tried to delete a VPC that has a Security Group configured.
Resource Configuration