grafana / terraform-provider-grafana

Terraform Grafana provider
https://www.terraform.io/docs/providers/grafana/
Mozilla Public License 2.0

[Bug]: provider fails after received 429 TooManyRequests #1874

Open xe-leon opened 3 weeks ago

xe-leon commented 3 weeks ago

Terraform Version

1.9.6

Terraform Grafana Provider Version

3.10.0

Grafana Version

Grafana Cloud (steady)

Affected Resource(s)

All Grafana OnCall resources and data sources (they share the OnCall API client)

Terraform Configuration Files

provider "grafana" {
  oncall_url          = "https://oncall-prod-eu-west-0.grafana.net/oncall"
  oncall_access_token = "oncall-api-key"
  retry_wait          = 60
  retries             = 15
}

Expected Behavior

With a few hundred resources or more, the provider should handle API rate limits gracefully, honoring the retry_wait and retries options.

Actual Behavior

During terraform plan and apply, the following happens:

  1. Refreshing and reading data may involve some retries: data.grafana_oncall_user.user[7]: Still reading... [2m10s elapsed]
  2. After around 3 minutes, reading fails after 6 attempts (ignoring my retry_wait = 60 and retries = 15 options):
    │ Error: GET https://oncall-prod-eu-west-0.grafana.net/oncall/api/v1/users/?username=*** giving up after 6 attempt(s)
    │ 
    │   with data.grafana_oncall_user.user[7],
    │   on main.tf line 52, in data "grafana_oncall_user" "user":
    │   52: data "grafana_oncall_user" "user" {
    │ 
    │ Error: GET https://oncall-prod-eu-west-0.grafana.net/oncall/api/v1/escalation_policies/***/ giving up after 6 attempt(s)
    │ 
    │   with grafana_oncall_escalation.escalation,
    │   on main.tf line 403, in resource "grafana_oncall_escalation" "escalation":
    │  403: resource "grafana_oncall_escalation" "escalation" {

Steps to Reproduce

  1. Create a Terraform file with a few hundred resources or data sources
  2. Run terraform plan using a single API token (you may need to run it a few times to hit the rate limits)

Important Factoids

I expect the provider to take Grafana Cloud rate limits into account.

References

Duologic commented 1 week ago

> all resources and data sources

Not all resources use the same API/client to connect, so this probably doesn't affect all resources. From the example it seems to affect OnCall resources, which use this client: https://github.com/grafana/terraform-provider-grafana/blob/main/internal/resources/oncall/data_source_user.go#L7

That client might not properly take these wait/retry values into account.

xe-leon commented 1 week ago

That sounds correct, thank you! I've edited the issue.