microsoft / terraform-provider-azuredevops

Terraform Azure DevOps provider
https://www.terraform.io/docs/providers/azuredevops/
MIT License

azuredevops_serviceendpoint_azurerm - ReadResource request was cancelled - segfault #951

Open arne21a opened 8 months ago

arne21a commented 8 months ago

Community Note

Terraform (and Azure DevOps Provider) Version

Terraform v1.4.6 on linux_arm64 Provider versions: 0.1.0, 0.10.0, 0.11.0

Affected Resource(s)

Terraform Configuration Files

Sorry, our config is not easily shareable; this is at least the part that calls the crashing code. The full implementation of this module can be found here: https://github.com/arne21a/caf-terraform-landingzones/tree/feature/devops-upstream-contribution/caf_solution/add-ons/azure_devops_vmss

resource "azuredevops_serviceendpoint_azurerm" "azure" {
  depends_on = [module.caf]
  for_each   = try(var.azure_devops.service_endpoints, {})

  project_id            = data.azuredevops_project.project.id
  service_endpoint_name = each.value.endpoint_name

  credentials {
    serviceprincipalid  = local.combined.aad_apps[try(each.value.aad_app.lz_key, var.landingzone.key)][each.value.aad_app.key].azuread_application.application_id
    serviceprincipalkey = data.azurerm_key_vault_secret.client_secret[each.key].value
  }

  azurerm_spn_tenantid      = local.combined.aad_apps[try(each.value.aad_app.lz_key, var.landingzone.key)][each.value.aad_app.key].tenant_id
  azurerm_subscription_id   = try(each.value.subscription.id, data.azurerm_subscriptions.available[each.key].subscriptions[0].subscription_id)
  azurerm_subscription_name = each.value.subscription.name
}

Debug Output

...
2024-01-17T11:09:45.402Z [DEBUG] ReferenceTransformer: "azuredevops_serviceendpoint_azurerm.azure[\"caf-foundation\"]" references: []
azuredevops_serviceendpoint_azurerm.azure["caf-foundation"]: Refreshing state... [id=63443445-2a27-4a52-8bbd-256af80d1b96]
2024-01-17T11:09:46.076Z [ERROR] plugin.(*GRPCProvider).ReadResource: error="rpc error: code = Unavailable desc = error reading from server: EOF"
2024-01-17T11:09:46.076Z [ERROR] vertex "azuredevops_serviceendpoint_azurerm.azure[\"caf-foundation\"]" error: Plugin did not respond
2024-01-17T11:09:46.077Z [ERROR] vertex "azuredevops_serviceendpoint_azurerm.azure (expand)" error: Plugin did not respond
2024-01-17T11:09:46.162Z [INFO]  backend/local: plan operation completed

Panic Output

│ Error: Plugin did not respond
│
│   with azuredevops_serviceendpoint_azurerm.azure["caf-foundation"],
│   on azure-devops.tf line 29, in resource "azuredevops_serviceendpoint_azurerm" "azure":
│   29: resource "azuredevops_serviceendpoint_azurerm" "azure" {
│
│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ReadResource call. The plugin logs may contain more details.
╵

Stack trace from the terraform-provider-azuredevops_v0.10.0 plugin:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x780eb0]

goroutine 26 [running]:
github.com/microsoft/terraform-provider-azuredevops/azuredevops/internal/service/serviceendpoint.resourceServiceEndpointAzureRMRead(0x0?, {0x8343e0?, 0x40006fe6e0})
        github.com/microsoft/terraform-provider-azuredevops/azuredevops/internal/service/serviceendpoint/resource_serviceendpoint_azurerm.go:189 +0x80
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0xb2f480?, {0xb2f480?, 0x4000220ed0?}, 0xd?, {0x8343e0?, 0x40006fe6e0?})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/resource.go:712 +0x138
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0x400047ca80, {0xb2f480, 0x4000220ed0}, 0x4000484c30, {0x8343e0, 0x40006fe6e0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/resource.go:1015 +0x464
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0x4000446588, {0xb2f480?, 0x4000220db0?}, 0x400009c900)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/grpc_provider.go:613 +0x408
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0x400029cfa0, {0xb2f480?, 0x40002204e0?}, 0x400069a2a0)
        github.com/hashicorp/terraform-plugin-go@v0.14.0/tfprotov5/tf5server/server.go:748 +0x3d4
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x995020?, 0x400029cfa0}, {0xb2f480, 0x40002204e0}, 0x40001ac070, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.14.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0x40002b61e0, {0xb32140, 0x40003229c0}, 0x40002fc000, 0x4000491b30, 0x1064790, 0x0)
        google.golang.org/grpc@v1.56.3/server.go:1335 +0xc68
google.golang.org/grpc.(*Server).handleStream(0x40002b61e0, {0xb32140, 0x40003229c0}, 0x40002fc000, 0x0)
        google.golang.org/grpc@v1.56.3/server.go:1712 +0x854
google.golang.org/grpc.(*Server).serveStreams.func1.1()
        google.golang.org/grpc@v1.56.3/server.go:947 +0xb4
created by google.golang.org/grpc.(*Server).serveStreams.func1
        google.golang.org/grpc@v1.56.3/server.go:958 +0x184

Error: The terraform-provider-azuredevops_v0.10.0 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

Expected Behavior

No segfault

Actual Behavior

The plugin segfaults during refresh. The exact same config worked for years with version 0.1.0 of this provider, so an external change to the resource most likely triggers this behaviour. The crash also occurs with 0.10.0 and 0.11.0.

Steps to Reproduce

  1. terraform plan

Important Factoids

References

arne21a commented 8 months ago

After finding #918 I checked whether the service connection in question still exists. It is indeed gone, so that is the external trigger I mentioned earlier. I am not sure why it is gone; it might be part of the issue or the result of an unrelated problem.

Here is the log using version 0.11.0, which should, in my understanding, contain the fix for #918:

2024-01-17T11:30:19.829Z [DEBUG] ReferenceTransformer: "azuredevops_serviceendpoint_azurerm.azure[\"caf-foundation\"]" references: []
azuredevops_serviceendpoint_azurerm.azure["caf-foundation"]: Refreshing state... [id=63443445-2a27-4a52-8bbd-256af80d1b96]
2024-01-17T11:30:19.913Z [ERROR] plugin.(*GRPCProvider).ReadResource: error="rpc error: code = Unavailable desc = error reading from server: EOF"
2024-01-17T11:30:19.913Z [ERROR] vertex "azuredevops_serviceendpoint_azurerm.azure[\"caf-foundation\"]" error: Plugin did not respond
2024-01-17T11:30:19.913Z [ERROR] vertex "azuredevops_serviceendpoint_azurerm.azure (expand)" error: Plugin did not respond
2024-01-17T11:30:19.959Z [INFO]  backend/local: plan operation completed
╷
│ Error: Plugin did not respond
│
│   with azuredevops_serviceendpoint_azurerm.azure["caf-foundation"],
│   on azure-devops.tf line 29, in resource "azuredevops_serviceendpoint_azurerm" "azure":
│   29: resource "azuredevops_serviceendpoint_azurerm" "azure" {
│
│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ReadResource call. The plugin logs may contain more details.
╵

Stack trace from the terraform-provider-azuredevops_v0.11.0 plugin:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7874a0]

goroutine 52 [running]:
github.com/microsoft/terraform-provider-azuredevops/azuredevops/internal/service/serviceendpoint.resourceServiceEndpointAzureRMRead(0x0?, {0x836800?, 0x40003a98c0})
        github.com/microsoft/terraform-provider-azuredevops/azuredevops/internal/service/serviceendpoint/resource_serviceendpoint_azurerm.go:230 +0xc0
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0xb37a20?, {0xb37a20?, 0x40004783c0?}, 0xd?, {0x836800?, 0x40003a98c0?})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/resource.go:712 +0x138
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0x40003dc380, {0xb37a20, 0x40004783c0}, 0x40005991e0, {0x836800, 0x40003a98c0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/resource.go:1015 +0x464
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0x400013b338, {0xb37a20?, 0x40004782a0?}, 0x400021ee00)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.23.0/helper/schema/grpc_provider.go:613 +0x408
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0x4000360aa0, {0xb37a20?, 0x40002538f0?}, 0x400044e240)
        github.com/hashicorp/terraform-plugin-go@v0.14.0/tfprotov5/tf5server/server.go:748 +0x3d4
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x99b0c0?, 0x4000360aa0}, {0xb37a20, 0x40002538f0}, 0x400029a380, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.14.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0x40003b6000, {0xb3a6e0, 0x40004b4000}, 0x400053d8c0, 0x400041c5a0, 0x1074930, 0x0)
        google.golang.org/grpc@v1.56.3/server.go:1335 +0xc68
google.golang.org/grpc.(*Server).handleStream(0x40003b6000, {0xb3a6e0, 0x40004b4000}, 0x400053d8c0, 0x0)
        google.golang.org/grpc@v1.56.3/server.go:1712 +0x854
google.golang.org/grpc.(*Server).serveStreams.func1.1()
        google.golang.org/grpc@v1.56.3/server.go:947 +0xb4
created by google.golang.org/grpc.(*Server).serveStreams.func1
        google.golang.org/grpc@v1.56.3/server.go:958 +0x184

Error: The terraform-provider-azuredevops_v0.11.0 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.
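
For readers less familiar with Go, the following minimal sketch shows how a panic with this exact message arises. It makes no claim about what line 230 of resource_serviceendpoint_azurerm.go actually does; the `endpoint` type and `readName` function are hypothetical. Reading a field through a nil struct pointer faults at the field's offset, which on 64-bit platforms would be 0x8 for a second pointer-sized field.

```go
package main

import (
	"fmt"
)

// endpoint is a hypothetical stand-in for an API response struct.
// It is not the provider's real type.
type endpoint struct {
	ID   *string // offset 0 on 64-bit platforms
	Name *string // offset 8 on 64-bit platforms
}

// readName reads a field through ep without checking ep for nil, and
// converts the resulting runtime panic into an error so that the
// program can keep running and print it.
func readName(ep *endpoint) (name *string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return ep.Name, nil // panics when ep is nil
}

func main() {
	_, err := readName(nil)
	fmt.Println(err)
}
```

Without the deferred recover, the same access would terminate the process, which is what Terraform reports as "Plugin did not respond".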

arne21a commented 7 months ago

We found our "external trigger": the PAT we were using had lost the privileges to read service connections. I think the provider should not segfault in that case anyway. I built a short example to see the direct output of the Go API client for our case:

package main

import (
    "context"
    "log"

    "github.com/microsoft/azure-devops-go-api/azuredevops/v7"
    "github.com/microsoft/azure-devops-go-api/azuredevops/v7/serviceendpoint"
)

func main() {
    organizationUrl := "https://dev.azure.com/our-org" // todo: replace value with your organization url
    personalAccessToken := "xxx"                       // todo: replace value with your PAT
    //projectName := "other-test-project"
    projectName := "azure-foundation"
    includeFailed := true
    includeDetails := true

    // Create a connection to your organization
    connection := azuredevops.NewPatConnection(organizationUrl, personalAccessToken)

    ctx := context.Background()

    // Create a client to interact with the Service Endpoint area
    serviceEndpointClient, err := serviceendpoint.NewClient(ctx, connection)
    if err != nil {
        log.Fatal(err)
    }

    // List the service endpoints of the project, including failed ones
    endpointArgs := serviceendpoint.GetServiceEndpointsArgs{
        Project:        &projectName,
        IncludeFailed:  &includeFailed,
        IncludeDetails: &includeDetails,
    }
    seResponseValue, err := serviceEndpointClient.GetServiceEndpoints(ctx, endpointArgs)
    if err != nil {
        log.Fatal(err)
    }

    // Print the raw response, then the name of every endpoint that has one
    log.Println(seResponseValue)
    if seResponseValue != nil {
        index := 0
        for _, serviceEndpoint := range *seResponseValue {
            if serviceEndpoint.Name != nil {
                log.Printf("Name[%v] = %v", index, *serviceEndpoint.Name)
                index++
            }
        }
    }
}

Output when not authorised:

2024/01/18 15:07:08 &[]

Output when authorised: (had to create some test connections)

2024/01/18 16:02:35 &[{<nil> 0x1400018cbd0 0x140000e4070 0x14000186198 0x1400018cbc0 <nil> 63443445-2a27-4a52-8bbd-256af80d1b96 0x1400019cff1 0x1400019cff0 0x1400018cb20 <nil> 0x1400018cc30 <nil> 0x14000232e40 0x1400018cb30 0x1400018cb40} {<nil> 0x1400018ce90 0x140000e4150 0x140001861a8 0x1400018ce80 <nil> 26f69c7c-a13c-4d12-af1f-c7e538e69ce3 0x1400019d161 0x1400019d160 0x1400018cde0 <nil> 0x1400018ced0 <nil> 0x14000233080 0x1400018cdf0 0x1400018ce00} {<nil> 0x1400018d040 0x140000e41c0 0x140001861b8 0x1400018d030 <nil> 25733a5a-2e93-4689-bda9-b5f3e7eae3c3 0x1400019d259 0x1400019d258 0x1400018cf90 <nil> 0x1400018d0a0 <nil> 0x14000233170 0x1400018cfa0 0x1400018cfb0}]
2024/01/18 16:02:35 Name[0] = caf-foundation - backuo
2024/01/18 16:02:35 Name[1] = asddas
2024/01/18 16:02:35 Name[2] = caf-foundation

So if no connections are returned, the response is not nil; it is a pointer to an empty slice. That might be a problem somewhere in the implementation.
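
A minimal sketch of the kind of guard I have in mind is below. The names `endpoint`, `errNotFound` and `firstEndpoint` are hypothetical and only stand in for the real types; whether the provider's read path actually hits this exact case is an assumption I have not confirmed. The point is that a nil check alone does not cover the `&[]` response shown above, so the length has to be checked as well.

```go
package main

import (
	"errors"
	"fmt"
)

// endpoint is a hypothetical stand-in for serviceendpoint.ServiceEndpoint.
type endpoint struct {
	Name *string
}

// errNotFound is a hypothetical sentinel for "nothing usable came back",
// for example because the token may no longer read service connections.
var errNotFound = errors.New("service endpoint not found or not visible to this token")

// firstEndpoint returns the first element of a *[]endpoint response and
// treats a nil pointer and an empty slice the same way.
func firstEndpoint(resp *[]endpoint) (*endpoint, error) {
	if resp == nil || len(*resp) == 0 {
		return nil, errNotFound
	}
	return &(*resp)[0], nil
}

func main() {
	empty := []endpoint{}
	name := "caf-foundation"
	one := []endpoint{{Name: &name}}

	// Try a nil response, an empty response and a populated response.
	for _, resp := range []*[]endpoint{nil, &empty, &one} {
		ep, err := firstEndpoint(resp)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("found:", *ep.Name)
	}
}
```

With a guard like this the caller gets an ordinary error for the unauthorised case and can decide what to do with it, instead of crashing.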

Sorry if my analysis is completely misled; I had never used Go before and had to do some guessing and use ChatGPT ;) So please take no offence if I am blaming the wrong thing.