Closed mattBaumBeneva closed 11 months ago
This bug is still present in provider version 1.47.0. Tested today.
@Dynatrace-Reinhard-Pilz or @kishikawa12 , any thoughts on the cause?
I will look into that before the next release.
@Dynatrace-Reinhard-Pilz , very much appreciated.
Hello @mattBaumBeneva,
The upcoming release will contain the necessary routines that allow us to capture HTTP traffic for the resource dynatrace_automation_workflow. In order to capture that information you will have to set these environment variables:
DYNATRACE_LOG_HTTP = terraform-provider-dynatrace.http.log
DYNATRACE_HTTP_RESPONSE = true
With these environment variables you will get the complete HTTP traffic - including the negotiation with sso.dynatrace.com - dumped into a file named terraform-provider-dynatrace.http.log.
You mentioned initially that you're able to create the Workflow using Postman. Are you using Postman on the same host that executes Terraform for you? We've had issues in the past where Terraform was executed within a Jenkins pipeline, and networking limitations on the Jenkins workers interfered with OAuth2, which requires reaching sso.dynatrace.com in addition to the hosts that serve the Dynatrace Environment.
The firewall issue to which you are referring was at our company, so it should be fine now (we apply other resources with the OAuth2 flow). I will perform tests once the new version is released and provide the logs.
@Dynatrace-Reinhard-Pilz , we cannot actually test the fix. We are blocked by: https://github.com/dynatrace-oss/terraform-provider-dynatrace/issues/366
@Dynatrace-Reinhard-Pilz , I was able to test. I no longer see a connection reset, but the debugging log indicates the plugin is crashing (panic) on a Segfault:
12:03:57 │ Error: Plugin did not respond
12:03:57 │
12:03:57 │ with dynatrace_automation_workflow.easyTravel_guardian_Validation_terraform,
12:03:57 │ on testWorkflow.tf line 11, in resource "dynatrace_automation_workflow" "easyTravel_guardian_Validation_terraform":
12:03:57 │ 11: resource "dynatrace_automation_workflow" "easyTravel_guardian_Validation_terraform" {
12:03:57 │
12:03:57 │ The plugin encountered an error, and failed to respond to the
12:03:57 │ plugin.(*GRPCProvider).ApplyResourceChange call. The plugin logs may
12:03:57 │ contain more details.
12:03:57 Stack trace from the terraform-provider-dynatrace_v1.47.2 plugin:
12:03:57
12:03:57 panic: runtime error: invalid memory address or nil pointer dereference
12:03:57 [signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x93b1a3]
12:03:57
12:03:57 goroutine 191 [running]:
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/api/automation/workflows.(*MyRoundTripper).RoundTrip(0xc000d12cf0, 0xc000df0600)
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/api/automation/workflows/service.go:65 +0x3a3
12:03:57 golang.org/x/oauth2.(*Transport).RoundTrip(0xc000c90c40, 0xc001195200)
12:03:57 golang.org/x/oauth2@v0.11.0/transport.go:55 +0x3ea
12:03:57 net/http.send(0xc001195200, {0x167bfa0, 0xc000c90c40}, {0x1422100?, 0x1?, 0x0?})
12:03:57 net/http/client.go:251 +0x5f7
12:03:57 net/http.(*Client).send(0xc000d359e0, 0xc001195200, {0xc000d0cc00?, 0x1000000010b6200?, 0x0?})
12:03:57 net/http/client.go:175 +0x9b
12:03:57 net/http.(*Client).do(0xc000d359e0, 0xc001195200)
12:03:57 net/http/client.go:715 +0x8fc
12:03:57 net/http.(*Client).Do(...)
12:03:57 net/http/client.go:581
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest.executeRequest.func1()
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest/request.go:133 +0x85
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest.executeWithRateLimiter(0xc000c5c730)
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest/request.go:164 +0x66
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest.executeRequest(0xc000d359e0, 0xc001195200)
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest/request.go:132 +0x134
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest.Post(0x0?, {0xc000956140, 0x44}, {0xc0003c1500, 0x2c5, 0x300})
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/rest/request.go:80 +0x173
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/client/automation.Client.INSERT({{0xc00004cb70?, 0xc00020b880?}, 0xc000d359e0?, 0xc000286630?}, 0xd?, {0xc0003c1500, 0x2c5, 0x300})
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/monaco/pkg/client/automation/client.go:153 +0x12d
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/api/automation/workflows.(*service).Create(0x40a5b3?, 0xc00020b880)
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/api/automation/workflows/service.go:132 +0x7e
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/settings.(*GenericCRUDService[...]).CreateWithContext(0xc000d126d0, {0x16b2a10, 0xc000bdaf60}, {0x16a5ec0, 0xc00020b880?})
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/dynatrace/settings/generic_crud_service.go:60 +0xbd
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/resources.(*Generic).Create(0xc00071baa0, {0x16b2a10, 0xc000bdaf60}, 0xc000be6900, {0x109d960, 0xc0005028c0})
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/resources/generic.go:181 +0x270
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/provider/logging.Enable.func1({0x16b2a10, 0xc000bdaf60}, 0x0?, {0x109d960, 0xc0005028c0})
12:03:57 github.com/dynatrace-oss/terraform-provider-dynatrace/provider/logging/logging.go:103 +0x83
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).create(0xc0007d7340, {0x16b2a48, 0xc000c77230}, 0xd?, {0x109d960, 0xc0005028c0})
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2@v2.25.0/helper/schema/resource.go:707 +0x12e
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).Apply(0xc0007d7340, {0x16b2a48, 0xc000c77230}, 0xc000d8a9c0, 0xc000199100, {0x109d960, 0xc0005028c0})
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2@v2.25.0/helper/schema/resource.go:837 +0xa85
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ApplyResourceChange(0xc000010210, {0x16b2a48?, 0xc000c77110?}, 0xc00012f900)
12:03:57 github.com/hashicorp/terraform-plugin-sdk/v2@v2.25.0/helper/schema/grpc_provider.go:1021 +0xe8d
12:03:57 github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ApplyResourceChange(0xc000000e60, {0x16b2a48?, 0xc000c768d0?}, 0xc0002e6d90)
12:03:57 github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/tf5server/server.go:818 +0x574
12:03:57 github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ApplyResourceChange_Handler({0x13fd960?, 0xc000000e60}, {0x16b2a48, 0xc000c768d0}, 0xc0002e6d20, 0x0)
12:03:57 github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:385 +0x170
12:03:57 google.golang.org/grpc.(*Server).processUnaryRPC(0xc0004ea000, {0x16d6c80, 0xc0007216c0}, 0xc000c607e0, 0xc0007def00, 0x1fbd180, 0x0)
12:03:57 google.golang.org/grpc@v1.56.3/server.go:1335 +0xdf0
12:03:57 google.golang.org/grpc.(*Server).handleStream(0xc0004ea000, {0x16d6c80, 0xc0007216c0}, 0xc000c607e0, 0x0)
12:03:57 google.golang.org/grpc@v1.56.3/server.go:1712 +0xa2f
12:03:57 google.golang.org/grpc.(*Server).serveStreams.func1.1()
12:03:57 google.golang.org/grpc@v1.56.3/server.go:947 +0xca
12:03:57 created by google.golang.org/grpc.(*Server).serveStreams.func1
12:03:57 google.golang.org/grpc@v1.56.3/server.go:958 +0x15c
12:03:57
12:03:57 Error: The terraform-provider-dynatrace_v1.47.2 plugin crashed!
12:03:57
12:03:57 This is always indicative of a bug within the plugin. It would be immensely
12:03:57 helpful if you could report the crash with the plugin's maintainers so that it
12:03:57 can be fixed. The output above should help diagnose the issue.
12:03:57
Ok, the good news about this error: It tells me already that the HTTP request never was made - instead an error got thrown. The bad news about it: The debugging code wasn't prepared for a situation where no HTTP Response is present - hence the plugin crash.
I will have to push out yet another code change in order to deal with that.
Odds are, however, that the error message the provider is getting back here is nothing other than the original read tcp 10.0.2.100:35668->23.22.184.182:443: read: connection reset by peer.
If time allows, I will create v1.47.3 today so you're able to run another test with the environment variables.
Hello @mattBaumBeneva, v1.47.3 has been released. Whenever you have time, please re-run with the environment variables. The provider shouldn't crash in this case anymore.
@Dynatrace-Reinhard-Pilz , I have emailed the entire log to your Dynatrace address (copied from our Jenkins output log).
Thanks a lot for the logs. I believe I found something that's worth investigating:
In our earlier ticket, where fetching the OAuth Bearer Token failed, the root cause was, if I'm not mistaken, that the Jenkins workers were not able to reach sso.dynatrace.com. Once that was opened up, things went back to normal.
With the resource dynatrace_automation_workflow yet another host name comes into play. As opposed to most other resources, which require connecting to https://<tenantid>.live.dynatrace.com, addressing the REST API for Gen3 functionality requires Terraform to connect to https://<tenantid>.apps.dynatrace.com.
The logs are unfortunately cluttered with HTTP traffic that originates from various other resources, so it's not easy to spot. But with a bit of automated cross-matching of expected response content it became clear that the only requests for which no response content can be found in the logs are the ones addressing https://<tenantid>.apps.dynatrace.com (which in your case are just the workflows at the moment).
Can you check with your networking team about that theory?
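To help with that check, here is a minimal sketch that probes the three hosts mentioned in this thread from the machine that runs Terraform (for example a Jenkins worker). The function names requiredHosts and reachable are made up for illustration, and the tenant ID has to be passed as an argument. It only tests whether a TCP connection to port 443 can be opened, which is an assumption about what is sufficient.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// requiredHosts lists the hosts named in this thread for a given tenant ID.
func requiredHosts(tenantID string) []string {
	return []string{
		"sso.dynatrace.com",              // OAuth2 token negotiation
		tenantID + ".live.dynatrace.com", // most resources
		tenantID + ".apps.dynatrace.com", // Gen3 functionality such as workflows
	}
}

// reachable returns nil if a TCP connection to host:443 can be opened.
func reachable(host string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(host, "443"), timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkhosts <tenantid>")
		return
	}
	for _, host := range requiredHosts(os.Args[1]) {
		if err := reachable(host, 5*time.Second); err != nil {
			fmt.Printf("FAIL %s: %v\n", host, err)
			continue
		}
		fmt.Printf("ok   %s\n", host)
	}
}
```

A successful TCP connect does not prove the firewall lets the full HTTPS exchange through, since a connection can still be reset later on read, so a FAIL is more informative than an ok.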
I was able to confirm that those flows don't pass our firewall. I have requested that they be allowed and I will respond here once I have been able to test. If possible, I think the provider should catch such errors and report them to the user together with the failing URL (for future customers who may run into this issue).
@Dynatrace-Reinhard-Pilz , we have allowed these connections to pass through our firewall. This corrected the issue and we have successfully applied the test workflow above. I am closing this issue as resolved. Thanks!
That's good news. And yes, I'm already working on recognizing that situation automatically so it doesn't require debug logs to get turned on.
Describe the bug
We are attempting to create Automation Workflows via Terraform. We have added the required scopes to our OAuth client but we obtain the following network error when applying:
We have simplified the Workflow used for testing, as our real workflow is complex:
When applying the resulting JSON via the REST API with Postman:
{"isPrivate":true,"schemaVersion":3,"tasks":{"run_validation_easy_travel":{"action":"dynatrace.site.reliability.guardian:validate-guardian-action","active":false,"concurrency":null,"description":"Automation action to start a Site Reliability Guardian validation","input":{"executionId":"{{ execution().id }}","objectId":"vu9U3hXa3q0AAAABADFhcHA6ZHluYXRyYWNlLnNpdGUucmVsaWFiaWxpdHkuZ3VhcmRpYW46Z3VhcmRpYW5zAAZ0ZW5hbnQABnRlbmFudAAkY2RmYzAzNDEtZjhjNC0zNDZkLTkxZmUtMDZmZmY4NTcxZThhvu9U3hXa3q0","timeframeInputType":"timeframeSelector","timeframeSelector":{"from":"now-7d","to":"now"}},"name":"run_validation_easy_travel","position":{"x":2,"y":1},"timeout":900}},"title":"easyTravel guardian Validation Terraform"}
We obtain an HTTP 201, as expected:
Expected behavior
The provider should correctly create the Workflow. API-level errors should be caught and reported clearly.
Additional context
Provider v1.45.0.