Open jeremmfr opened 1 year ago
Hi @jeremmfr 👋 Thank you for reporting this issue and sorry for the trouble here. I just wanted to acknowledge that the maintainers have noticed this bug report and do plan on taking a further look into this, but that effort will probably happen early next week. In the meantime, if you have any familiarity with pprof and flamegraph profiling, that might be a good methodology for starting to narrow down where the unexpected extra time is being spent.
Hi,
I've run profiling with pprof, using the provider `hashicups`, the attribute `items` with type `schema.SetNestedBlock`, and a resource `hashicups_order` with 1000 dynamic `items` elements.
I've also updated all the dependencies for the provider `hashicups`:
require (
github.com/hashicorp-demoapp/hashicups-client-go v0.1.0
- github.com/hashicorp/terraform-plugin-docs v0.14.1
- github.com/hashicorp/terraform-plugin-framework v1.2.0
- github.com/hashicorp/terraform-plugin-go v0.15.0
- github.com/hashicorp/terraform-plugin-log v0.8.0
- github.com/hashicorp/terraform-plugin-testing v1.2.0
+ github.com/hashicorp/terraform-plugin-docs v0.15.0
+ github.com/hashicorp/terraform-plugin-framework v1.3.1
+ github.com/hashicorp/terraform-plugin-go v0.16.0
+ github.com/hashicorp/terraform-plugin-log v0.9.0
+ github.com/hashicorp/terraform-plugin-testing v1.3.0
)
I attach a PNG and a flamegraph (flamegraph in gist cpu.svg).
A profiling result for a complete call of `PlanResourceChange` and `ValidateResourceConfig`, instead of a 30-second dump (flamegraph in gist cpu2.svg).
A profiling result for a complete call of `ValidateResourceConfig` (~7 seconds) (flamegraph in gist cpu3.svg).
Hi @jeremmfr 👋 Would you be able to try out https://github.com/hashicorp/terraform-plugin-go/pull/308? From my local testing with the hashicups-pf example, it reduced the plan time by over half. I'm curious what level of real world benefit you might see as well.
Hi @bflad :wave:
We ran multiple scenarios to compare the potential benefit.
The PR reduces the plan time, but there is still a gap between the SDKv2 and framework versions of the provider.
The result:
| in seconds | SDKv2 | Framework | Framework with terraform-plugin-go#308 |
|---|---|---|---|
| hashicups_order with 1000 items block | ~ 3.8 | ~ 48 | ~ 14 |
| junos_security_address_book with 1000 network_address block | ~ 1.3 | ~ 78 | ~ 17 |
| junos_security_address_book with 1000 address_set block | ~ 3.2 | ~ 198 | ~ 45 |
The difference between the `network_address` block and the `address_set` block is that the `network_address` block has only primitive attributes (String), while the `address_set` block has primitive attributes (String) and primitive set attributes (String sets).
I ran a new profiling session with pprof on the last scenario (`junos_security_address_book` with 1000 `address_set` blocks, and the framework with terraform-plugin-go#308).
The result is available as a PNG and a flamegraph (in gist).
Thanks for the confirmation and the additional context, @jeremmfr. 😄 Given the overall very positive effects, I've pulled in the upstream changes and will cut a release of those shortly.
I'm guessing there are two threads to pull on next with the framework side of things:
One other reality is that the framework went GA with its data handling half in terraform-plugin-go and half in terraform-plugin-framework. There is a lot of passing back and forth between the type systems in the `PlanResourceChange` logic. Ideally, everything would have been done in terraform-plugin-framework and only using terraform-plugin-go's type system for the request/response handling. It's unfortunately very difficult to change that until a major version of the framework to remove the directly exposed terraform-plugin-go types in certain places.
@jeremmfr if you have a moment, I'm curious if https://github.com/hashicorp/terraform-plugin-framework/commit/9fe2ca5dcf0da75f85d08930c2480d82b49ed1ae (which should include terraform-plugin-go#308) offers some slight benefits with your provider. I'm guessing it'll be fairly minimal, but it was low hanging fruit to prevent unnecessary memory allocation and cleanup work for the runtime.
Hi @bflad , I ran the same previous scenarios with the PR #791, but the difference is not visible on the Plan time.
@bflad Any update on this issue? Or any suggestion on workaround?
Module version
Description
After migrating resources of my provider jeremmfr/junos from the SDK plugin to this framework plugin, the time to run a `terraform plan` with a config that has many block sets has increased very significantly.
For 1000 items of a block set, a `terraform plan` (with empty state) takes ~1.3 seconds with the provider including module `sdk/v2` and ~70 seconds with the provider including `plugin-framework`. A user of my provider needs 19 minutes to run `terraform plan` with the new version of the provider including `plugin-framework`, instead of 52 seconds with the previous version of the provider including module `sdk/v2`.
Tests
I tested removing all `PlanModifiers` and `Validators` fields on each attribute, and the `ValidateConfig` function for resources, using the latest version of this module (v1.3.1), but there's no change. The problem disappears if I replace `schema.SetNestedBlock` with `schema.ListNestedBlock`: 70 seconds to 1 second.
I reproduced the problem with the provider `hashicups` and `hashicups_order` resources. I tested 1000 `items` blocks on a `hashicups_order` resource with different modifications of the `items` block.
| Provider | Schema type | Plan time (seconds) |
|---|---|---|
| hashicorp/terraform-provider-hashicups | schema.TypeSet | 3.8 - 4 |
| hashicorp/terraform-provider-hashicups | schema.TypeList | 0.5 - 0.8 |
| hashicorp/terraform-provider-hashicups-pf | schema.SetNestedBlock | 49.1 - 49.7 |
| hashicorp/terraform-provider-hashicups-pf | schema.ListNestedBlock | 0.8 - 1.1 |
| hashicorp/terraform-provider-hashicups-pf | schema.SetNestedAttribute | 49.0 - 49.4 |
| hashicorp/terraform-provider-hashicups-pf | schema.ListNestedAttribute | 0.7 - 0.9 |
Relevant provider source code
To test with block sets, clone hashicorp/terraform-provider-hashicups-pf and apply this patch:
& disable `hashicups.NewClient` code in the provider configure function
To test with nested attribute sets, clone hashicorp/terraform-provider-hashicups-pf and apply this patch:
& disable `hashicups.NewClient` code in the provider configure function
Terraform Configuration Files
To test with block sets:
To test with nested attribute sets:
Expected Behavior
The `terraform plan` command takes a reasonable time to run when there are many block sets.
Actual Behavior
The `terraform plan` command takes a very long time to run when there are many block sets.
Steps to Reproduce
terraform init
terraform plan
References
jeremmfr/terraform-provider-junos#498