fivetran / terraform-provider-fivetran

Terraform Provider for Fivetran
https://fivetran.com
Apache License 2.0

fivetran_connector_schema_config resource unusable (Very slow) for schemas with a lot of tables #263

Closed jossehuybrechts closed 5 months ago

jossehuybrechts commented 7 months ago

Describe the bug
When planning and applying a schema_config for a connector with, for example, 180 tables to enable, it takes roughly half an hour just to plan this resource. That makes it impractical to use while developing any code.

To Reproduce
Run terraform plan on a resource with 180 tables enabled (nothing specified at column level) in the schema of the connector (we are using the Oracle HVA connector).

Expected behavior
I assumed it would take only a couple of seconds to plan and apply, as it translates directly to a single API call.

Logs & Output
2024-02-19T09:26:10.070+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: Marking Computed attributes with null configuration values as unknown (known after apply) in the plan to prevent potential Terraform errors: @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:195 @module=sdk.framework tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_req_id=e89599a0-08f1-a3df-4877-9fcd91e18067 tf_resource_type=fivetran_connector_schema_config tf_mux_provider="proto6server.Server" tf_rpc=PlanResourceChange timestamp="2024-02-19T09:26:10.070+0100"
2024-02-19T09:26:10.070+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: marking computed attribute that is null in the config as unknown: @module=sdk.framework tf_attribute_path="AttributeName(\"id\")" tf_mux_provider="proto6server.Server" tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_req_id=e89599a0-08f1-a3df-4877-9fcd91e18067 @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:399 tf_resource_type=fivetran_connector_schema_config tf_rpc=PlanResourceChange timestamp="2024-02-19T09:26:10.070+0100"
2024-02-19T09:26:55.530+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: Marking Computed attributes with null configuration values as unknown (known after apply) in the plan to prevent potential Terraform errors: @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:195 tf_mux_provider="proto6server.Server" tf_resource_type=fivetran_connector_schema_config @module=sdk.framework tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_req_id=10740c92-6060-443a-340b-4b4ed863028e tf_rpc=PlanResourceChange timestamp="2024-02-19T09:26:55.530+0100"
2024-02-19T09:26:55.530+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: marking computed attribute that is null in the config as unknown: @module=sdk.framework tf_attribute_path="AttributeName(\"id\")" tf_mux_provider="proto6server.Server" tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_req_id=10740c92-6060-443a-340b-4b4ed863028e tf_rpc=PlanResourceChange @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:399 tf_resource_type=fivetran_connector_schema_config timestamp="2024-02-19T09:26:55.530+0100"
2024-02-19T09:26:55.992+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: Marking Computed attributes with null configuration values as unknown (known after apply) in the plan to prevent potential Terraform errors: @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:195 tf_req_id=f654132a-46e4-380c-b14e-f0a7526d5df7 tf_resource_type=fivetran_connector_schema_config tf_rpc=PlanResourceChange @module=sdk.framework tf_mux_provider="proto6server.Server" tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran timestamp="2024-02-19T09:26:55.992+0100"
2024-02-19T09:26:55.992+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: marking computed attribute that is null in the config as unknown: tf_attribute_path="AttributeName(\"id\")" tf_mux_provider="proto6server.Server" tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_req_id=f654132a-46e4-380c-b14e-f0a7526d5df7 tf_resource_type=fivetran_connector_schema_config tf_rpc=PlanResourceChange @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:399 @module=sdk.framework timestamp="2024-02-19T09:26:55.992+0100"
2024-02-19T09:27:34.798+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: Marking Computed attributes with null configuration values as unknown (known after apply) in the plan to prevent potential Terraform errors: @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:195 tf_mux_provider="proto6server.Server" tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_resource_type=fivetran_connector_schema_config @module=sdk.framework tf_req_id=a65573c9-0050-30a7-6b54-12b6ee05f135 tf_rpc=PlanResourceChange timestamp="2024-02-19T09:27:34.798+0100"
2024-02-19T09:27:34.798+0100 [DEBUG] provider.terraform-provider-fivetran_v1.1.13: marking computed attribute that is null in the config as unknown: @caller=github.com/hashicorp/terraform-plugin-framework@v1.4.2/internal/fwserver/server_planresourcechange.go:399 tf_attribute_path="AttributeName(\"id\")" tf_mux_provider="proto6server.Server" tf_req_id=a65573c9-0050-30a7-6b54-12b6ee05f135 @module=sdk.framework tf_provider_addr=registry.terraform.io/providers/fivetran/fivetran tf_resource_type=fivetran_connector_schema_config tf_rpc=PlanResourceChange timestamp="2024-02-19T09:27:34.798+0100"

(and still going 25 minutes later without new logs)

Plugin version: 1.1.13

beevital commented 7 months ago

@jossehuybrechts unfortunately we have to reload the schema config for the source before applying any settings. This is a weakness of the Fivetran API. If the source has a lot of schemas/tables, the reload may take a long time, even if you only need to enable a small part of them. We are currently working on a solution that makes it possible to apply schema settings without a full schema reload, so it may solve the problem.

beevital commented 6 months ago

@jossehuybrechts could you please try the latest version, 1.1.17, and report on performance in your case?

jossehuybrechts commented 6 months ago

Hi @beevital, I tried the newer version, but performance didn't change. It still takes a very long time even to plan a schema config block.

beevital commented 6 months ago

To even plan... hmm. That's very strange: there are no API calls at all during plan. Could you please provide an example of your config?

beevital commented 6 months ago

"Run terraform plan on a resource with 180 tables enabled"

I see it now. I'll try to reproduce this in a test environment.

beevital commented 6 months ago

But just in case, please share an example of your config.

jossehuybrechts commented 6 months ago

Hi, the configuration looks like this. We loop over our connectors, then over their schemas, tables and columns.

    resource "fivetran_connector_schema_config" "connector_schema_config" {
      for_each = { for name, con in var.connectors : name => con.schema_config if con.schema_config != null }

      connector_id           = fivetran_connector.connector[each.key].id
      schema_change_handling = each.value.schema_change_handling

      dynamic "schema" {
        for_each = each.value.schemas
        content {
          name    = schema.key
          enabled = schema.value.enabled

          dynamic "table" {
            for_each = schema.value.tables
            content {
              name      = table.key
              enabled   = table.value.enabled
              sync_mode = var.destination.service == "aws_msk_wh" ? null : table.value.sync_mode

              dynamic "column" {
                for_each = table.value.columns
                content {
                  name    = column.key
                  hashed  = column.value.hashed
                  enabled = column.value.enabled
                }
              }
            }
          }
        }
      }
    }

beevital commented 6 months ago

"Where we loop over our connectors"

So you have multiple resources in one plan? How many connectors do you have? I'm interested in the order of magnitude: ~10, ~100, ~1000? I need it to prepare an example config with the same number of resources.

jossehuybrechts commented 6 months ago

I think in this plan we have 6 connectors. When we disable the one schema config for the big connector with 180 tables, the plan finishes in a couple of seconds. With that one extra big schema config resource, it takes more than 15 minutes.

beevital commented 6 months ago

"but when disabling one schema config for the big connector with 180 tables the plan happens in a couple of seconds."

Hm, yep, it looks like the number of connectors doesn't matter. Thank you. Just one more question: could you provide the source of the schema config for this particular connector with 180 tables?

beevital commented 6 months ago

I need to know: are these tables in a single schema or in different ones?

jossehuybrechts commented 6 months ago

They are in a single schema

beevital commented 6 months ago

Okay, I'll try to reproduce it today. Thank you for the quick feedback!

beevital commented 6 months ago

UPD: Reproduced the issue in a test env. It looks like a deadlock at the plan stage.

beevital commented 6 months ago

The problem is that we store tables as a set. The framework (terraform-plugin-framework) calls "deepEquals" to compare each new table element in the plan with each existing item in the set. So building a set of n tables takes at least O(n^2) comparisons, and deep equals is itself a slow operation. I will try to optimise it with hashing...

A quick solution might be to use a list instead of a set, but then reordering tables in the config will cause unwanted changes (drift). Also, we can't guarantee the order of tables in the API response... In older versions of the provider (based on terraform-plugin-sdk) we used hash functions to speed up handling, but in the current terraform-plugin-framework there is no such option anymore.

Tested the approach with ListNestedValue: it works much faster, but fails because of reordered elements.

It looks like the only way is to use the CustomTypes approach here. The implementation will take time...

jossehuybrechts commented 6 months ago

Hi @beevital, any update on this?

beevital commented 6 months ago

Hey @jossehuybrechts ! Yes, I've prepared an alternative solution. You'll have to update your .tf configuration, but it should solve the issues.

beevital commented 6 months ago

@jossehuybrechts try out v1.1.18. You'll need to update your configuration to use the schemas field instead of schema. The new field is based on a Map; refer to the example in the docs.

Also you can use schemas_json : https://registry.terraform.io/providers/fivetran/fivetran/latest/docs/guides/schema_json
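
For readers making the same migration, the dynamic-block configuration shared earlier in this thread might translate to the map-based schemas field roughly as follows. This is only a sketch and is not copied from the provider docs: it assumes that the nested attribute names (enabled, tables, sync_mode, columns, hashed) mirror the old block arguments, and that var.connectors and var.destination keep the shape used in the original config. Check the resource documentation for your provider version before relying on it.

```hcl
resource "fivetran_connector_schema_config" "connector_schema_config" {
  for_each = { for name, con in var.connectors : name => con.schema_config if con.schema_config != null }

  connector_id           = fivetran_connector.connector[each.key].id
  schema_change_handling = each.value.schema_change_handling

  # Assumption: `schemas` is a map keyed by schema name that replaces the
  # repeated `schema` blocks; nested names mirror the old block arguments.
  schemas = {
    for schema_name, schema in each.value.schemas : schema_name => {
      enabled = schema.enabled

      # Tables are keyed by table name instead of being `table` blocks.
      tables = {
        for table_name, table in schema.tables : table_name => {
          enabled   = table.enabled
          sync_mode = var.destination.service == "aws_msk_wh" ? null : table.sync_mode

          # Columns are keyed by column name instead of being `column` blocks.
          columns = {
            for column_name, column in table.columns : column_name => {
              enabled = column.enabled
              hashed  = column.hashed
            }
          }
        }
      }
    }
  }
}
```

The for expressions replace the three nested dynamic blocks, so schema, table and column names become map keys rather than name arguments. Keyed elements can be matched by name, which is presumably why this form avoids the pairwise set comparisons described earlier in the thread.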

jossehuybrechts commented 6 months ago

Hi @beevital, with the new version and the schemas field it works very well! (I didn't manage to get schemas_json working, but that is not important for us as the schemas field is working.) Thanks for fixing this.

JordyHeusdensDT commented 5 months ago

Hi @beevital

Josse is OOO. I was adding some new connectors and got the following error.


│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to
│ module.ingestion_fivetran_destination["aws_msk_tst"].fivetran_connector_schema_config.connector_schema_config["duott_job"], provider
│ "provider[\"registry.terraform.io/fivetran/fivetran\"]" produced an unexpected new value: .schemas: new element "ORDS_METADATA" has
│ appeared.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to
│ module.ingestion_fivetran_destination["aws_msk_tst"].fivetran_connector_schema_config.connector_schema_config["duott_job"], provider
│ "provider[\"registry.terraform.io/fivetran/fivetran\"]" produced an unexpected new value: .schemas: new element "TOAD" has appeared.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to
│ module.ingestion_fivetran_destination["aws_msk_tst"].fivetran_connector_schema_config.connector_schema_config["duott_job"], provider
│ "provider[\"registry.terraform.io/fivetran/fivetran\"]" produced an unexpected new value: .schemas: new element "UTILS_DAO" has appeared.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

Each new element is a schema in our Oracle source. We wanted to add one table from a schema (which is not in this list). For some reason all the schemas appeared, and now the apply fails when it tries to change the config again. Oddly, we only get this with this connector and not with the other ones.

JordyHeusdensDT commented 5 months ago

@beevital in case you missed this

beevital commented 5 months ago

Hi @JordyHeusdensDT , I've just released a fix for this issue in v1.1.22.

JordyHeusdensDT commented 5 months ago

Hi @beevital, thanks, it does indeed solve our issue! Thanks for the quick fix.