confluentinc / terraform-provider-confluent

`rest_endpoint` is not (correctly) imported when importing a migrated `confluent_kafka_mirror_topic` into a `confluent_kafka_topic` #353

Closed FireDrunk closed 4 months ago

FireDrunk commented 4 months ago

We are testing out the full migration process from on-premises Kafka using Cluster Linking. This configuration is managed by Terraform, and we migrated the following resource:

resource "confluent_kafka_mirror_topic" "topic" {
  source_kafka_topic {
    topic_name = var.topic_info.mirror_info.source_topic_name
  }

  cluster_link {
    link_name = local.effective_cluster_link
  }

  kafka_cluster {
    id            = data.confluent_kafka_cluster.shared-cluster.id
    rest_endpoint = data.confluent_kafka_cluster.shared-cluster.rest_endpoint

    credentials {
      key    = data.azurerm_key_vault_secret.manager_api_key_id.value
      secret = data.azurerm_key_vault_secret.manager_api_key_secret.value
    }
  }

  mirror_topic_name = var.topic_info.name

  status = var.topic_info.mirror_info.status

  lifecycle {
    prevent_destroy = true
  }
}

to this resource:

resource "confluent_kafka_topic" "topic" {
  kafka_cluster {
    id = data.confluent_kafka_cluster.shared-cluster.id
  }

  topic_name       = var.topic_info.name
  partitions_count = var.topic_info.partitions_count
  rest_endpoint    = data.confluent_kafka_cluster.shared-cluster.rest_endpoint

  config = {
    "cleanup.policy"                      = var.topic_info.config.cleanup_policy
    "delete.retention.ms"                 = var.topic_info.config.delete_retention_days * 86400000
    "max.compaction.lag.ms"               = var.topic_info.config.max_compaction_lag_ms
    "max.message.bytes"                   = var.topic_info.config.max_message_bytes
    "message.timestamp.difference.max.ms" = var.topic_info.config.message_timestamp_difference_max_ms
    "message.timestamp.type"              = var.topic_info.config.message_timestamp_type
    "min.compaction.lag.ms"               = var.topic_info.config.min_compaction_lag_ms
    "min.insync.replicas"                 = var.topic_info.config.min_insync_replicas
    "retention.bytes"                     = var.topic_info.config.retention_bytes
    "retention.ms"                        = var.topic_info.config.retention_days * 86400000
    "segment.bytes"                       = var.topic_info.config.segment_bytes
    "segment.ms"                          = var.topic_info.config.segment_ms
  }

  credentials {
    key    = data.azurerm_key_vault_secret.manager_api_key_id.value
    secret = data.azurerm_key_vault_secret.manager_api_key_secret.value
  }

  lifecycle {
    prevent_destroy = true
  }
}

by importing it manually. The import itself seems to work fine (when the correct env vars are set). The state of the object after import:

resource "confluent_kafka_topic" "topic" {
    config           = {
        "cleanup.policy"                      = "delete"
        "delete.retention.ms"                 = "86400000"
        "max.compaction.lag.ms"               = "9223372036854775807"
        "max.message.bytes"                   = "1000000"
        "message.timestamp.difference.max.ms" = "9223372036854775807"
        "message.timestamp.type"              = "CreateTime"
        "min.compaction.lag.ms"               = "0"
        "retention.bytes"                     = "-1"
        "retention.ms"                        = "604800000"
    }
    id               = "lkc-<id>/dummy-topic"
    partitions_count = 8
    topic_name       = "dummy-topic"

    kafka_cluster {
        id = "lkc-<id>"
    }
}
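
For reference, the import was performed roughly like this (a sketch assuming the Option 1 env vars; the module path matches our plan output below):

$ export IMPORT_KAFKA_API_KEY="<kafka_api_key>"
$ export IMPORT_KAFKA_API_SECRET="<kafka_api_secret>"
$ export IMPORT_KAFKA_REST_ENDPOINT="https://lkc-<id>-<id>.westeurope.azure.glb.confluent.cloud:443"
$ terraform import 'module.router.module.confluent_topics["dummy-topic"].confluent_kafka_topic.topic[0]' lkc-<id>/dummy-topic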

But after planning, Terraform wants to replace the topic:

# module.router.module.confluent_topics["dummy-topic"].confluent_kafka_topic.topic[0] must be replaced
-/+ resource "confluent_kafka_topic" "topic" {
      ~ config           = {
          ~ "max.message.bytes"                   = "1000000" -> "2097164"
          + "min.insync.replicas"                 = "2"
          ~ "retention.ms"                        = "604800000" -> "86400000"
          + "segment.bytes"                       = "104857600"
          + "segment.ms"                          = "604800000"
            # (7 unchanged elements hidden)
        }
      ~ id               = "lkc-<id>/dummy-topic" -> (known after apply)
      ~ partitions_count = 8 -> 6
      + rest_endpoint    = "https://lkc-<id>-<id>.westeurope.azure.glb.confluent.cloud:443" # forces replacement
        # (1 unchanged attribute hidden)

      + credentials {
          + key    = (sensitive value)
          + secret = (sensitive value)
        }

        # (1 unchanged block hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

I presume the reason it's being replaced is that rest_endpoint is missing from the Terraform state, but that seems wrong, since the topic was imported using that very endpoint.

I suspect the culprit is somewhere around here: https://github.com/confluentinc/terraform-provider-confluent/blob/master/internal/provider/resource_kafka_topic.go#L282

The import should set the rest_endpoint that was used for importing the topic on the resource, or there should be an import flag for it.

linouk23 commented 4 months ago

Thanks for creating a new issue @FireDrunk!

Could you share more details about the "migration" of `confluent_kafka_mirror_topic.topic` to `confluent_kafka_topic.topic`?

Do you mean you updated mirror_topic's status (via failover or something) such that it became a regular topic that you imported later?

linouk23 commented 4 months ago

For the second question: could you confirm you were using these commands for importing a topic:

# Option #1: Manage multiple Kafka clusters in the same Terraform workspace
$ export IMPORT_KAFKA_API_KEY="<kafka_api_key>"
$ export IMPORT_KAFKA_API_SECRET="<kafka_api_secret>"
$ export IMPORT_KAFKA_REST_ENDPOINT="<kafka_rest_endpoint>"
$ terraform import confluent_kafka_topic.my_topic lkc-abc123/orders-123

I'm also a bit confused why even partitions_count = 8 -> 6 was updated 🤔

FireDrunk commented 4 months ago

Migration in this scenario means updating the status field to PROMOTED and waiting for the topic to reach the Promoted status in the cluster link overview. After that (to my understanding) it's just a regular topic on the Kafka cluster.
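
For illustration, the change boils down to this (a sketch; the surrounding blocks are unchanged from the resource above):

resource "confluent_kafka_mirror_topic" "topic" {
  # ...source_kafka_topic, cluster_link and kafka_cluster blocks as above...

  mirror_topic_name = var.topic_info.name

  # Flipping this from "ACTIVE" to "PROMOTED" and applying performs the
  # promotion; we then wait until the cluster link overview shows PROMOTED.
  status = "PROMOTED"
}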

I can confirm we are using exactly those commands to import the topic.

That last change is probably due to the fact that we did not specify a config {} block matching the old topic that was imported; that's our fault ;)

FireDrunk commented 4 months ago

Update: when removing the rest_endpoint attribute from the new resource it seems to work, but that doesn't match our setup: we don't want to provide any credentials or endpoints via the provider configuration, but rather retrieve them from Azure Key Vault.

Terraform doesn't support that for provider credentials.
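
For context, this is roughly how those credentials are looked up today (a sketch; the vault and secret names are hypothetical):

# Sketch only: vault and secret names are made up for illustration.
data "azurerm_key_vault" "main" {
  name                = "example-kv"
  resource_group_name = "example-rg"
}

data "azurerm_key_vault_secret" "manager_api_key_id" {
  name         = "confluent-manager-api-key-id"
  key_vault_id = data.azurerm_key_vault.main.id
}

data "azurerm_key_vault_secret" "manager_api_key_secret" {
  name         = "confluent-manager-api-key-secret"
  key_vault_id = data.azurerm_key_vault.main.id
}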

linouk23 commented 4 months ago

Gotcha, it seems like the question can be reduced to:

how do I import an existing Kafka topic such that the target TF config looks something like this (without a credentials block or rest_endpoint attribute)?


resource "confluent_kafka_topic" "topic" {
config           = {
"cleanup.policy"                      = "delete"
"delete.retention.ms"                 = "86400000"
"max.compaction.lag.ms"               = "9223372036854775807"
"max.message.bytes"                   = "1000000"
"message.timestamp.difference.max.ms" = "9223372036854775807"
"message.timestamp.type"              = "CreateTime"
"min.compaction.lag.ms"               = "0"
"retention.bytes"                     = "-1"
"retention.ms"                        = "604800000"
}
id               = "lkc-<id>/dummy-topic"
partitions_count = 8
topic_name       = "dummy-topic"
kafka_cluster {
    id = "lkc-<id>"
}

}


Seems like you might want to follow [Option 2](https://registry.terraform.io/providers/confluentinc/confluent/latest/docs/resources/confluent_kafka_topic#option-2-manage-a-single-kafka-cluster-in-the-same-terraform-workspace):

provider "confluent" { kafka_id = var.kafka_id # optionally use KAFKA_ID env var kafka_rest_endpoint = var.kafka_rest_endpoint # optionally use KAFKA_REST_ENDPOINT env var kafka_api_key = var.kafka_api_key # optionally use KAFKA_API_KEY env var kafka_api_secret = var.kafka_api_secret # optionally use KAFKA_API_SECRET env var }

resource "confluent_kafka_topic" "orders" {

you don't even need to specify kafka_id block here as well

partitions_count = 8
topic_name       = "dummy-topic"
config           = {
    "cleanup.policy"                      = "delete"
    "delete.retention.ms"                 = "86400000"
    "max.compaction.lag.ms"               = "9223372036854775807"
    "max.message.bytes"                   = "1000000"
    "message.timestamp.difference.max.ms" = "9223372036854775807"
    "message.timestamp.type"              = "CreateTime"
    "min.compaction.lag.ms"               = "0"
    "retention.bytes"                     = "-1"
    "retention.ms"                        = "604800000"
}

}


and use the following [import commands](https://registry.terraform.io/providers/confluentinc/confluent/latest/docs/resources/confluent_kafka_topic#option-2-manage-a-single-kafka-cluster-in-the-same-terraform-workspace) that will leverage credentials from your `provider` block:

# Option #2: Manage a single Kafka cluster in the same Terraform workspace
$ terraform import confluent_kafka_topic.my_topic lkc-abc123/orders-123
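
(A sketch of the environment for that import, assuming you use the env vars from the provider block comments above rather than TF variables:)

$ export KAFKA_ID="lkc-abc123"
$ export KAFKA_REST_ENDPOINT="<kafka_rest_endpoint>"
$ export KAFKA_API_KEY="<kafka_api_key>"
$ export KAFKA_API_SECRET="<kafka_api_secret>"
$ terraform import confluent_kafka_topic.my_topic lkc-abc123/orders-123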



@FireDrunk let us know whether it helps!
FireDrunk commented 4 months ago

Ah, that means we have to move our Azure Key Vault logic from Terraform to our CI/CD, which is acceptable but not preferred.

Our design was built on the idea that we might have to manage multiple Confluent Clusters in 1 run, but currently it's just one, so it's a workaround that works for now.

Thanks for your quick response!

linouk23 commented 4 months ago

> Thanks for your quick response!

Thank you!

> Our design was built on the idea that we might have to manage multiple Confluent Clusters in 1 run, but currently it's just one, so it's a workaround that works for now.

Feel free to use separate folders of TF configuration files / modules for the other clusters; that would still allow your team to reuse Option 2 with

provider "confluent" {
  kafka_id            = var.kafka_id                   # optionally use KAFKA_ID env var
  kafka_rest_endpoint = var.kafka_rest_endpoint        # optionally use KAFKA_REST_ENDPOINT env var
  kafka_api_key       = var.kafka_api_key              # optionally use KAFKA_API_KEY env var
  kafka_api_secret    = var.kafka_api_secret           # optionally use KAFKA_API_SECRET env var
}
FireDrunk commented 4 months ago

@linouk23 After reworking our setup, it still seems that importing is extremely buggy regarding the rest_endpoint:

Resource definition:

resource "confluent_kafka_mirror_topic" "topic" {
  source_kafka_topic {
    topic_name = var.topic_info.name
  }

  cluster_link {
    link_name = local.cluster_link_name
  }

  kafka_cluster {
    id = data.confluent_kafka_cluster.shared-cluster.id
  }

  mirror_topic_name = var.topic_info.name

  status = var.topic_info.mirror_info.status

  lifecycle {
    prevent_destroy = true

    precondition {
      condition     = contains(["ACTIVE", "PAUSED", "PROMOTED", "FAILED_OVER"], var.topic_info.mirror_info.status)
      error_message = "Invalid Status field, valid options are: ACTIVE, PAUSED, PROMOTED and FAILED_OVER"
    }
  }
}

Environment vars:

❯ env | grep -i kafka
IMPORT_KAFKA_API_KEY=<key>
IMPORT_KAFKA_REST_ENDPOINT=https://lkc-<id>-pldo48.westeurope.azure.glb.confluent.cloud:443
KAFKA_API_SECRET=<secret>
KAFKA_REST_ENDPOINT=https://lkc-<id>-pldo48.westeurope.azure.glb.confluent.cloud:443
KAFKA_API_KEY=<key>
IMPORT_KAFKA_API_SECRET=<secret>

Importing this mirror_topic using the following command:

terraform import module.router.module.confluent_topics[\"dummy-topic\"].confluent_kafka_mirror_topic.topic lkc-<id>/default-accept-cluster-link/dummy-topic

Works:

module.router.module.confluent_topics["dummy-topic"].confluent_kafka_mirror_topic.topic: Import prepared!
  Prepared confluent_kafka_mirror_topic for import
---
Import successful!

The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.

After running a new plan:

  # module.router.module.confluent_topics["dummy-topic"].confluent_kafka_mirror_topic.topic must be replaced
-/+ resource "confluent_kafka_mirror_topic" "topic" {
      ~ id                = "lkc-<id>/default-accept-cluster-link/dummy-topic" -> (known after apply)
        # (2 unchanged attributes hidden)

      ~ kafka_cluster {
            id            = "lkc-<id>"
          - rest_endpoint = "https://lkc-<id>-pldo48.westeurope.azure.glb.confluent.cloud:443" -> null # forces replacement

          - credentials {
              - key    = (sensitive value) -> null
              - secret = (sensitive value) -> null
            }
        }

        # (2 unchanged blocks hidden)
    }

It seems that importing the mirror_topic resource forcibly adds the rest_endpoint to the state, but our Terraform code doesn't set it, so Terraform wants to replace the resource (which is bad!).
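
(To confirm, the imported attributes can be inspected with terraform state show; a sketch using the module path from above:)

$ terraform state show 'module.router.module.confluent_topics["dummy-topic"].confluent_kafka_mirror_topic.topic'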

linouk23 commented 4 months ago

👋 @FireDrunk could you update your TF definition to include the missing rest_endpoint & credentials attributes:

resource "confluent_kafka_mirror_topic" "example" {
  source_kafka_topic {
    topic_name = "orders"
  }
  cluster_link {
    link_name = confluent_cluster_link.source-outbound.link_name
  }
  kafka_cluster {
    id            = data.confluent_kafka_cluster.destination.id
    rest_endpoint = data.confluent_kafka_cluster.destination.rest_endpoint
    credentials {
      key    = confluent_api_key.app-manager-destination-cluster-api-key.id
      secret = confluent_api_key.app-manager-destination-cluster-api-key.secret
    }
  }

  lifecycle {
    prevent_destroy = true
  }
}

Namely,

rest_endpoint = data.confluent_kafka_cluster.destination.rest_endpoint

credentials {
  key    = confluent_api_key.app-manager-destination-cluster-api-key.id
  secret = confluent_api_key.app-manager-destination-cluster-api-key.secret
}

as specified in the docs and try again? Thank you!

FireDrunk commented 4 months ago

@linouk23 Ok, so:

Adding the credentials and rest_endpoint block to the confluent_kafka_mirror_topic resource works for importing the mirror_topic, after which running terraform apply still works (I can promote the topic).

Omitting the credentials and rest_endpoint block on the regular confluent_kafka_topic does work when importing it after promotion.

This seems a bit counter-intuitive... But technically it seems to work...

Adding the credentials and rest_endpoint after (successfully) importing the resource still taints the resource.

linouk23 commented 4 months ago

@FireDrunk, it might be important to mention that a user is not supposed to alter the resource definition after importing it.

In other words, as specified in the docs, confluent_kafka_topic supports both options with different import commands, and you need to select either Option 1 or Option 2 when importing a topic; switching between them after the import isn't supported (unless you run terraform state rm confluent_kafka_topic.example and reimport the topic using the other option).
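
For example, a minimal sketch of that reimport flow, using the docs' example address (credentials for the target option must already be configured):

$ terraform state rm confluent_kafka_topic.example
$ terraform import confluent_kafka_topic.example lkc-abc123/orders-123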

Let us know if that helps and feel free to reopen this issue / create a new issue for follow up questions!