yandex-cloud / terraform-provider-yandex

Terraform Yandex provider
https://www.terraform.io/docs/providers/yandex/
Mozilla Public License 2.0

Dynamic management of CLICKHOUSE hosts works incorrectly when there are ZOOKEEPER hosts #320

Open · petrenko-hilbert opened this issue 1 year ago

petrenko-hilbert commented 1 year ago

Hello there!

Using provider version 0.83.0.

We're building a Terragrunt module, so we implement dynamic "host" blocks in it: one for CLICKHOUSE hosts and another for ZOOKEEPER hosts in case they are needed.

  dynamic "host" {
    for_each = var.clickhouse_hosts
    content {
      type       = "CLICKHOUSE"
      shard_name = host.value["shard_name"]
      zone       = host.value["zone"]
      subnet_id  = host.value["subnet_id"]
    }
  }

  dynamic "host" {
    for_each = var.zookeeper_hosts
    content {
      type      = "ZOOKEEPER"
      zone      = host.value["zone"]
      subnet_id = host.value["subnet_id"]
    }
  }

The variables look like this:

variable "zookeeper_hosts" {
  type        = list(map(string))
  description = "Describe each host of cluster"
  default     = []
}

variable "clickhouse_hosts" {
  type        = list(map(string))
  description = "Describe each host of cluster"
  default     = []
}

The Terragrunt inputs look like this:

  clickhouse_hosts = [
    {
      shard_name = "shard1"
      zone       = "ru-central1-a"
      subnet_id  = dependency.network.outputs.prod-compute-a
    },
    {
      shard_name = "shard2"
      zone       = "ru-central1-a"
      subnet_id  = dependency.network.outputs.prod-compute-a
    },
    {
      shard_name = "shard2"
      zone       = "ru-central1-a"
      subnet_id  = dependency.network.outputs.prod-compute-a
    }
  ]

  zookeeper_hosts = [
    {
      zone      = "ru-central1-a"
      subnet_id = dependency.network.outputs.prod-compute-a
    },
    {
      zone      = "ru-central1-b"
      subnet_id = dependency.network.outputs.prod-compute-b
    },
    {
      zone      = "ru-central1-b"
      subnet_id = dependency.network.outputs.prod-compute-b
    }
  ]

So we have two shards and three hosts for CLICKHOUSE, plus the mandatory three ZOOKEEPER hosts. The problem starts when we want to delete one of the hosts from "shard2".
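
For clarity, the only change is dropping the last "shard2" entry, so the clickhouse_hosts input becomes (same illustrative values as above):

  clickhouse_hosts = [
    {
      shard_name = "shard1"
      zone       = "ru-central1-a"
      subnet_id  = dependency.network.outputs.prod-compute-a
    },
    {
      shard_name = "shard2"
      zone       = "ru-central1-a"
      subnet_id  = dependency.network.outputs.prod-compute-a
    }
  ]

We then get a plan which tries to CHANGE one CLICKHOUSE host into a ZOOKEEPER one: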

  # yandex_mdb_clickhouse_cluster.clickhouse_cluster will be updated in-place
  ~ resource "yandex_mdb_clickhouse_cluster" "clickhouse_cluster" {
        id                       = "c9q*****0jm9"
        name                     = "clickhouse-test-sql"
        # (17 unchanged attributes hidden)

      ~ host {
          ~ subnet_id        = "e9b****p3" -> "e2lb****odq"
          ~ type             = "CLICKHOUSE" -> "ZOOKEEPER"
          ~ zone             = "ru-central1-a" -> "ru-central1-b"
            # (3 unchanged attributes hidden)
        }
      - host {
          - assign_public_ip = false -> null
          - fqdn             = "rc1b-9qj****r3s.mdb.yandexcloud.net" -> null
          - subnet_id        = "e2l****dq" -> null
          - type             = "ZOOKEEPER" -> null
          - zone             = "ru-central1-b" -> null
        }
      - host {
          - assign_public_ip = false -> null
          - fqdn             = "rc1b-9a7****bqe1.mdb.yandexcloud.net" -> null
          - subnet_id        = "e2l****dq" -> null
          - type             = "ZOOKEEPER" -> null
          - zone             = "ru-central1-b" -> null
        }
      - host {
          - assign_public_ip = false -> null
          - fqdn             = "rc1b-kso*****1kbf.mdb.yandexcloud.net" -> null
          - shard_name       = "shard2" -> null
          - subnet_id        = "e2l****8m" -> null
          - type             = "CLICKHOUSE" -> null
          - zone             = "ru-central1-b" -> null
        }

      ~ user {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }

        # (49 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

If we try to apply that plan, we get an error:

│ Error: ZooKeeper hosts cannot have a 'shard_name'
│ 
│   with yandex_mdb_clickhouse_cluster.clickhouse_cluster,
│   on main.tf line 29, in resource "yandex_mdb_clickhouse_cluster" "clickhouse_cluster":
│   29: resource "yandex_mdb_clickhouse_cluster" "clickhouse_cluster" {

I believe the reason is that both host types use the same host {} block in the resource, so the resulting changes really look like modifications of existing hosts, even when a host changes its type.
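
Assuming host {} blocks are matched by position (my guess at the mechanics, not something I have verified in the provider code), the combined host list would align like this after the removal:

  # before removing the third shard2 host:
  #   1: CLICKHOUSE shard1
  #   2: CLICKHOUSE shard2
  #   3: CLICKHOUSE shard2
  #   4: ZOOKEEPER
  #   5: ZOOKEEPER
  #   6: ZOOKEEPER
  #
  # after the removal everything past position 2 shifts up:
  #   1: CLICKHOUSE shard1
  #   2: CLICKHOUSE shard2
  #   3: ZOOKEEPER   <- diffed against the old CLICKHOUSE host at this
  #   4: ZOOKEEPER      position, hence the CLICKHOUSE -> ZOOKEEPER change
  #   5: ZOOKEEPER
  #
  # the leftover tail entries then show up as plain removals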

Reducing the number of hosts may not be an everyday necessity, but the task still comes up. I'd suggest implementing separate blocks for describing CLICKHOUSE and ZOOKEEPER hosts, for example clickhouse_host {} and zookeeper_host {} (optional); a sketch follows.
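
To make the suggestion concrete, here is what such a schema might look like (hypothetical syntax, not something the provider supports today):

resource "yandex_mdb_clickhouse_cluster" "clickhouse_cluster" {
  # ...

  # separate block types would let Terraform diff CLICKHOUSE and
  # ZOOKEEPER hosts independently instead of matching them by position
  dynamic "clickhouse_host" {
    for_each = var.clickhouse_hosts
    content {
      shard_name = clickhouse_host.value["shard_name"]
      zone       = clickhouse_host.value["zone"]
      subnet_id  = clickhouse_host.value["subnet_id"]
    }
  }

  dynamic "zookeeper_host" {
    for_each = var.zookeeper_hosts
    content {
      zone      = zookeeper_host.value["zone"]
      subnet_id = zookeeper_host.value["subnet_id"]
    }
  }
}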

Meanwhile I'd be glad to hear any workaround suggestions

qdeee commented 1 year ago

I faced the same issue with a sharded MongoDB cluster. There are three kinds of hosts: mongos, mongocfg, and mongod, and any attempt to remove a host from the middle of the host list results in a plan with a global "shuffling" of hosts.

holycheater commented 1 year ago

It is a problem. Also, importing state through terraform import does not import the ZooKeeper hosts. And the shards are ordered by network zone, not by shard number, which makes the diff very large and makes my eye twitch.
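
For reference, the import was the standard one, along the lines of (cluster ID redacted as in the plan above):

  terraform import yandex_mdb_clickhouse_cluster.clickhouse_cluster c9q*****0jm9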