dbt-labs / terraform-provider-dbtcloud

dbt Cloud Terraform Provider
https://registry.terraform.io/providers/dbt-labs/dbtcloud
MIT License
Circular dependency: project + connection + repository #67

Closed felipefrancisco closed 2 years ago

felipefrancisco commented 2 years ago

Hey @GtheSheep - thank you very much, this repository is very helpful.

I'm trying to create a project, a connection and a repository using the resources from this provider, without having to take manual steps via the UI. My setup currently looks like this:

data "dbt_cloud_project" "default_project" {
  project_id = dbt_cloud_project.project.id
}

resource "dbt_cloud_project" "project" {
  name = "My project"

  connection_id = dbt_cloud_connection.snowflake_connection.connection_id
  repository_id = dbt_cloud_repository.project_vcs.repository_id
}

resource "dbt_cloud_connection" "snowflake_connection" {
  project_id = data.dbt_cloud_project.default_project.project_id

  type = "snowflake"
  name = "snowflake_connection_${var.environment}"
  role = "n/a"

  account   = var.snowflake_account
  warehouse = var.snowflake_warehouse
  database  = var.snowflake_dbt_database

  allow_keep_alive = true
  is_active        = true
}

resource "dbt_cloud_repository" "project_vcs" {
  project_id = data.dbt_cloud_project.default_project.project_id

  remote_url = "<redacted>"
  is_active  = true
}

When I plan, unfortunately, I get this error:

╷
│ Error: Cycle: module.dbt_cloud_staging.dbt_cloud_repository.project_vcs, module.dbt_cloud_staging.dbt_cloud_connection.snowflake_connection, module.dbt_cloud_staging.dbt_cloud_project.project, module.dbt_cloud_staging.data.dbt_cloud_project.default_project
│ 
│ 
╵

Am I right to understand that even though the project has support for a repository and connection, we cannot use them?

GtheSheep commented 2 years ago

Hey, this looks strange. There really shouldn't be a cyclical issue here, as Project -> Connection -> Repo is the standard startup flow for dbt Cloud. Is there a reason for re-importing the project you create through a data source rather than just using the project defined as a resource directly? Maybe I'm misunderstanding your setup here. Thanks

felipefrancisco commented 2 years ago

Thank you @GtheSheep, this is also strange to me, as I've always seen the dependency as Project -> Connection -> Repo, like you've mentioned. There's no reason for re-importing; I can use the resource directly without the data source, but the problem is still the same:

resource "dbt_cloud_project" "project" {
  name = title(var.environment)

  connection_id = dbt_cloud_connection.snowflake_connection.connection_id
  repository_id = dbt_cloud_repository.project_vcs.repository_id
}

resource "dbt_cloud_connection" "snowflake_connection" {
  project_id = dbt_cloud_project.project.id

  type = "snowflake"
  name = "snowflake_connection_${var.environment}"
  role = "n/a"

  account   = var.snowflake_account
  warehouse = var.snowflake_warehouse
  database  = var.snowflake_dbt_database

  allow_keep_alive = true
  is_active        = true
}

resource "dbt_cloud_repository" "project_vcs" {
  project_id = dbt_cloud_project.project.id

  remote_url = "<redacted>"
  is_active  = true
}

The error changes to this, which is pretty much the same:

╷
│ Error: Cycle: module.dbt_cloud_staging.dbt_cloud_connection.snowflake_connection, module.dbt_cloud_staging.dbt_cloud_project.project, module.dbt_cloud_staging.dbt_cloud_repository.project_vcs
│ 
│ 
╵

I think it comes down to the fact that:

Unfortunately I don't see how this could work without causing the circular dependency given that project_id is set as a required property of both the connection and the repository? :((
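
To make the cycle concrete, here is a minimal, provider-free sketch of the same shape. It uses Terraform's built-in terraform_data resource (this assumes Terraform 1.4 or later) as a stand-in for the real resources, and the names project and connection are purely illustrative. Terraform derives creation order from references, so two resources that reference each other leave it no valid order, and planning this configuration should fail with a Cycle error like the one above.

```hcl
# Stand-in for dbt_cloud_project: it reads an attribute of the connection,
# so Terraform would have to create the connection first.
resource "terraform_data" "project" {
  input = terraform_data.connection.id
}

# Stand-in for dbt_cloud_connection: it reads an attribute of the project,
# so Terraform would have to create the project first.
# The two ordering requirements contradict each other.
resource "terraform_data" "connection" {
  input = terraform_data.project.id
}
```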

GtheSheep commented 2 years ago

Ahh ok, that makes sense. Thinking about it, the flow in dbt Cloud is probably something like:

1. Create Project
2. Create Connection -> Update project
3. Create Repository -> Update project

Hence the issue is that this provider doesn't mimic that flow when everything is new (essentially, 3 creates). You'd have to create them all first, then run TF again with the connection_id and repository_id added to the project resource, which is probably the workaround:

1. Create Project / Connection / Repository
2. Update project
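
A rough sketch of that two-step workaround, under two assumptions. First, connection_id and repository_id can be left unset on the project. Second, the IDs are fed back in as plain input variables rather than as resource references, because a reference from the project back to the connection or repository would recreate the cycle. The variable and output names here are hypothetical; var.environment and the var.snowflake_* variables are the ones from the original setup and are assumed to be declared elsewhere.

```hcl
# Hypothetical variables: null on the first apply, real IDs on the second.
variable "linked_connection_id" {
  default = null
}

variable "linked_repository_id" {
  default = null
}

resource "dbt_cloud_project" "project" {
  name = "My project"

  # Plain values, not references to the resources below, so no cycle forms.
  connection_id = var.linked_connection_id
  repository_id = var.linked_repository_id
}

resource "dbt_cloud_connection" "snowflake_connection" {
  project_id = dbt_cloud_project.project.id

  type = "snowflake"
  name = "snowflake_connection_${var.environment}"
  role = "n/a"

  account   = var.snowflake_account
  warehouse = var.snowflake_warehouse
  database  = var.snowflake_dbt_database

  allow_keep_alive = true
  is_active        = true
}

resource "dbt_cloud_repository" "project_vcs" {
  project_id = dbt_cloud_project.project.id

  remote_url = "<redacted>"
  is_active  = true
}

# Hypothetical outputs that expose the IDs needed for the second apply.
output "connection_id" {
  value = dbt_cloud_connection.snowflake_connection.connection_id
}

output "repository_id" {
  value = dbt_cloud_repository.project_vcs.repository_id
}
```

The first terraform apply creates all three resources with the project unlinked. A second run, passing the two output values with -var "linked_connection_id=..." and -var "linked_repository_id=...", would then update the project.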

I'll have a think about how I can structure the project resource to capture this, don't have a fix off the top of my head 😅 But thank you for finding this, great spot!

felipefrancisco commented 2 years ago

Thank you for clarifying 👍🏽

GtheSheep commented 2 years ago

Hey @felipefrancisco - I've had a bit of a look around at this kind of thing, and it seems there are a few examples plus a discussion here. One example of a fix is to model the relationships project <-> repo and project <-> connection as separate resources so that Terraform orders them as we want, i.e. resource dbt_cloud_project_repository and resource dbt_cloud_project_connection. This is maybe not as intuitive or clean as what we have now, but removing the circular dependency so that infra can be built in a single apply seems worth doing. I'll likely try implementing an example of it sometime this week and add your example as a test case.

felipefrancisco commented 2 years ago

Amazing, thank you for looking further into this @GtheSheep! Much appreciated! Given the project's repo and the project's connection can only be linked after the Project has been created, it would indeed make sense to model them as separate resources that would link to the project 👍🏽

GtheSheep commented 2 years ago

Hey @felipefrancisco - Just releasing a version with the 2 new resources dbt_cloud_project_connection and dbt_cloud_project_repository, will follow up with an example on spinning up all resources described above, but should've solved the circular issue now 🤞 thanks again for finding this! (I'll reopen this issue if needed)
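
A sketch of how the single-apply setup might look with the two new resources. The resource type names come from this thread; the argument names on the linking resources (project_id, connection_id, repository_id) are assumptions based on the attributes used earlier and should be checked against the provider documentation. The labels project_connection and project_repository are illustrative, and the var.* variables are assumed to be declared as in the original setup. The project no longer references the connection or the repository, so every dependency points in one direction.

```hcl
# The project is created first and knows nothing about its connection or repo.
resource "dbt_cloud_project" "project" {
  name = "My project"
}

resource "dbt_cloud_connection" "snowflake_connection" {
  project_id = dbt_cloud_project.project.id

  type = "snowflake"
  name = "snowflake_connection_${var.environment}"
  role = "n/a"

  account   = var.snowflake_account
  warehouse = var.snowflake_warehouse
  database  = var.snowflake_dbt_database

  allow_keep_alive = true
  is_active        = true
}

resource "dbt_cloud_repository" "project_vcs" {
  project_id = dbt_cloud_project.project.id

  remote_url = "<redacted>"
  is_active  = true
}

# Linking resources: each depends on the project and on one child resource,
# so Terraform creates them last. Argument names are assumed, not confirmed here.
resource "dbt_cloud_project_connection" "project_connection" {
  project_id    = dbt_cloud_project.project.id
  connection_id = dbt_cloud_connection.snowflake_connection.connection_id
}

resource "dbt_cloud_project_repository" "project_repository" {
  project_id    = dbt_cloud_project.project.id
  repository_id = dbt_cloud_repository.project_vcs.repository_id
}
```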