hashicorp / terraform-provider-docker

As part of our introduction to self-service publishing in the Terraform Registry, this copy of the provider has been archived, and ownership has been transferred to active maintainers in the community. Please see the new location on the Terraform Registry: https://registry.terraform.io/providers/kreuzwerker/docker/latest
https://registry.terraform.io/providers/kreuzwerker/docker/latest
Mozilla Public License 2.0
131 stars 92 forks source link

Make docker label support compatible with HCL 2 syntax #214

Closed xanderflood closed 4 years ago

xanderflood commented 4 years ago

The current mode of support for labels seems to be incompatible with HCL syntax. The examples I see about make it appear that this used to work, so I assume that this is related to the switch over to HCL2 in newer Terraform versions. I don't believe there's any non-breaking way to resolve this issue.

Terraform Version

Terraform v0.12.13
+ provider.docker v2.5.0
+ provider.postgresql v1.3.0

Affected Resource(s)

docker_service, among others

Terraform Configuration Files

resource "docker_service" "scrapedumper_reaper" {
  name = "scrapedumper_reaper"

  task_spec {
    container_spec {
      image = "smartatransit/scrapedumper-reaper:production"

      labels {
        swarm.cronjob.enable   = "true"
        swarm.cronjob.schedule = "0 3 * * *"
      }
    }
  }
}

Debug Output

Error: Argument or block definition required

  on scrapedumper.tf line 92, in resource "docker_service" "scrapedumper_reaper":
  92:         swarm.cronjob.enable   = "true"

An argument or block definition is required here. To set an argument, use the
equals sign "=" to introduce the argument value.

Error: Argument or block definition required

  on scrapedumper.tf line 93, in resource "docker_service" "scrapedumper_reaper":
  93:         swarm.cronjob.schedule = "0 3 * * *"

An argument or block definition is required here. To set an argument, use the
equals sign "=" to introduce the argument value.

Actual Behavior

Docker labels are currently represented as a terraform map. Docker label names commonly include the . character, and I see examples in several other issues on this repository (like the one I presented above) of people using the docker provider to attach these labels to containers.

I'm quite confused by this, because as far as I can tell the above is an HCL syntax error. You simply can't have a key that contains a . character. I'm not sure if this is something that changed in the TF0.12/HCL2.0 switch, but it appears that the terraform map is just not an acceptable data structure to represent docker labels.

Am I misunderstanding something about the examples I've seen, or does this really require a change to the docker_service schema to work properly? If I'm understanding the situation correctly, I'd be happy to do the leg work for this (I'm thinking that a set or list of blocks like

{
  key = "my.label.key"
  value = "my-label-value"
}

would be appropriate, but I'm open to other suggestions.)

Steps to Reproduce

  1. terraform init using the above code snippet will result in a syntax error. (It's not necessary to configure the provider to see the error.)

References

Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? For example:

xanderflood commented 4 years ago

I just tested it out with terraform 0.11.14 and confirmed that it works fine, so I renamed the issue to reflect that it's related to HCL2.

This document (which links in turn to this one) lays out the rules for attribute names in HCL2.

I think the main question here is whether there's a way to provide an alternative way to provide docker labels without invalidating the old way. I'm not familiar with the CustomDiff functionality but I wonder if it can be used to produce consistent diffs regardless of which syntax was used.

That said, anyone using dots in their labels with the old syntax will have their code break when they upgrade to terraform 0.12, and there is obviously no way to avoid that since it happens at the syntax level. Maybe it's better to just bite the bullet and eliminate the old syntax?

xanderflood commented 4 years ago

https://www.terraform.io/docs/configuration/syntax.html#identifiers <-- here's a much clearer and more concise description of what's an acceptable attribute name.

I started working on the change and I think it's all clear on the docker_service resource but there's a lot of other refactoring that needs done. Here's what it would ultimately look like:

  name = "scrapedumper_reaper"

  task_spec {
    container_spec {
      image = "smartatransit/scrapedumper-reaper:production"

      labels {
        label = "swarm.cronjob.enable"
        value = "true"
      }

      labels {
        label = "swarm.cronjob.schedule"
        value = "0 3 * * *"
      }
    }
  }
}
mavogel commented 4 years ago

Thank you for working on the enhancement. Let me know if you need any support

xanderflood commented 4 years ago

@mavogel I think it's nearly ready but I'm having trouble with a couple of test cases and I'd love another pair of eyes when you can. If you look at the first failed test case in this build (named TestAccDockerSecret_labels), it's complaining that docker_secret.foo doesn't have a labels field. The relevant code from my PR is here.

I have no idea if the way I'm checking equality for a set of maps is going to work, but it's failing before it gets there because it can't find the labels field, and I'm not sure why. Could this be because the labels aren't being added to the state correctly, or is it an issue with the resource schema?

xanderflood commented 4 years ago

I understand what I was doing wrong now, but I'm not quite sure of the best way to proceed.

The Terraform docs for the Set type are pretty sparse, but it seems like there's no way to access the elements of a set in an expression, so I would have to remove those tests if I stick to Set. Alternatively, I could use a Terraform List, but I'm not clear on whether the order of labels on Docker resources is guaranteed to be preserved, and if it isn't, you'd wind up with nonzero plans when nothing has really changed.

@mavogel do you have an opinion on whether set or list is more appropriate here?

mavogel commented 4 years ago

hey @xanderflood , sets are for not having duplicate entries, lists can have them. Well it feels like the docker api can handle duplicate, so I'd go with lists.

xanderflood commented 4 years ago

@mavogel are you sure about that? The existing terraform implementation uses a map, which doesn’t support duplicates, and the YAML syntax for swarm/docker config also uses a map to describe labels. here it looks like the golang SDK also uses a map.

I did come up with a way to test set values in this other toy draft PR I opened for experimentation (https://github.com/terraform-providers/terraform-provider-docker/pull/217/files - jump to line 100 of resource_docker_secret_test.go to see the entrypoint). It certainly works and allows you to recover and inspect the old map[string]string representation, but it’s a little unwieldy so I wouldn’t want to pollute your codebase with it without running it by you.

xanderflood commented 4 years ago

In fact, the fact that the golang SDK uses a map primitive (and golang maps are deliberately randomized) means that it’s guaranteed that it’ll return the labels in an unpredictable order, so I think using a list would tend to produce erroneous plans.

xanderflood commented 4 years ago

Oh, one other possibility would be to use a List but implement a CustomDiff hook that resolves it properly when the labels come back in a scrambled order. I’m open to that, but leveraging the built-in set type feels more approproate to me, personally

xanderflood commented 4 years ago

I've been moving forward with the set-of-structures design, and I'm going to include the modified unit tests. The next issue is making sure we implement state migration, which is going to be straightforward for all the affected resources except the container.

The issue with container is that there's already a v0-to-v1 migration implemented using the MigrateState callback, which is now deprecated. The terraform SDK docs recommend keeping any existing MigrateState callbacks, but doing new migrations using StateUpgraders. I'm not sure if the authors realized this when they wrote that, but taking that approach will not work properly for anybody who upgrades from v0 to v2. If for some reason a user had not upgraded their Docker provider in 9 months (which is roughly when the container v1 schema migration was introduced) this approach would break their statefile: TF would see a v0 state and opt to use only the MigrateState callback, which will produce a v1 instance state, but incorrectly mark it as v2, since that is the "current" version. Patching the MigrateState callback to properly bring us to v2 will be awkward, since the data structure yielded to MigrateState is radically different from the data structure yielded to a StateUpgradeFunc.

If we're comfortable breaking for the (hopefully small) number of users who held off upgrading the provider too long, we can take the recommended approach. Alternatively, we can refactor the MigrateState callback into a StateUpgrader, so that the two separate StateUpgraders can work properly together regardless of the source version.

mavogel commented 4 years ago

If we're comfortable breaking for the (hopefully small) number of users who held off upgrading the provider too long, we can take the recommended approach.

I will get feedback on this, because I am also not sure if #196 will have BCs. Anyway I'd like to get rid of the dirty hack in the container resource (see #138). I keep you updated due to the feedback

xanderflood commented 4 years ago

@mavogel apologies, you can ignore that concern, I took a closer look at the source code and it looks like I misunderstood how this works. It'll apply MigrateState and assume that the resulting version matches the input for first StateUpgrader, so everything should work properly without changing the existing migration 👍

xanderflood commented 4 years ago

https://github.com/terraform-providers/terraform-provider-docker/pull/217 this should be good for review now

xanderflood commented 4 years ago

@mavogel I haven't been able to find any reference to this in the TF issues, but it appears that under the 2.6 provider and the latest version of TF 0.12, the old syntax works again as long as the arguments are quoted, which raises the question of whether it's best to revert this change before releasing 2.7.