Allow providers to assign type hints to attributes

radeksimko commented 4 years ago

Problem Statement (current situation)

Providers can today specify primitive types, such as TypeString or TypeInt to fields which store more complex types. Different providers take different approaches to store complex types, but these are common situations:

XML (TypeString)
YAML (TypeString)
JSON (TypeString)
date (unix time as TypeInt or RFC3339/ISO8601 timestamp as TypeString)
base64 representation of a string (TypeString)

A few problems arise with such "dummy" representation across all providers as described below.

Validation

While providers (provider developers) are aware that certain fields are XML/YAML/JSON and should never contain arbitrary string, it is often not trivial or not obvious how to validate that. We have some prior art in that area:

but no consistent story.

As a result providers often rely on server-side API validation and end up making unnecessary round trips with invalid data that could have been caught at plan-time, or implement their own validation and effectively duplicate work that was already done & tested in other provider(s).

Diffing

Terraform 0.12 applies simple heuristics to render JSON differences in a human-readable way, but this is currently only implemented for JSON.

As a result differences of such complex-typed fields such as YAML are not human readable as these are just rendered as single line of text, e.g.

      ~ value           = 
          - "replicaCount: 3\n\nimage:\n  repository: basisai/consul-esm\n  tag: 0.3.3\n\nresources: {\"limits\":{\"memory\":\"256Mi\"},\"requests\":{\"cpu\":\"200m\"}}\n  # We usually recommend not to specify default resources and to leave this as a conscious\n  # choice for the user. This also increases chances charts run on environments with little\n  # resources, such as Minikube. If you do want to specify resources, uncomment the following\n  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.\n  # limits:\n  #   cpu: 100m\n  #   memory: 128Mi\n  # requests:\n  #   cpu: 100m\n  #   memory: 128Mi\n\nenv: [{\"name\":\"HOST_IP\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"status.hostIP\"}}},{\"name\":\"CONSUL_HTTP_ADDR\",\"value\":\"$(HOST_IP):8500\"}]\n\ninitContainerSetSysCtl: 0\n\nconfig:\n  logLevel: \"INFO\"\n\n  # The service name for this agent to use when registering itself with Consul.\n  serviceName: \"consul-esm\"\n\n  # The service tag for this agent to use when registering itself with Consul.\n  # ESM instances that share a service name/tag combination will have the work\n  # of running health checks and pings for any external nodes in the catalog\n  # divided evenly amongst themselves.\n  serviceTag: \"\"\n\n  # The directory in the Consul KV store to use for storing runtime data.\n  kvPath: \"consul-esm/\"\n\n  # The node metadata values used for the ESM to qualify a node in the catalog\n  # as an \"external node\".\n\n  externalNodeMeta: {\"external-node\":\"true\"}\n  # The length of time to wait before reaping an external node due to failed\n  # pings.\n  nodeReconnectTimeout: \"72h\"\n\n  # The interval to ping and update coordinates for external nodes that have\n  # 'external-probe' set to true. By default, ESM will attempt to ping and\n  # update the coordinates for all nodes it is watching every 10 seconds.\n  nodeProbeInterval: \"10s\"\n\n  # The address of the local Consul agent. Can also be provided through the\n  # CONSUL_HTTP_ADDR environment variable.\n  httpAddr: \"\"\n\n  # The method to use for pinging external nodes. Defaults to \"udp\" but can\n  # also be set to \"socket\" to use ICMP (which requires root privileges).\n  pingType: \"udp\"\n",
          + "replicaCount: 3\n\nimage:\n  repository: basisai/consul-esm\n  tag: 0.3.3\n\nresources: {\"limits\":{\"memory\":\"256Mi\"},\"requests\":{\"cpu\":\"200m\"}}\n  # We usually recommend not to specify default resources and to leave this as a conscious\n  # choice for the user. This also increases chances charts run on environments with little\n  # resources, such as Minikube. If you do want to specify resources, uncomment the following\n  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.\n  # limits:\n  #   cpu: 100m\n  #   memory: 128Mi\n  # requests:\n  #   cpu: 100m\n  #   memory: 128Mi\n\nenv: [{\"name\":\"HOST_IP\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"status.hostIP\"}}},{\"name\":\"CONSUL_HTTP_ADDR\",\"value\":\"$(HOST_IP):8500\"}]\n\ninitContainerSetSysCtl: false\n\nconfig:\n  logLevel: \"INFO\"\n\n  # The service name for this agent to use when registering itself with Consul.\n  serviceName: \"consul-esm\"\n\n  # The service tag for this agent to use when registering itself with Consul.\n  # ESM instances that share a service name/tag combination will have the work\n  # of running health checks and pings for any external nodes in the catalog\n  # divided evenly amongst themselves.\n  serviceTag: \"\"\n\n  # The directory in the Consul KV store to use for storing runtime data.\n  kvPath: \"consul-esm/\"\n\n  # The node metadata values used for the ESM to qualify a node in the catalog\n  # as an \"external node\".\n\n  externalNodeMeta: {\"external-node\":\"true\"}\n  # The length of time to wait before reaping an external node due to failed\n  # pings.\n  nodeReconnectTimeout: \"72h\"\n\n  # The interval to ping and update coordinates for external nodes that have\n  # 'external-probe' set to true. By default, ESM will attempt to ping and\n  # update the coordinates for all nodes it is watching every 10 seconds.\n  nodeProbeInterval: \"10s\"\n\n  # The address of the local Consul agent. Can also be provided through the\n  # CONSUL_HTTP_ADDR environment variable.\n  httpAddr: \"\"\n\n  # The method to use for pinging external nodes. Defaults to \"udp\" but can\n  # also be set to \"socket\" to use ICMP (which requires root privileges).\n  pingType: \"udp\"\n",

Additionally most (all?) providers want to suppress no-op diffs, such as whitespace changes. They use DiffSuppressFunc with custom implementations or structure.SuppressJsonDiff to achieve this. Again - we have some prior art but no consistent story here.

State

To ensure consistency and correct diff calculation provider needs to ensure these values are saved in their "canonical" form to the state, which is often achieved via StateFunc and we have some prior art in that area:

structure.NormalizeJsonString

but no consistent story.

Customisability

While many providers could be satisfied with simple "JSON" or "YAML" field, there are providers which likely require custom rules for diffing. Diffing of AWS IAM policies is one great example - AWS IAM API doesn't document any canonical format of policies and often treats things like "values": ["single"] and "values": "single" as equal and interchangeable.

Proposal

This will likely require full RFC and some discussion/scrutiny, but rough plan is below:

Probe official providers' codebases to understand what common data types do they use and what formats of dates do they use
[SDK] Create canonical basic validation functions for XML, YAML, date (common formats), base64
[tf core] Refactor to make diffing logic in command/format more easily customisable/pluggable
[tf core] Implement diffing logic for XML, YAML, date and base64
[SDK/core] Allow schema to hint back to core how to diff given field

Just for the sake of better search-ability here are some names other people have used when referring to this problem/feature: virtual types, diff hints, dynamic attribute diffing

Related for core - function parity for working with these types and converting.

paultyng commented 4 years ago

This is potentially related to (or superseded by support for) the pseudo dynamic type, #248.

radeksimko commented 4 years ago

@paultyng do you mind explaining the relationship a bit more? 🤔

I'm imagining that regardless of how the data is represented in state or config, we need to address the above main problems (validation & diffing) for each language (JSON, YAML, XML, ...) separately.

I suppose it can help with data representation in state, but we'd then still need to write some parsers which parse raw JSON/YAML/XML into internal cty's dynamic type, because I expect users will want to keep raw format in their configs, so the "dynamic" type really ends up being just an internal representation.

It might help users in accessing data inside the structure from CRUD as we'd pre-parse it for them I guess? I don't understand how that's related to this issue though.

paultyng commented 4 years ago

Yeah, I think it depends on how folks want to use those types of data. For example, instead of supporting a JSON or YAML attribute, one could do something like:

attribute = jsondecode(<<JSON
{
  "foo": "bar",
  "baz": 3
}
JSON

# or

attribute = yamldecode(<<YAML
foo: bar
baz: 3
YAML

# being equivalent to

attribute = {
  foo = "bar"
  baz = 3
}

And then diffing (and diff rendering), etc could all be handled via the dynamic type. If providers went in this direction for their support, they wouldn't need type hinting as it wouldn't matter, they would just set it as dynamic.

paultyng commented 4 years ago

This is of course about a subset of the type hinting you mention above, dates, base64, etc or other complex strings we'd want to retain as strings, still could use some type extensions on top of string.

radeksimko commented 4 years ago

@paultyng I see! That's a good point.

I reckon that users would still want to see it represented as the original format (JSON/YAML,...) in the plan output and everywhere else. It would be confusing if we just displayed HCL-style diffs for these fields.

i.e. from user's perspective this should be just a hidden implementation detail.

hashicorp / terraform-plugin-sdk