hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

"terraform plan" output is long for resource types with complex schemas #21639

Closed rismoney closed 4 years ago

rismoney commented 5 years ago

The plan output is way, WAY too verbose. We are using the vSphere provider, and a single VM attribute change yields nearly 140 lines of output. We could have hundreds of VMs in a state file, so a simple single-attribute change could yield 14,000+ lines of plan output. No one can realistically review that. I'm not sure how this change came to be, but the rendering is horrific for complex environments.

Can there simply be an option such as --show-onlydiffs on plan/apply? This would be analogous to what was presented in 0.11 and earlier.

The tool is largely unusable by operations teams at scale with this output as is. Let me know if you need additional information to facilitate a change.

apparentlymart commented 5 years ago

Hi @rismoney!

Could you show a real example of what you're seeing for a single attribute change on a VM?

This change happened in response to feedback that in prior versions the output didn't include enough context on the objects that were being changed, forcing people to then refer back to the configuration and compare it with the plan in order to understand the actual result.

Some resource types have particularly large configuration surfaces, and thus the result tends to now be very large for those, particularly while we're still using the old Terraform SDK that doesn't support the concept of null and thus effectively populates every single attribute in its schema with a default value, rather than leaving the unused ones unset.

We don't intend to add more options here, because options are always a last resort: Terraform should just do "the right thing" by default. I understand that what it's doing right now is not quite the right thing, and I expect it will continue to evolve, but to zero in on "the right thing" we'll need some real examples to look at.

One idea we've discussed (which is currently blocking on improvements to Terraform's SDK to allow it to better use the new features in the new 0.12 protocol) is to allow a provider to specify for a particular resource type a subset of its arguments that are "identifying", in the sense that they can be used to understand exactly which remote object the resource instance belongs to. For many resource types, that might just be id or name, but for others additional information is important/helpful. With such a feature in place, terraform plan could then show only the identifying arguments and the changed arguments, and elide everything else. That would be a compromise between the two extremes of the 0.11 and 0.12 behavior. The new SDK work is beginning imminently.
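The compromise described above (show the identifying arguments plus whatever changed, and elide the rest) can be sketched in a few lines of Python. This is only an illustration of the idea: `concise_diff` and the choice of `name` and `moid` as identifying arguments are assumptions made for the example, not part of Terraform or its SDK, and the memory change is invented for the demonstration.

```python
# Hypothetical sketch of the "identifying + changed arguments" idea.
# Nothing here is Terraform's renderer or SDK API.

def concise_diff(before, after, identifying):
    """Return display lines for changed attributes and identifying ones only."""
    lines = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            # Changed attributes are always shown.
            lines.append(f"  ~ {key} = {old!r} -> {new!r}")
        elif key in identifying:
            # Unchanged attributes are shown only if they identify the object.
            lines.append(f"    {key} = {old!r}")
    return lines


# Attribute names are taken from the plan output in this thread;
# the new memory value is made up for illustration.
before = {"name": "aaaa", "moid": "vm-467", "memory": 12288,
          "num_cpus": 4, "firmware": "efi"}
after = dict(before, memory=16384)

for line in concise_diff(before, after, identifying={"name", "moid"}):
    print(line)
```

In this sketch the unchanged `num_cpus` and `firmware` attributes are dropped, while `name` and `moid` stay visible so a reviewer can still tell which object is affected.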

rismoney commented 5 years ago
# module.mypc.vsphere_virtual_machine.wks will be updated in-place
16:04:27   ~ resource "vsphere_virtual_machine" "wks" {
16:04:27         annotation                              = "TP_PACKER_COMMIT=82bd375"
16:04:27         boot_delay                              = 5000
16:04:27         boot_retry_delay                        = 10000
16:04:27         boot_retry_enabled                      = false
16:04:27         change_version                          = "2019-06-06T17:21:31.18578Z"
16:04:27         cpu_hot_add_enabled                     = false
16:04:27         cpu_hot_remove_enabled                  = false
16:04:27         cpu_limit                               = -1
16:04:27         cpu_performance_counters_enabled        = false
16:04:27         cpu_reservation                         = 0
16:04:27         cpu_share_count                         = 4000
16:04:27         cpu_share_level                         = "normal"
16:04:27       - custom_attributes                       = {} -> null
16:04:27         datastore_cluster_id                    = "group-1"
16:04:27         datastore_id                            = "datastore-1"
16:04:27         default_ip_address                      = "1.2.3.4"
16:04:27         efi_secure_boot_enabled                 = false
16:04:27         enable_disk_uuid                        = false
16:04:27         enable_logging                          = true
16:04:27         ept_rvi_mode                            = "automatic"
16:04:27         extra_config                            = {}
16:04:27         firmware                                = "efi"
16:04:27         folder                                  = "production"
16:04:27         force_power_off                         = true
16:04:27         guest_id                                = "windows9_64Guest"
16:04:27         guest_ip_addresses                      = [
16:04:27             "1.2.3.4",
16:04:27             "aaaa::aaaa:aaaa:aaaa:aaaa",
16:04:27         ]
16:04:27         host_system_id                          = "host-495"
16:04:27         hv_mode                                 = "hvAuto"
16:04:27         id                                      = "422dcf9d-a08f-fe11-b916-f0a09f763e03"
16:04:27         latency_sensitivity                     = "normal"
16:04:27         memory                                  = 12288
16:04:27         memory_hot_add_enabled                  = false
16:04:27         memory_limit                            = -1
16:04:27         memory_reservation                      = 0
16:04:27         memory_share_count                      = 122880
16:04:27         memory_share_level                      = "normal"
16:04:27         migrate_wait_timeout                    = 30
16:04:27         moid                                    = "vm-467"
16:04:27         name                                    = "aaaa"
16:04:27         nested_hv_enabled                       = true
16:04:27         num_cores_per_socket                    = 4
16:04:27         num_cpus                                = 4
16:04:27         reboot_required                         = false
16:04:27         resource_pool_id                        = "resgroup-92"
16:04:27         run_tools_scripts_after_power_on        = true
16:04:27         run_tools_scripts_after_resume          = true
16:04:27         run_tools_scripts_before_guest_reboot   = false
16:04:27         run_tools_scripts_before_guest_shutdown = true
16:04:27         run_tools_scripts_before_guest_standby  = true
16:04:27         scsi_bus_sharing                        = "noSharing"
16:04:27         scsi_controller_count                   = 1
16:04:27         scsi_type                               = "pvscsi"
16:04:27         shutdown_wait_timeout                   = 3
16:04:27         swap_placement_policy                   = "inherit"
16:04:27         sync_time_with_host                     = true
16:04:27       - tags                                    = [] -> null
16:04:27         uuid                                    = "422dcf9d-a08f-fe11-b916-f0a09f763e03"
16:04:27         vapp_transport                          = []
16:04:27         vmware_tools_status                     = "guestToolsRunning"
16:04:27         vmx_path                                = "aaaa/aaaa.vmx"
16:04:27         wait_for_guest_ip_timeout               = 0
16:04:27         wait_for_guest_net_routable             = true
16:04:27         wait_for_guest_net_timeout              = 0
16:04:27 
16:04:27       ~ clone {
16:04:27             linked_clone  = false
16:04:27             template_uuid = "422da447-d9f0-e483-32a8-fdaace90a962"
16:04:27             timeout       = 30
16:04:27 
16:04:27           ~ customize {
16:04:27               - dns_server_list = [] -> null
16:04:27               - dns_suffix_list = [] -> null
16:04:27                 timeout         = 10
16:04:27 
16:04:27                 network_interface {
16:04:27                     dns_domain      = "x.corp"
16:04:27                     dns_server_list = [
16:04:27                         "1.2.3.4",
16:04:27                         "1.2.3.4",
16:04:27                     ]
16:04:27                     ipv4_netmask    = 0
16:04:27                     ipv6_netmask    = 0
16:04:27                 }
16:04:27 
16:04:27               ~ windows_options {
16:04:27                     auto_logon            = false
16:04:27                     auto_logon_count      = 1
16:04:27                     computer_name         = "blah"
16:04:27                     full_name             = "Administrator"
16:04:27                     organization_name     = "Managed by Terraform"
16:04:27                     product_key           = "aaa-aaa-aaa-aaa-aaa"
16:04:27                   - run_once_command_list = [] -> null
16:04:27                     time_zone             = 35
16:04:27                 }
16:04:27             }
16:04:27         }
16:04:27 
16:04:27         disk {
16:04:27             attach           = false
16:04:27             datastore_id     = "datastore-388"
16:04:27             device_address   = "scsi:0:0"
16:04:27             disk_mode        = "persistent"
16:04:27             disk_sharing     = "sharingNone"
16:04:27             eagerly_scrub    = false
16:04:27             io_limit         = -1
16:04:27             io_reservation   = 0
16:04:27             io_share_count   = 1000
16:04:27             io_share_level   = "normal"
16:04:27             keep_on_remove   = false
16:04:27             key              = 2000
16:04:27             label            = "aaaa.vmdk"
16:04:27             path             = "aaaa/aaaa.vmdk"
16:04:27             size             = 100
16:04:27             thin_provisioned = true
16:04:27             unit_number      = 0
16:04:27             uuid             = "6000C295-02e9-d974-d85e-1372f3ee3648"
16:04:27             write_through    = false
16:04:27         }
16:04:27 
16:04:27         network_interface {
16:04:27             adapter_type          = "vmxnet3"
16:04:27             bandwidth_limit       = -1
16:04:27             bandwidth_reservation = 0
16:04:27             bandwidth_share_count = 50
16:04:27             bandwidth_share_level = "normal"
16:04:27             device_address        = "pci:0:7"
16:04:27             key                   = 4000
16:04:27             mac_address           = "00:00:00:00:00:00"
16:04:27             network_id            = "dvportgroup-73"
16:04:27             use_static_mac        = false
16:04:27         }
16:04:27     }
rismoney commented 5 years ago

The subcontexts are great, but something like this is far more legible when looking at 100 resources:

# module.mypc.vsphere_virtual_machine.wks will be updated in-place
16:04:27   ~ resource "vsphere_virtual_machine" "wks" {
16:04:27       - custom_attributes                       = {} -> null
16:04:27       - tags                                    = [] -> null
16:04:27       ~ clone {
16:04:27           ~ customize {
16:04:27               - dns_server_list = [] -> null
16:04:27               - dns_suffix_list = [] -> null
16:04:27               ~ windows_options {
16:04:27                   - run_once_command_list = [] -> null
16:04:27                 }
16:04:27             }
16:04:27         }
16:04:27     }
BadgerCode commented 5 years ago

We have been experiencing the same issue using the AzureRM provider (Terraform 0.12.2, AzureRM 1.29).

Before 0.12, a single attribute change would result in very few, colour-coded lines which showed exactly what was changing. If I needed more context, I could refer to my configuration.

Now, a single attribute change results in a huge output with almost no colour-coding. There is seemingly no way to remove this extra information.

This has made reviewing changes much more challenging for us, making infrastructure changes riskier.

Examples

Deleting 7 resources before: [screenshot: Terraform delete resource, PowerShell, before 0.12]

Deleting 3 resources now (this spans multiple pages): [screenshot: Terraform delete resource, PowerShell, 0.12]

Changing a single attribute before: [screenshot: Terraform change resource, PowerShell, before 0.12]

Changing a single attribute on two resources now: [screenshot: Terraform change resource, PowerShell, 0.12]

Most of our changes feature single attributes changing across multiple resources, in combination with other attributes changing, resources added or resources deleted. These were easy to review before, but are very difficult to review now.

thtran101 commented 5 years ago

Ever since upgrading to TF 0.12.x, I've experienced the same difficulties that you've described: verbose output that makes it harder to review changes. I've looked back on many past issues on this topic, such as #10507, and I know the TF team has to make difficult decisions about supporting features that won't meet everyone's needs and expectations.

One possible workaround, which is the route I took, is to write your own custom parser for the plan. You can use terraform show -json on a saved plan file to generate a JSON-formatted version of the plan. It lists all changes in the resource_changes attribute of the root JSON object. You can iterate through that and filter out the noise.
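A minimal Python sketch of that approach follows (the parser shared later in this thread is written for Node.js; this is not that code). It assumes the plan JSON has a top-level `resource_changes` list whose entries carry `address` and a `change` object with `actions`, `before`, and `after`; the function name and the inline sample, including the second resource address, are made up for illustration.

```python
import json


def changed_attributes(plan):
    """Map each changing resource address to its changed top-level attributes."""
    result = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        # Skip entries that do not modify anything.
        if change.get("actions") in (["no-op"], ["read"]):
            continue
        before = change.get("before") or {}  # null when creating
        after = change.get("after") or {}    # null when deleting
        keys = sorted(k for k in set(before) | set(after)
                      if before.get(k) != after.get(k))
        result[rc["address"]] = keys
    return result


# Illustrative sample only. In practice the JSON could come from something like:
#   terraform plan -out=tfplan && terraform show -json tfplan > plan.json
sample = json.loads("""
{"resource_changes": [
  {"address": "module.mypc.vsphere_virtual_machine.wks",
   "type": "vsphere_virtual_machine",
   "change": {"actions": ["update"],
              "before": {"name": "aaaa", "memory": 12288, "tags": []},
              "after":  {"name": "aaaa", "memory": 12288, "tags": null}}},
  {"address": "vsphere_virtual_machine.other",
   "type": "vsphere_virtual_machine",
   "change": {"actions": ["no-op"],
              "before": {"name": "bbbb"},
              "after":  {"name": "bbbb"}}}
]}
""")

print(changed_attributes(sample))
```

The sketch compares only top-level attributes and treats values that are unknown until apply as changed, so nested blocks such as `clone` would need a recursive comparison.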

I used to get output like so from TF 0.12.3

[screenshots: verbose-policy, verbose-lambda]

By maintaining a list of delta attributes that I'm not concerned about for specific resource types, I can cut the noise down to something like this. It tells me that yes, these objects are changing, but there are no changes that need my attention; I'm simply deploying a new version of the code.
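As a rough illustration of such an ignore list, the Python sketch below drops changed attributes that a reviewer has declared uninteresting for a given resource type. The `IGNORED` table and the `needs_attention` helper are hypothetical, and the attributes listed for `aws_lambda_function` are only a guess at what someone might consider noise when redeploying code.

```python
# Hypothetical ignore list: changed attributes per resource type that the
# reviewer does not care about. The entries are assumptions for the example.
IGNORED = {
    "aws_lambda_function": {"source_code_hash", "last_modified"},
}


def needs_attention(resource_type, changed_keys, ignored=IGNORED):
    """Return the changed attributes not on the ignore list for this type."""
    return sorted(set(changed_keys) - ignored.get(resource_type, set()))


# A routine code deploy: everything that changed is on the ignore list.
print(needs_attention("aws_lambda_function",
                      ["source_code_hash", "last_modified"]))

# A deploy that also changes memory: that attribute still surfaces.
print(needs_attention("aws_lambda_function",
                      ["source_code_hash", "memory_size"]))
```

An empty result means the resource is changing but nothing outside the ignore list is affected, which matches the "yes, but nothing needs my attention" outcome described above.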

[screenshots: concise-policy, concise-lambda]

In my particular use case, I found TF issuing what I considered false-positive change notifications that weren't occurring in the previous version of TF (#9042). This wasn't caused by a change in the AWS provider but by something in TF 0.12.x itself.

I could tell data was there to suggest that a change wasn't really going to occur so I just wrote a custom parser since it's just JSON.

I'm not aware of any thorough documentation on the JSON output. It may take some trial and error on your part, but it can be done if it's important enough to you.

rismoney commented 5 years ago

A switch should be considered when it puts guardrails around safe usage, and I tend to think one is warranted here. Terraform should do its best to improve the signal-to-noise ratio. If you want verbosity you should get it, and the reverse is true too.

BadgerCode commented 5 years ago

@thtran101 is your custom parser in any kind of sharable state?

thtran101 commented 5 years ago

@BadgerCode, yeah it could use a bit of refactoring in certain areas but it's mostly broken up into logical and customizable pieces. I can't post at the moment, but will do so by end of night Pacific time.

The code's written for Node.js 10.x.

thtran101 commented 5 years ago

Hi @BadgerCode, I've created the public repo for the parser. If you need anything, DM or post a question in the repo.

BadgerCode commented 5 years ago

Thanks a lot @thtran101 !

rismoney commented 5 years ago

I do appreciate the code, @thtran101, as a hackish workaround, but I think this issue should not get de-emphasized. The verbosity is the real problem and should be fixed in the core product. This should fall within the pragmatism that HashiCorp describes in its Tao. We'll see how it plays out.

BadgerCode commented 5 years ago

This should be addressed in Terraform. I'm looking for a temporary workaround as almost every single terraform plan has me and my team asking "So what's changed?".

pcfleischer commented 4 years ago

Just upgraded to 0.12.24 and I'm experiencing the same issue. It's unfortunate that this is considered a feature request, since it's a change in behavior :(

andreitchaltsev commented 4 years ago

Another issue with the verbosity of terraform plan: there is way too much text with "Refreshing state...".

For a typical run we have about 1000 lines with "Refreshing state..." (see example below), which is roughly 90% of all the output. When applying Terraform across multiple modules, paging through hundreds of pages of this text every single time is a considerable burden with no benefit.

It would be very helpful to be able to suppress those log lines. For example, these messages could be moved to the DEBUG verbosity level, or a special option could be introduced to enable or disable them. It's up to the team to decide the best approach.
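Until something like that exists, the refresh lines can be stripped from captured output after the fact. The Python sketch below is one hedged way to do it: the regular expression assumes each such line ends with ": Refreshing state...", optionally followed by an "[id=...]" suffix, and the helper name is invented for the example.

```python
import re

# Assumed shape of the noise lines: "<address>: Refreshing state..." with an
# optional trailing "[id=...]". Adjust the pattern if your output differs.
REFRESH = re.compile(r": Refreshing state\.\.\.(\s*\[id=.*\])?\s*$")


def drop_refresh_lines(lines):
    """Return the input lines without the per-resource refresh messages."""
    return [line for line in lines if not REFRESH.search(line)]


captured = [
    "Refreshing Terraform state in-memory prior to plan...",
    "module.role.module.read_role.data.terraform_remote_state.common_read_only: Refreshing state...",
    "module.role.module.read_role.data.terraform_remote_state.permission_boundary: Refreshing state...",
    "Plan: 0 to add, 1 to change, 0 to destroy.",
]

for line in drop_refresh_lines(captured):
    print(line)
```

The same function could be applied to lines read from standard input if the plan output is piped through a small script.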

Example output of terraform plan:

Terraform v0.12.26

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

module.role.module.read_write_role.data.terraform_remote_state.permission_boundary: Refreshing state...
module.role.module.read_role.data.terraform_remote_state.common_read_only: Refreshing state...
module.role.module.read_write_role.data.terraform_remote_state.common_read_only: Refreshing state...
module.role.module.read_role.data.terraform_remote_state.permission_boundary: Refreshing state...
<And so on, so on, so on...>
bondsbw commented 4 years ago

Version 0.13 adds all outputs to every plan, and the listing comes after the summary line (Plan: 0 to add, 1 to change, 0 to destroy.), making it hard to even tell whether there are any changes.

Please consider moving that line to the very end and adding a flag to remove the outputs listing.
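As a stopgap, captured plan text can be reordered so the summary is the last thing printed. The Python sketch below assumes the summary line has exactly the "Plan: N to add, N to change, N to destroy." shape quoted above; the helper name and the sample lines (including the outputs heading and the output name) are illustrative only.

```python
import re

# Assumed exact shape of the summary line, as quoted in this thread.
SUMMARY = re.compile(r"^Plan: \d+ to add, \d+ to change, \d+ to destroy\.$")


def summary_last(lines):
    """Return the lines with any plan summary line moved to the end."""
    body = [line for line in lines if not SUMMARY.match(line.strip())]
    summary = [line for line in lines if SUMMARY.match(line.strip())]
    return body + summary


# Illustrative sample; not copied from real Terraform output.
captured = [
    '  ~ resource "vsphere_virtual_machine" "wks" {',
    "    }",
    "Plan: 0 to add, 1 to change, 0 to destroy.",
    "Changes to Outputs:",
    '  ~ example_output = "old" -> "new"',
]

for line in summary_last(captured):
    print(line)
```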

alisdair commented 4 years ago

Thanks to everyone who contributed to this discussion, especially for the many clear use cases for a shorter diff!

Our approach to addressing this problem was recently merged in #26187, and shipped today in an alpha prerelease. If you're able to do so, I would encourage you to try out the new diff renderer in the alpha build, and leave feedback on our discussion forum thread for this topic.

In brief, the approach we hope to release as part of 0.14.0:

We hope that this changed approach to rendering plans will solve the core problems discussed here, so I'm going to close this issue for now. If you have feedback on the new approach, please post in the discussion thread. Looking forward to reading your thoughts!
