Open orgads opened 1 year ago
Hi @orgads, Before discussing the how, would you mind expanding on why?
Specifically, I am curious why and how often you open *.tfstate files.
I can understand why as a user you may end up editing *.tf.json, although even that should not be common, as that's intended to be a machine-readable configuration.
Accessing a *.tfstate file can imply a few possibilities, most of which are not good:
terraform state subcommands, e.g. mv or rm)
terraform state subcommands, e.g. list or show)
The last possibility is just about the only good reason to be accessing the *.tfstate file directly but should be relatively rare, and even there I would still attempt to first use the state subcommands.
Finally, the *.tfstate JSON format is an internal format which is wholly managed by Terraform itself and not meant to be accessed by other programs, certainly not ones which expect any level of stability. For those use cases there is terraform show -json, which has a stable and documented machine-readable state representation.
All that said, it is entirely possible I missed a use case, so do let us know if that's the case. Either way - before we jump to solutions, I think we should outline the problem well, so we know we are actually solving it.
Hi.
I use azure backend for the state files, but I often pull them for inspection.
When someone calls me to ask "what's the connection string to the database", I open the tfstate file and search for "connection_string" to find it. This is easier for me than remembering the full object name.
This feature is nice-to-have from my point of view, I totally understand if you consider it not worth the effort.
> When someone calls me to ask "what's the connection string to the database", I open the tfstate file and search for "connection_string" to find it. This is easier for me than remembering the full object name.
Thanks for sharing the use case. That is helpful!
Is there a reason you prefer to pull the whole state file and read through the whole JSON, instead of calling terraform show to inspect the human-readable state, or terraform state show database_resource.foo to inspect the human-readable state of the particular resource you're interested in?
Especially when it comes to something as sensitive as a connection string, the command-based approach I outlined above reduces some risk. Assuming you store the state file remotely, it is only downloaded into memory and then printed to your terminal as part of running those commands, rather than stored as a whole on your disk until/unless you remember to remove it. Even if you enable encryption at rest for the remote state, that encryption becomes less relevant if you regularly download the whole state file and leave it on a disk which may or may not be encrypted. It is also basically the definition of secret sprawl.
> This feature is nice-to-have from my point of view, I totally understand if you consider it not worth the effort.
At this point I'm not trying to assess the amount of effort but the amount of value it would add and for whom - which is why I'm asking these questions. 😉
Just to take your use case more literally (and reflect on hard-to-remember resource names), you could run terraform show | grep connection_string (or terraform show -json | grep connection_string if it's sensitive), which should yield the same result as downloading the whole file to disk, opening it in an IDE, and risking the secret sprawl.
I'm also assuming here that the terminal scrollback/history eventually gets deleted automatically, unlike state files you download. So not only do you never store things you don't care about (the rest of the state file), but the one sensitive thing you do download gets deleted, so the exposure is minimal.
I see your point about secret sprawl, and I agree. I mostly use it for dev environments.
Drawbacks of terraform show:
jq . for reformatting, but even with that - I have no context. For example, grep connection_string gives me this:
"module.aks-deploy.kubernetes_secret.az_storage_connection_string",
"module.aks-deploy.kubernetes_secret.az_storage_connection_string",
"module.aks-deploy.kubernetes_secret.az_storage_connection_string",
"module.aks-deploy.kubernetes_secret.az_storage_connection_string",
"primary_blob_connection_string": "<real-connection-string> (of what?)",
"primary_connection_string": "<real-connection-string> (of what?)",
"secondary_blob_connection_string": "",
"secondary_connection_string": "<real-connection-string>",
"primary_blob_connection_string": true,
"primary_connection_string": true,
"secondary_blob_connection_string": true,
"secondary_connection_string": true,
Thanks again for further explaining the issues with the approach, I see how that can be annoying!
I would still hope that we (more collectively meaning HashiCorp/Terraform here, not just our team behind the extension) can come up with a solution which does not involve downloading the whole file to disk. Perhaps a hypothetical terraform state search, which brings up some surrounding context along with each match, would be a reasonable compromise? 🤔
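Until something like that exists, the "no context" problem with grep can be worked around by walking the documented terraform show -json output and printing the resource address next to each match. The following is only a rough sketch: the function names are made up for illustration, the sample resource (azurerm_storage_account.example) is hypothetical, and only top-level attributes under each resource's "values" are checked.

```python
import json
from typing import Any, Iterator


def iter_resources(module: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield every resource in a module and, recursively, in its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def search_state(show_json: str, needle: str) -> list[tuple[str, str, Any]]:
    """Return (resource address, attribute name, value) for each attribute
    whose name contains `needle`. Only top-level attributes are inspected."""
    doc = json.loads(show_json)
    root = doc.get("values", {}).get("root_module", {})
    hits = []
    for resource in iter_resources(root):
        for key, value in (resource.get("values") or {}).items():
            if needle in key:
                hits.append((resource["address"], key, value))
    return hits


if __name__ == "__main__":
    # Hand-written sample in the shape of `terraform show -json` output;
    # the resource below is hypothetical, not taken from the real state.
    sample = json.dumps({
        "format_version": "1.0",
        "values": {
            "root_module": {
                "child_modules": [{
                    "address": "module.aks-deploy",
                    "resources": [{
                        "address": "module.aks-deploy.azurerm_storage_account.example",
                        "values": {
                            "name": "examplestorage",
                            "primary_connection_string": "<redacted>",
                        },
                    }],
                }],
            },
        },
    })
    for address, key, value in search_state(sample, "connection_string"):
        print(f"{address}: {key} = {value!r}")
```

In real use the JSON would come from piping terraform show -json into the script instead of the inline sample, so nothing needs to be written to disk.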
On the slowness/performance note, I'm assuming the majority of time is spent downloading the file, which seems inevitable with the current model of the state being represented as a single file.
Either way, I'll leave this issue open and see if there's more interest (expressed via upvotes) or more ideas.
I'll also pass the feedback from your last comment to the product team as food for thought.
Thank you very much!
I ran a test with TF_LOG=trace, and actually downloading the file takes less than a second.
Here is the redacted trace: https://gist.github.com/orgads/bbafbc2c637e17582ea6dc1b4ef88f38
Notable timings:
| Line | Duration |
|---|---|
| Using Obtaining a token from the Azure CLI for Authentication | 3 sec |
| providercache.fillMetaCache | 1.5 sec |
| Azure Backend Request (listKeys) | 1 sec |
| Azure Backend Request (file content) | 1 sec |
| providercache.fillMetaCache (again?) | 1.5 sec |
| starting plugin: path=.terraform/providers/registry.terraform.io/oboukili/argocd | 2 sec |
Why are all the providers initialized?
After sharing this with the wider product team, another relevant suggestion also emerged, which is that if you expect those connection strings to be consumable/important, then these should be declared as outputs.
For example, you can declare:
output "az_storage_connection_string" {
  value = module.aks-deploy.kubernetes_secret.az_storage_connection_string
}
You can declare as many outputs as you want, and the problem of "not knowing the exact name" is also solved by terraform output, which lists all outputs.
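If the list of outputs grows long, the same "search by partial name" habit can be kept by filtering the output of terraform output -json. This is a minimal sketch, assuming the usual shape of that JSON (each output name mapped to an object with "sensitive", "type" and "value"); the function name find_outputs and the reveal flag are invented for this example.

```python
import json
from typing import Any


def find_outputs(output_json: str, needle: str, reveal: bool = False) -> dict[str, Any]:
    """Return outputs whose name contains `needle`.

    Values of outputs flagged as sensitive are masked unless `reveal` is True.
    """
    outputs = json.loads(output_json)
    found = {}
    for name, entry in outputs.items():
        if needle not in name:
            continue
        if entry.get("sensitive") and not reveal:
            found[name] = "<sensitive>"
        else:
            found[name] = entry.get("value")
    return found


if __name__ == "__main__":
    # Hand-written sample in the shape of `terraform output -json`.
    sample = json.dumps({
        "az_storage_connection_string": {
            "sensitive": True, "type": "string", "value": "<redacted>",
        },
        "cluster_name": {"sensitive": False, "type": "string", "value": "dev"},
    })
    print(find_outputs(sample, "connection_string"))
```

Masking by default is a deliberate choice here, so that a casual search does not put secrets into the terminal scrollback unless explicitly requested.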
Yet another benefit of this approach is that it allows you to employ a more fine-grained RBAC model with remote state (or cloud block). It may also solve the slowness issue: Terraform Cloud can retain the whole (potentially big) state file, which you should never need to download, and you only end up accessing the outputs, as they're stored separately from the state file. Any secrets which you don't explicitly declare as outputs may remain secret and protected by RBAC.
State access:
No access: No access is granted to the state file from the workspace.
Read outputs only: Allows users to access values in the workspace's most recent Terraform state that have been explicitly marked as public outputs. Output values are often used as an interface between separate workspaces that manage loosely coupled collections of infrastructure, so their contents can be relevant to people who have no direct responsibility for the managed infrastructure but still indirectly use some of its functions. This permission is required to access the State Version Outputs API endpoint.
See https://developer.hashicorp.com/terraform/cloud-docs/users-teams-organizations/permissions for more details.
Thank you very much.
I'll add the frequently accessed values as outputs.
I wasn't aware of terraform output (we do use outputs, but I never used this command). It looks useful, and takes 7 seconds, which is more or less the time it takes me to open the file and look up the value :)
Extension Version
v2.28.2
Problem Statement
A tfstate file is opened as JSON, which is correct.
But when I navigate through it, the outline (the line above the editor) shows a generic JSON location. For example,
[] resources > {} 138 > [] instances > {} 0 > {} attributes
Expected User Experience
It would be nice if the outline could show the Terraform notation instead, or in addition (copyable if possible). In this case,
module.my_mod.random_password.example[0]
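To illustrate the mapping being asked for, here is a rough sketch of turning a JSON location (resource index, instance index) into a Terraform address. It assumes the field names seen in current state files ("module", "mode", "type", "name", "instances", "index_key"); the state format is internal and may change, and address_for is a name invented for this example, not something the extension provides.

```python
from typing import Any


def address_for(state: dict[str, Any], resource_idx: int, instance_idx: int) -> str:
    """Build a Terraform-style address for state["resources"][r]["instances"][i]."""
    resource = state["resources"][resource_idx]
    parts = []
    if resource.get("module"):
        # Assumed to hold the full module path already, e.g. "module.my_mod".
        parts.append(resource["module"])
    if resource.get("mode") == "data":
        parts.append("data")
    parts += [resource["type"], resource["name"]]
    address = ".".join(parts)

    # count-based instances carry an integer key, for_each-based ones a string key;
    # single instances have no key at all.
    key = resource["instances"][instance_idx].get("index_key")
    if isinstance(key, int):
        address += f"[{key}]"
    elif isinstance(key, str):
        address += f'["{key}"]'
    return address


if __name__ == "__main__":
    # Minimal hand-written state fragment, just enough for the example above.
    state = {
        "resources": [{
            "module": "module.my_mod",
            "mode": "managed",
            "type": "random_password",
            "name": "example",
            "instances": [{"index_key": 0, "attributes": {}}],
        }],
    }
    print(address_for(state, 0, 0))  # module.my_mod.random_password.example[0]
```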
Proposal
Maybe something like https://github.com/ChaunceyKiwi/json-tree-view can be done to address this.
References
No response