hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Option to store the state minified #31337

Open moserke opened 2 years ago

moserke commented 2 years ago

Currently the Terraform state is only ever stored as pretty-printed JSON. Since interaction with the state is always through Terraform itself, it would be nice to have the option to store state as minified JSON. This would not only reduce the size of large state files; there is also a secondary reason for asking. We've been working on a POC that exposes our state files in S3 through Amazon Athena, giving us a rich query language across our state files for things like finding where a resource was defined.

The Athena-to-S3 integration requires each JSON document to sit on its own line in a file; a document cannot span multiple lines. We got the POC working by minifying a bunch of state files ourselves out of band, but this is not a long-term solution.

It would be nice if we could tell the Terraform backend or state to write it minified.
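For anyone trying the same out-of-band workaround, here is a minimal sketch of the minification step, assuming the state file is plain JSON (the format is undocumented and may change). The `minify_state` helper and the trimmed sample document are hypothetical and only for illustration; re-serializing through a JSON parser may also normalize number or escape formatting.

```python
import json


def minify_state(text: str) -> str:
    """Re-serialize a JSON document on a single line with no extra whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"))


# Hypothetical, heavily trimmed state document used only for illustration.
pretty = json.dumps(
    {"version": 4, "serial": 1, "resources": [{"type": "aws_s3_bucket", "name": "logs"}]},
    indent=2,
)
compact = minify_state(pretty)

# Compare the sizes of the pretty-printed and minified forms.
print(len(pretty.encode("utf-8")), "->", len(compact.encode("utf-8")), "bytes")
```

The result is a single line with the same parsed content, which is the shape that line-oriented JSON readers expect.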

crw commented 2 years ago

Hi @moserke, thanks for this enhancement request! It is important to note that the state file as written by Terraform is undocumented, and the dev team reserves the right to change the format arbitrarily in future releases (preserving the ability to parse the format that older versions generated, of course).

The reliable way to accomplish what you describe would be to use `terraform show -json` to generate an output file, upload that file to S3, and read it through Athena. I recognize that this is a lot of work when you have a "work-free" option sitting in S3 today, but I just wanted to call this out as the "technically correct" path.
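A rough sketch of that path, assuming `terraform` is on the PATH and the working directory is already initialized. The `export_state` helper and its `run` parameter are hypothetical names (the latter exists only so the command runner can be swapped out), and the S3 upload itself is left out.

```python
import json
import subprocess
from pathlib import Path


def export_state(workdir: str, out_path: str, run=subprocess.run) -> int:
    """Write `terraform show -json` output as one JSON line; return bytes written."""
    result = run(
        ["terraform", "show", "-json"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=True,  # raise if terraform exits with a non-zero status
    )
    # Re-serialize so the file is guaranteed to hold exactly one line.
    line = json.dumps(json.loads(result.stdout), separators=(",", ":")) + "\n"
    Path(out_path).write_text(line, encoding="utf-8")
    return len(line.encode("utf-8"))
```

Something like `export_state("./envs/prod", "prod.json")` (paths are placeholders) would then produce a file that can be uploaded by whatever tooling already talks to S3.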

Thanks again for the request!

moserke commented 2 years ago

Thanks @crw. Definitely understood that consuming these falls into the “could change without warning” camp. But when you are talking about Terraform states that number in the 100s, applied in various ways (manually and through multiple styles of pipelines), trying to consistently inject what is essentially an out-of-band, post-apply command to guarantee the upload has just as much potential to break.

I think fundamentally the ask is to be able to store state minified, which has the direct benefit of saving money and, to a smaller extent, an environmental benefit, as it takes less storage and therefore less energy to store.

From there I definitely support you pushing back and telling me my mileage may vary when consuming them, but I think the request itself has merit, especially when you multiply the storage across your thousands of customers. And if customers should never consume the state files directly, then maybe minified is even the correct default, for the reasons stated.

Just my musings. Regardless, I appreciate the time spent looking into it.