hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Storing sensitive values in state files #516

Open seanherron opened 10 years ago

seanherron commented 10 years ago

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I'd been sharing our state files publicly. I can't do that now, and I feel pretty nervous about the idea of storing state files in version control at all (and definitely can't put them on GitHub or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I'm going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.

cfbao commented 3 years ago

I'm new to Terraform and found this surprising. Is it possible for me to flag a property as write-once, so I can just remove it from all state files entirely after initial creation?

Yes, you can use the lifecycle.ignore_changes setting to specify which fields TF should ignore: https://www.terraform.io/docs/configuration/meta-arguments/lifecycle.html#ignore_changes -- Unfortunately, that doesn't solve the overall issue of having TF do something better with secrets than storing them in plain text.

Specifying lifecycle.ignore_changes cannot remove the value from the state file. The next time you run terraform apply for whatever other changes, the ignored attribute's value will still be read by Terraform and written into the state file. It's only ignored for the purpose of generating a plan, not for the purpose of refreshing the state file.
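
For concreteness, a minimal sketch of the behaviour under discussion, assuming a hypothetical RDS-style resource and a var.db_password declared elsewhere:

resource "aws_db_instance" "example" {
  identifier        = "example"
  engine            = "mysql"
  instance_class    = "db.t3.micro"
  allocated_storage = 10
  username          = "admin"
  password          = var.db_password # assumed declared elsewhere

  lifecycle {
    # Ignored when generating a plan, but NOT removed from the state
    # file on refresh, as explained above
    ignore_changes = [password]
  }
}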

ausmith commented 3 years ago

TIL about lifecycle.ignore_changes!! I'll edit my post accordingly.

Plasma commented 3 years ago

Thank you. I'm wondering if a product solution to this problem could be that a property is marked as "on create only, prompt for value, do not ever store in state".

In the case of a database resource being created, after a terraform apply the CLI would prompt me for this value in the console (so it's never provided in any file or environment or anything); it would be used to initially create the resource, but never written into a state file and never tracked for changes.

I'd rather not have the sensitive data in Terraform at all.

ajbouh commented 3 years ago

One option here is to use a cloud-native secret storage mechanism that's compatible with your database. For example, you can configure RDS to pull creds from SSM Parameter Store directly. You can also ensure that the Terraform IAM role does not have permission to decrypt those secrets.

This is still brittle and easy to get wrong but it might be better than nothing.


devurandom commented 3 years ago

In the case of a database resource being created, after a terraform apply the CLI would prompt me for this value in the console (so it's never provided in any file or environment or anything); it would be used to initially create the resource, but never written into a state file and never tracked for changes.

I'd rather not have the sensitive data in Terraform at all.

Interesting approach. I like the general direction, but have some questions regarding UX.

Suppose you create a database user using Terraform. This user receives an auto-generated password from the provider, which today would be stored in the Terraform state and be accessible to other parts of your Terraform code. How do you propose to get it out of Terraform and into another secret storage service? I would have used terraform output -json, but that pulls from the state, which would no longer be an option.
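
For illustration, that flow might look like the following hedged sketch (names hypothetical); note that both the generated value and the copy pushed to the secret store currently pass through Terraform state:

resource "random_password" "db_user" {
  length = 32
}

resource "aws_secretsmanager_secret" "db_user" {
  name = "db-user-password" # hypothetical secret name
}

resource "aws_secretsmanager_secret_version" "db_user" {
  secret_id     = aws_secretsmanager_secret.db_user.id
  secret_string = random_password.db_user.result # also persisted in state
}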

Plasma commented 3 years ago

I'm quite new to Terraform, but I want to use it to manage the architecture for a hobby or small site, for example, and maybe bring in another team member down the road.

To create an RDS instance, I need to supply a password in the HCL code for the password attribute. Bugger.

I wish it would just prompt me for a password, which I'll provide manually (e.g. one I generated using a password safe); then Terraform has the variable it requires to create the resource.

When the create completes, Terraform doesn't need that variable value, ever. I don't want or need it to monitor the password field in RDS for changes; it's outside its field of concern, in my view.

Any other terraform apply commands will work without needing to know the password ever again (unless a replace happens, at which point it should prompt for the password value again).

I feel like I'd always opt to never let Terraform persist or monitor the secret password field; it's not what I want Terraform for.

Perhaps a plan output does persist to disk temporarily? That may solve your issue.

I think the main thing is, let me flag a value as “create only, prompt when needed, never persist”.
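
The prompting half of this exists today: a variable with no default and no supplied value is prompted for interactively. A minimal sketch, assuming an RDS-style resource; the missing piece is keeping the value out of state afterwards:

variable "master_password" {
  description = "RDS master password; prompted for when not supplied"
  type        = string
  sensitive   = true
  # No default, so terraform apply prompts for it interactively
}

resource "aws_db_instance" "db" {
  identifier        = "example"
  engine            = "mysql"
  instance_class    = "db.t3.micro"
  allocated_storage = 10
  username          = "admin"
  password          = var.master_password # today this still ends up in state
}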

mr-miles commented 3 years ago

I looked at the code for reading and writing state files, and I think I found a point where it's straightforward to encrypt/decrypt attributes marked as sensitive as they are read from or written to state files. That means the behaviour with respect to comparing and updating resources can stay unchanged, so the change is much smaller.

This is the changeset: https://github.com/mr-miles/terraform/commit/3e466c168428afb633334680ce6cb40dbea63953

Obviously, at the moment it just changes the sensitive value to "REDACTED", but with (a lot of) polish a certificate could be fed through and the thumbprint written into the state. Then you'd be able to share the state file freely without exposing secrets.

I'm sure I'm missing something (maybe I've read the code wrong) so maybe someone with more knowledge of the codebase could chime in.

bruj0 commented 3 years ago

RE https://github.com/hashicorp/terraform/issues/516#issuecomment-106611574

I just want to point out that, according to official documentation, storing the state file in version control is a best practice: https://www.terraform.io/intro/getting-started/build.html

This no longer appears to be the case; the paragraph is missing from that link.

The recommendation is now:

https://www.terraform.io/docs/language/state/sensitive-data.html#recommendations

Storing state remotely can provide better security.

* Terraform Cloud always encrypts state at rest and protects it with TLS in transit. Terraform Cloud also knows the identity of the user requesting state and maintains a history of state changes. This can be used to control access and track activity. Terraform Enterprise also supports detailed audit logging.

* The S3 backend supports encryption at rest when the encrypt option is enabled. IAM policies and logging can be used to identify any invalid access. Requests for the state go over a TLS connection.
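
For reference, the S3 backend option described above is a one-line setting; a minimal sketch (bucket and key hypothetical):

terraform {
  backend "s3" {
    bucket  = "example-terraform-state" # hypothetical bucket
    key     = "prod/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true # server-side encryption of the state object at rest
  }
}
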
bruj0 commented 3 years ago

RE #516 (comment)

Please read a few more posts before replying to something from more than 5 years ago.

I have, and the last comment before mine was posted 7 days ago.

Tbohunek commented 3 years ago

Hi @jbardin I don't understand https://github.com/hashicorp/terraform/issues/28292#issuecomment-814154466.

Terraform cannot store only the hash of the value, because the original value is required for terraform's operation

Have you described anywhere what Terraform uses the original value for, beyond comparison? Thanks.

jbardin commented 3 years ago

Hi @Tbohunek,

If the attribute is referenced anywhere else in the configuration, terraform must preserve the original value in order to propagate it through the configuration. It must also be stored in order to satisfy the provider protocol, as providers require any stored value to be returned unchanged (see Terraform Resource Instance Change Lifecycle).

Tbohunek commented 3 years ago

Thanks @jbardin, but I still don't get it. Would you mind sparing a few more minutes to give a specific example of where a hash would not work?

In state there is no propagation; each attribute value is stored explicitly. In plan, the current value is retrieved from the runtime variable and propagates from there. Given that, Terraform should be able to compare just the hash of each propagated value with the hash stored in state. Why would this not work?

whiskeysierra commented 3 years ago

In plan, the current value is retrieved from the runtime variable and propagates from there.

What do you mean by runtime variable?

Tbohunek commented 3 years ago

@whiskeysierra I mean that when I run terraform plan, the desired value of the (sensitive) attribute is present in secrets.auto.tfvars or the like. So Terraform can propagate it downstream and, at each point of use, check its hash against the hash stored in .tfstate. If the provider is to return any value, it would be the new value, because the old value either has changed and is no longer relevant, or is unchanged and remains untouched in state.
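
As a config-level illustration of the proposal (not how Terraform works today), only a digest would need to persist in order to detect changes. A hedged sketch using Terraform's built-in sha256() and a null_resource as a stand-in store:

variable "admin_password" {
  type      = string
  sensitive = true # supplied at plan time, e.g. via secrets.auto.tfvars
}

resource "null_resource" "password_marker" {
  triggers = {
    # Only the digest is persisted here; a changed value changes the digest
    password_sha256 = sha256(var.admin_password)
  }
}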

jbardin commented 3 years ago

Would you mind sparing a few more minutes to give a specific example of where a hash would not work?

A primary example is the provider protocol, where a stored value is required to be returned per the contract of the API. There may be options to extend the protocol in the future for other types of values and resources, but we must uphold the agreed upon API with existing providers.

In state there is no propagation; each attribute value is stored explicitly. In plan, the current value is retrieved from the runtime variable and propagates from there.

Directly retrieving a variable at runtime to be used in an ephemeral manner during plan is only a subset of the problem. All configuration evaluation is based on the state. If a resource has stored a value which it has marked as sensitive, and that value is referenced by another resource, it must be retrieved from the state in order to evaluate the reference. After initial creation, if that value is required for subsequent plans of any resources, we again need the original value to send to the provider.

Tbohunek commented 3 years ago

If a resource has stored a value which it has marked as sensitive, and that value is referenced by another resource, it must be retrieved from the state in order to evaluate the reference.

Could this really not be done with the hash only? The evaluation should work the same, provided it knows it's going to get a hash. The current implementation is really unlucky.

mr-miles commented 3 years ago

It is indeed unlucky.

What I think does work straightforwardly is for sensitive values to be encrypted/decrypted as the state is written/read, using a key provided in an argument. Then the "real" value can be presented to plugins and the API maintained, but it is not stored in state in plain text.

There's a PoC changeset here to show it in action: https://github.com/mr-miles/terraform/commit/3e466c168428afb633334680ce6cb40dbea63953


spstarr commented 3 years ago

At the risk of asking something simple: why not something like the following?

data "aws_ssm_parameter" "key" {
  skip_store_tfstate = "true/false" <-- or something
  name = "/path/to/some/key"
  with_decryption = "true"
}

Then on every terraform apply/destroy it forces Terraform to query AWS again for the value. It's never stored in tfstate, and we don't have this issue of having to use Vault, which would defeat the purpose of using SSM in the first place.

Of course, the caveat is that if the SSM value has changed it may or may not break terraform destroy, but that's an acceptable risk to me; you could just fix the SSM key(s) in that case.

Tbohunek commented 3 years ago

@spstarr in order for Terraform to detect if the value has changed, it has to store some information about the value.

But it shouldn't store the value itself; it should be capable of storing just its hash.

spstarr commented 3 years ago

@Tbohunek but does it need to know if it's changed or not if we've explicitly told it to ignore state? A hash is fine.

Tbohunek commented 3 years ago

@spstarr it does, if it is to avoid trying to alter the actual resource. Imagine this value is the VM admin password. Do you want Terraform to reset the admin password on every run? No? Then its value/hash needs to be stored in state.

spstarr commented 3 years ago

@Tbohunek well, I don't want Terraform to reset or change it if I've told it this value must always be queried from AWS every single time, no state. I'm not sure why this concept isn't simple to implement: a do_not_store_state_or_check_state flag would just get the value as-is from AWS.

acdha commented 3 years ago

@spstarr in order for Terraform to detect if the value has changed, it has to store some information about the value.

But it shouldn't store the value itself; it should be capable of storing just its hash.

It doesn't even need to do that: SSM Parameters and Secrets both have version numbers, which increment every time the value is changed. So it should be possible to treat this a bit like the way people use null resources for sequencing: avoid decrypting the secret, or triggering an update to the target resource, unless the version number has changed since the last update. That would have the problem of not catching someone changing the target resource outside of Terraform without updating the corresponding Parameter/Secret, so there could be an argument for making that behavior conditional, but that seems like a situation to strongly discourage.
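
A hedged sketch of that version-keyed idea with today's primitives (parameter name hypothetical); the aws_ssm_parameter data source exposes a version attribute that increments on every change:

data "aws_ssm_parameter" "db_password" {
  name            = "/prod/db/password" # hypothetical parameter
  with_decryption = false # don't decrypt; only the metadata is needed here
}

resource "null_resource" "rotation_marker" {
  triggers = {
    # Anything keyed off this resource only changes when the version does
    ssm_version = data.aws_ssm_parameter.db_password.version
  }
}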

jbardin commented 3 years ago

@spstarr, @acdha, storing only a hash or changing the behavior of existing data resources are not viable options here, but we do have a proposal for an ephemeral resource type, which would work roughly the way you describe. Since there are difficulties in supplying credentials to providers beyond the storage aspect, more solutions for keeping credentials out of state entirely may come out of the related issue #29182.

frittentheke commented 3 years ago

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in https://github.com/hashicorp/terraform/issues/9556 or rather https://github.com/hashicorp/terraform/pull/28603

dylanturn commented 2 years ago

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in #9556 or rather #28603

And where/how do we store the state encryption keys?

leosco commented 2 years ago

Encryption of the whole state file prior to storing it remotely is being discussed and worked on in #9556 or rather #28603

And where/how do we store the state encryption keys?

Those would have to be managed outside source control, via Vault or another secrets-management product; ansible-vault has similar support for secrets and identity providers to encrypt sensitive data before checking it into source.

Stavfilipps commented 2 years ago

Not an expert, but would it be possible to encrypt those secrets on the client side with a key stored, for instance, in AWS Secrets Manager? Then decryption happens on terraform apply, after fetching the decryption key.

leosco commented 2 years ago

Yeah, that's exactly how ansible-vault works: you check the encrypted data into source, and when it needs to be decrypted in CI or local dev, certain principals (people or services that have the private key) are allowed to do so. However, you're left with a chicken-and-egg problem of "how do we secure and check in the secret needed to secure and check in everything else", which hints at the next approach:

Remove the need to keep track of secret data in source and instead reference secret resources (GCP or AWS Secrets Manager, for example) that have been provisioned in advance, either by separate IaC or by a one-time administrative setup that loads the required secrets into the given provider before the rest of the architecture consumes them.

Imo it’s a much better DevEx and separates secret management from the rest of the code.
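
For example, the "reference secrets provisioned in advance" approach might look like the sketch below (names hypothetical). The caveat running through this thread still applies: the value read by the data source ends up in state:

data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db/master" # provisioned out-of-band, by separate IaC
}

resource "aws_db_instance" "main" {
  identifier        = "main"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "admin"
  # No secret literal appears in the configuration, though the fetched
  # value is still written to the state file today
  password = data.aws_secretsmanager_secret_version.db.secret_string
}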


pj commented 2 years ago

I'm pretty ignorant about how data types work in terraform, but would it be possible to have a "ProviderManagedVariable" interface? This would be an interface for a type that stores sensitive state externally, but contains information about how to access that state via a provider, basically like a pointer or reference. Only the non-sensitive information like the provider name, provider version, key name, last changed date etc would actually be stored in the terraform state.

This would require users of the type to be rewritten to grab the external variable when necessary, but not store it in their state. Also, every user of the data type would have to configure the correct provider to work.

leosco commented 2 years ago

Yes; by default there are server-managed attributes and references for things like secrets and any resource defined outside the plan. You can also explicitly ignore/omit certain attributes using lifecycle.ignore_changes, which is how you might, for example, get around the common edge case of services whose environment variables and secrets are injected on deploy instead of on provision. Generally speaking, any attribute you don't want to accidentally overwrite, and which should be controlled server-side, can be listed in lifecycle.ignore_changes.



p4gs commented 1 year ago

It's 2023. Are there any plans at all to give Terraform the ability to not expose plaintext credentials in .tfstate?

matthewcummings commented 1 year ago

@jbardin can you explain why it's not possible to store the hash of a sensitive value (instead of the value itself) in the TF state?

jbardin commented 1 year ago

@matthewcummings, this thread is getting long and a lot of comments are hidden: https://github.com/hashicorp/terraform/issues/516#issuecomment-888429946

mr-miles commented 1 year ago

@jbardin @matthewcummings long indeed!

It is true that providers need access to the unhashed value to compare for the plan, but it is possible for encryption to be applied to individual values (the ones marked sensitive) on state read/write, which happens outside any particular provider.

https://github.com/hashicorp/terraform/issues/516#issuecomment-888476418

That comment links to an example branch showing how it would work. Disclaimer: the branch is quite old now, so it may need some polish, which can be done if it's of interest.

jorhett commented 1 year ago

I find it deeply disturbing that https://developer.hashicorp.com/terraform/plugin/sdkv2/best-practices/sensitive-state#don-t-encrypt-state is written with an explanation that makes it sound like this is a solved problem, and yet there is no list of backends which encrypt sensitive values, nor is there any way, from reading the providers, to determine this. "Oh, it's solved... you have to figure it out for yourself" is not an answer.

Furthermore, I have tested most of the remote state backends (s3, pg, gcp, etc.), and in every case the sensitive values are stored in cleartext on my node and trivially available for output during a plan/apply. In short, there is zero protection that I can find.

I do not believe that this problem is solved, in even the most basic way. I believe that the reasons for encrypting values in state are as relevant and essential now as they ever have been. I believe that the documentation above, which describes this as a solved problem, is woefully inaccurate and bad enough in its recommendations to users to be worthy of a security report.

bgshacklett commented 1 year ago

@jbardin, back on the subject of storing a hash:

A primary example is the provider protocol, where a stored value is required to be returned per the contract of the API. There may be options to extend the protocol in the future for other types of values and resources, but we must uphold the agreed upon API with existing providers.

While it would require an extension of the protocol, this does not seem to be a significant blocker; the protocol has been updated many times in the past, and I would hope it would continue to be updated as technology progresses. This could, initially, take the form of a feature flag which enables hashing of sensitive fields or data. If the provider opts in, it can expect that any returned sensitive value will be a hash of the original input.

If a resource has stored a value which it has marked as sensitive, and that value is referenced by another resource, it must be retrieved from the state in order to evaluate the reference.

Why? Is there business logic which requires this behavior, or is it an implementation detail, which would be in scope of any resolution to this issue? Terraform is responsible for resource state management. I.e.: it writes to the state file. With that in mind, it must have that value in memory at some point. Why can't Terraform use this in-memory value to evaluate the reference, rather than referring to the state file?

After initial creation, if that value is required for subsequent plans of any resources, we again need the original value to send to the provider.

This is only the case due to the existing API contract. That aside, I see no reason why Terraform must retrieve this value from the state file for a plan. It could be a required variable, which is then hashed and compared with the stored hash to determine whether the value has been updated or not.

whiskeysierra commented 1 year ago

If it's a required variable, you haven't solved anything: someone has to provide that variable, and its value needs to be kept somewhere.


bgshacklett commented 1 year ago

I'd say that's still a better scenario, because variables can be injected into the environment in many secure ways; but if we want to avoid passing any data into Terraform via the CLI, we can look at something like a cloud provider's secret-management system, or a dedicated tool like Vault.

There will always be a need to store some secret somewhere. The key is to ensure that the secret is sufficiently protected via access controls and encryption, and that any access controls are granular enough that only those who should be able to see the data they protect have access, without restricting the ability of engineers to perform their jobs.

Tbohunek commented 1 year ago

Secrets are supposed to be stored in vaults, whether it's an AWS/Azure/GCP vault or HashiCorp Vault.

But it is really bad if a data source for a secret also exports its value into tfstate. You generally need data to pass .id somewhere, rather than the plain .value.

Secret values must not leave vaults; otherwise, what's the point of vaults? This is crazy, and I don't understand how it's seen as such an unimportant issue?!

Tbohunek commented 1 year ago

It would be great to pull secret values at runtime and only when .value is referenced... but we would have a Pareto solution if https://github.com/hashicorp/terraform/pull/28603 got more attention. 👈

bgshacklett commented 1 year ago

RE: #28603, while I would be willing to accept an e2e-encryption-based solution, I can't help but think that it would be more effort to implement. Encryption schemes have a lot of potentially vulnerable pathways which need to be considered, and managing access controls brings significant additional complexity as well. It would also be yet another thing that TF users would need to learn and understand the limitations of.

Does Terraform really want to get in the middle of all of that when we already have systems in place to handle sensitive data (e.g.: CSP and HashiCorp Vaults)? Removing sensitive data from Terraform's purview, as much as possible, seems like a simpler and less fragile alternative to me.

omarismail commented 1 year ago

👋 Hey everyone, my name is Omar and I'm the PM for Terraform. We are prioritizing this issue and looking to address this.

I'd like to have a deeper and more nuanced conversation with folks here on this matter. If anyone is interested, please reach out via email and we'll work to schedule something: oismail@hashicorp.com.

Thank you!

Jasper-Ben commented 1 year ago

Secrets are supposed to be stored in vaults, whether it's an AWS/Azure/GCP vault or HashiCorp Vault.

But it is really bad if a data source for a secret also exports its value into tfstate. You generally need data to pass .id somewhere, rather than the plain .value.

Secret values must not leave vaults; otherwise, what's the point of vaults? This is crazy, and I don't understand how it's seen as such an unimportant issue?!

Completely agree with you. The state of secret handling in IaC environments is catastrophic. Sure, I can set up a HashiCorp Vault server, but what is the point when Vault secrets I create using the Terraform Vault provider are then stored in the Terraform state file? That defeats the whole point of having a vault in the first place.

davidcorrigan714 commented 1 year ago

Removing sensitive data from Terraform's purview, as much as possible, seems like a simpler and less fragile alternative to me.

That's the approach we're trying to move to. Using HCL for access policy and not secret management seems like a winning plan to me, and with a lot of platforms leaning into OIDC support this year there's real potential to remove a large percentage of secrets and service accounts from our workloads. There's still a long way to go, though; a lot of platforms only support OIDC one way, which is frustrating and a bit ironic. Terraform, for example, now has OIDC tokens available in plans and runs, which I immediately put to use on some AWS infrastructure I'm building right now, but there's no way to authenticate into Terraform with OIDC. Vault should work as a token exchange there and we may do that, but the Terraform secret engine looks a bit janky with its one-valid-token-at-a-time limit.

Ultimately, some use cases will likely be around for quite a while, so some proper support would be nice. I'm patching a provider right now that tried to do some hashing tricks and ended up with more bugs than practical value. In that case I think some of the bugs are fixable, but between their API having issues with trying to avoid updating sensitive data, the difficulty of designing tests around those processes, and the standard approach now being to just store secrets plainly in the state file, it seems better to keep it simple than to try to resolve the issues it has with the hashing.

omarismail commented 1 year ago

I'm patching a provider right now that tried to do some hashing tricks and ended up with more bugs than practical value. In that case I think some of the bugs are fixable, but between their API having issues with trying to avoid updating sensitive data

@davidcorrigan714 do you mind elaborating on this?

davidcorrigan714 commented 1 year ago

@omarismail Sure. I can go through our use case, which in its current state is a bit of a stepping stone to where we want to be later this year, but it should provide some context and a practical scenario.

We have a workspace that primarily uses two providers: the JFrog provider and the Azure DevOps provider. I set up sets of time_rotating objects to trigger API token rotations on the JFrog provider every week, driven by a cron job that runs the workspace, which then deploys those tokens to Azure DevOps. Immediately, with this strategy, there are a lot of secrets in that state file, between the JFrog provider creating the tokens and the Azure DevOps provider storing the attributes, sometimes as a hash. With any resource actually generating secrets, I don't know how you could ever really protect that output; maybe encryption in the resource provider itself, tied to some provided key? But since the tokens are provided as an output of another resource, fixing the storage of the token in the Azure DevOps provider doesn't resolve the full problem of having them in the state file. So while I hate to project our use case onto all users of that provider when suggesting changes, we certainly gain no benefit from fixing the fancy hash functions, and I couldn't find any other provider that's actually gotten this to work.

The Azure DevOps provider tries to store secrets as a hash. On sensitive fields it uses the DiffSuppressFunc schema option to call a function that checks whether the input matches the hashed data. There's also a helper function used later to blank out the password field that's saved in the state file and store a hash in an adjacent field. That all results in a "password" field that's blank and a "password_hash" field with the hash stored in the state file. That almost works; the most obvious bug is that they used bcrypt, which only covers the first 70-ish bytes of the password, so as we started using JWTs as passwords there were update failures, since the first 70 bytes are just the same header data from password to password. Switching to a different hash method resolves this, of course, but it's certainly a case where implementers should have some guidance or a standard to avoid this pitfall.

Another bug is that partial updates to the resource, like updating the name of a secret but not its value, corrupt the secret data in the target platform, and that seems to result from a few different paths. In that particular API, in theory it will accept a null field value to avoid updating the data, so the idea would be to just send null when not updating the password; otherwise the provider always needs access to the password, which seemed like it would have some interesting interactions with the DiffSuppressFunc. For example, if the password didn't change, I think the DiffSuppressFunc process would end up returning the blank data from the state file, so there's no way to use that in an update. There may be some new options with the newer SDK, but no one on that provider is tackling that migration for the foreseeable future. The Go API client library for Azure DevOps seems to have some errant data models as well; the types require strings in some fields and won't accept null as a value, so doing a partial update through the official API client would require a lot of workarounds or duplicate code. With all the potential interactions of the DiffSuppressFunc trick, tracking hash values, and API issues, I'm pursuing just ripping all that out to align with the current practice of storing the data in the state file.

There's also the issue that some of the code was reading sensitive data from the target platform and storing it back into the state file, which didn't work because, of course, the API just returns empty fields for sensitive values. That could mostly be fixed by never processing data from those fields when returned from the API. But given the structure of the code that's there, and the complexity of the various resource flows of creation, updating and importing, I just didn't trust that I could find a solution that works in all cases.

Testing is another huge challenge. Most of the bugs we hit revolved around data corruption in the target platform. Since sensitive data can't be read back out directly through the API, validating that data is set correctly requires significant additional configuration in the target platform. In this case the Azure DevOps Terraform provider just doesn't support enough of the extra configuration required, so it would be a major undertaking to add all that infrastructure to the code base. Of course, I wish that test were there regardless of whether the data is hashed or not, but again, keeping it simple and just fixing up the overall structure of the code has a much higher ROI, especially when we plan to switch all those static tokens to policy rules validated at runtime through OIDC and the workloads' identities, which will let us remove the secrets and tokens from Terraform entirely.

pauldraper commented 1 year ago

If you are willing to give up change detection for secrets, solving the problem of secrets in state is "easy," without needing any additional protocols or participation from plugins.

  1. Strip sensitive values from the state file before saving.
  2. Add lifecycle.ignore_changes for those attributes.

(Unfortunately, Terraform backends are not very customizable, so a workaround today is moderately difficult: it requires standing up an http backend server to strip the secrets.)

drdamour commented 1 year ago

There are so many providers that create secrets by default (e.g. Azure Cosmos generates connection keys) that the workaround is problematic... I know, because I do it.

I wish there was a "never store sensitive" flag.