hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Storing sensitive values in state files #516

Open seanherron opened 9 years ago

seanherron commented 9 years ago

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I've been sharing our state files publicly. I can't do that now, and feel pretty nervous about the idea of storing state files in version control at all (and definitely can't put them on GitHub or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I'm going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.

bitglue commented 9 years ago

See #874. I changed the RDS provider to store an SHA1 hash of the password.

That said, I'm not sure I'd agree that it's Terraform's responsibility to protect data in the state file. Things other than passwords can be sensitive: for example if I had a security group restricting SSH access to a particular set of hosts, I wouldn't want the world to know which IP they need to spoof to gain access. The state file can be protected orthogonally: you can not put it on github, you can put it in a private repo, you can use git-crypt, etc.

kubek2k commented 9 years ago

related #689

dentarg commented 9 years ago

Just want to give my opinion on this topic.

I do think Terraform should address this issue. I think it will increase the usefulness and ease of use of Terraform.

Some examples from other projects: Ansible has vaults, and on Travis CI you can encrypt information in the .travis.yml file.

ketzacoatl commented 9 years ago

Ansible vaults is a feature I often want in other devops tools. Protecting these details is not as easy as protecting the state file.. what about using consul or Atlas as a remote/backend store?

+1 on this

dayer4b commented 9 years ago

I just want to point out that, according to official documentation, storing the state file in version control is a best practice:

https://www.terraform.io/intro/getting-started/build.html

Terraform also put some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. We recommend simply putting it into version control, since it generally isn't too large.

(emphasis added)

Which means we really shouldn't have to worry about secrets popping up in there...

hobbeswalsh commented 9 years ago

:+1: on this idea -- it would be enough for our case to allow configuration of server-side encryption for S3 buckets. Any thoughts on implementing that?

apparentlymart commented 8 years ago

At the risk of adding scope to this discussion, I think another way to think of this is that Terraform's current architecture is based on a faulty assumption: Terraform assumes that all provider configuration is sensitive and that all resource configuration isn't sensitive. That is wrong in both directions:

So all of this is to say that I think overloading the provider/resource separation as a secret/non-secret separation is not the best design. Instead, it'd be nice to have a mechanism on both sides to distinguish between things that should live in the state and things that should not, so that e.g. generated secrets can be passed into provisioners but not retained in the state, and that the state can encode that a particular instance belongs to a particular AWS region and respond in a better way when the region changes.

There are of course a number of tricky cases in making this situation, which I'd love to explore some more. Here are some to start:

little-arhat commented 8 years ago

Hi, any progress on that? Terraform 0.6.3 still stores raw passwords in the state file. Also, as a related issue, if you do not want to keep passwords in configuration, you can create a variable without a default value. But this will force you to pass the variable every time you run plan/apply, even if you're not going to change the resource that has this password.

I think, it would be nice to separate sensitive stuff from other attributes, so it will:

So, for configuration like:

variable db {
    password {}
}

resource ... {
    password = "${var.db.password}"
}

Terraform will require the variable on the first run, when it doesn't have anything yet, but will not require it on subsequent runs.

To change such a value, one needs to provide a different value for the password.
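As a point of comparison for the proposal above, the existing behaviour described at the start of this comment can be sketched in the configuration syntax of that era. This is only an illustration: the variable name db_password, the resource name, and every argument value are assumptions that do not come from the thread.

```hcl
# Hypothetical example: all names and values are placeholders.

# No default is given, so Terraform needs a value on every plan/apply.
variable "db_password" {}

resource "aws_db_instance" "example" {
  allocated_storage = 10
  engine            = "postgres"
  instance_class    = "db.t2.micro"
  username          = "dbadmin"

  # The interpolated value ends up in terraform.tfstate as a raw string.
  password = "${var.db_password}"
}
```

The value would typically be supplied with -var 'db_password=...' on the command line or through a TF_VAR_db_password environment variable. That keeps it out of the .tf files, but not out of the state.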

EvanKrall commented 8 years ago

Maybe there's a simple solution: store the state in Vault?

mwarkentin commented 8 years ago

A good solution for this would be useful for us as well - we're manually configuring certain things to keep them out of the tfstate file in the meantime.

ascendantlogic commented 8 years ago

So as I slowly cobble together another clean-sheet infra with Terraform I see this problem still exists, and this issue is almost exactly 1 year old. What is the thinking in regards to solving this? The ability to mark specific attributes within a resource as sensitive and storing SHA1 or SHA2 hashes of their values in the state for comparison? I see this comment on a related ticket, does that mean that using Vault will be the prescribed way? I get that it promotes product synergy but I'd really like a quick-n-dirty hashing solution as a fallback option if I'm honest.

ketzacoatl commented 8 years ago

Moving secrets to Vault, and using consul-template or integration with other custom solutions you have for CM, certainly helps for a lot of cases, but completely avoiding secrets in TF, or keeping them from ending up in TF state, is not always reasonable.

ascendantlogic commented 8 years ago

Sure, in this particular case I don't want to manually manage RDS but I don't want the PW in the state in cleartext, regardless of where I keep it. I'm sure this is a somewhat common issue. Maybe an overarching ability to mark arbitrary attributes as sensitive is shooting for the moon but a good start would be anything that is obviously sensitive, such as passwords.

jfuechsl commented 8 years ago

Would it be feasible to open up state handling to plugins? The standard could be to store it in files, like it is currently done. Other options could be Vault, S3, Atlas, etc.

That way this issue can be dealt with appropriately based on the use-case.

brikis98 commented 8 years ago

I just got tripped up by this as well, as the docs explicitly tell you to store .tfstate files in version control, which is problematic if passwords and other secrets end up in the .tfstate files. At the bare minimum, the docs should be updated with a massive warning about this. Beyond that, there seem to be a few options:

  1. Offer some way to mark variables as secret and either ensure they never get stored in .tfstate files or store them in a hashed form.
  2. Encrypt the entire .tfstate file.
  3. Remove the recommendation to store .tfstate files in version control and only recommend them to be stored in secure, preferably encrypted storage.

ejoubaud commented 8 years ago

One thing to consider around this is output. When you create a resource with secrets (key pair, access keys, db password, etc.), you likely want to show the secret in question at least once (possibly in the stdout of the first run, as outputs do).

Currently outputs are also stored in plain text in the .tfstate, and can be retrieved later with terraform output.

One possible solution would be a mechanism to only show the secrets once, then not store them at all and not show them again (like AWS does), possibly using only-once output as I suggested in #4437
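To make the concern about outputs concrete, here is a small sketch of a generated secret exposed as an output. It assumes the tls provider's tls_private_key resource (TLS resources come up later in this thread), and the resource and output names are invented for the example.

```hcl
# Hypothetical example: resource and output names are placeholders.
resource "tls_private_key" "example" {
  algorithm = "RSA"
}

# The output value is kept in terraform.tfstate, so it can be read back
# at any later time with "terraform output example_private_key".
output "example_private_key" {
  value = "${tls_private_key.example.private_key_pem}"
}
```

Under a show-once scheme like the one suggested here, the key would be printed on the first apply and then dropped rather than retained in the state.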

revett commented 8 years ago

+1

sstarcher commented 8 years ago

+1

ascendantlogic commented 8 years ago

To get around this for now in my production RDS I just created the instance with a password of changeme1234 and then went to the console and manually changed the PW.

gtmtech commented 8 years ago

+1

I notice also that Redshift is imminently going to be supported on terraform, and the same mistakes are being made all over again:

master_password - (Required) Password for the master DB user. Note that this may show up in logs, and it will be stored in the state file

"Applications should not transmit or store passwords in unencrypted form"

Page 77 - ISO27001 :

https://books.google.co.uk/books?id=Ur1lviHCd-4C&pg=PA77&dq=no+password+unencrypted+disk+iso27001&hl=en&sa=X&ved=0ahUKEwibuMrywb3KAhUCVxQKHR7yCNwQ6AEIPTAB#v=onepage&q=no%20password%20unencrypted%20disk%20iso27001&f=false

Can this please not be done - it instantly means for lots of us (for compliance reasons) we can't use it - the whole "no password should be written down on any disk unencrypted" thing is a killer.

This has been in progress for over a year - is there any attempt to solve this? I would have thought the simplest approach would be to hash the password, store the hash.

johnrengelman commented 8 years ago

@gtmtech a hash isn't cryptographically safe either because it can be reversed. The right solution here is something that can store the value securely, doing anything else IMO would be a waste of energy.

ketzacoatl commented 8 years ago

@gtmtech, if this is a blocker, can you put in a goof password on first run, and then manually update it, as @ascendantlogic notes above? While not "clean", and while it "gives you something to do", that seems like a reasonable middle ground, no?

ascendantlogic commented 8 years ago

@johnrengelman forgive me if I am misunderstanding, but I thought hashes were, by definition, one way. Or at least any reasonable use of one to add some level of protection to secrets necessitates the use of a one-way?

gtmtech commented 8 years ago

So problems with the comments above are:

Maybe there's a way of not storing the password at all, and not caring about it? When I worked on configuration management tools before, I took this approach sometimes - it was never stored in state, and the value was always set to be ignored, so the diff algorithm just completely ignored it, but it was used on first create only. This isn't ideal either as you can't manage passwords with TF, but it's better than unencrypted, readable master database passwords everywhere.

It's a real shame to not be able to use Terraform because of something as dumb as plaintext readable master passwords, it's such a great tool otherwise!

johnrengelman commented 8 years ago

@ascendantlogic ah yeah, sorry, don't know where my brain was this morning. I'm blaming the fact that I hadn't had coffee yet.

gtmtech commented 8 years ago

I would recommend bcrypt or even scrypt for the password hashing.

In the meantime, a goof password with ignore_changes=["master_password"] will suffice so long as I can get terraform to NOT store it in the state file.

A meta_parameter to accomplish not storing a particular attribute would be an alternative to having to do the hashing work - either one could accomplish the end goal.

sstarcher commented 8 years ago

@gtmtech A goof password should not cause terraform states to complain every time. As long as the password in the tf files and the password in the state file are the same terraform should be happy.

I created an RDS database using a password in the TF files and after creation went into the TF file and changed it to XXXXX. I also went into the state file and changed it to XXXXX. terraform plan and terraform apply are happy.

gtmtech commented 8 years ago

@sstarcher interesting, what happens when you terraform refresh, does it not override your XXXXXing out of the password?

sstarcher commented 8 years ago

@gtmtech nothing happens it continues to work happily. The state file still contains XXXXX as terraform has no way of knowing what the actual database password is.

I just tested to confirm the following results in no change and my password in the tf state is still XXXXX which is not my actual password

ascendantlogic commented 8 years ago

@gtmtech that is not what happens. I modified the password after the fact in the console and TF is perfectly happy. I think it only wants to change the PW on the RDS resource if the value in the tf file doesn't match the tfstate file. The AWS API does not return the RDS password for TF to compare against the tfstate; that would be madness.

johnrengelman commented 8 years ago

@ascendantlogic @gtmtech The RDS resource in Terraform doesn't populate the password field during a read/refresh - https://github.com/hashicorp/terraform/blob/master/builtin/providers/aws/resource_aws_db_instance.go#L531

So changes in the AWS console won't be exposed down into Terraform.

ascendantlogic commented 8 years ago

@johnrengelman right, RDS revealing any password information from their API would be madness.

ketzacoatl commented 8 years ago

Given these details, shouldn't we simply omit the password from the statefile completely? I see no point in storing it... do you? maybe there is a reason, maybe we can make that optional, but if it is only used once with the API at creation time...

gtmtech commented 8 years ago

Yeah exactly - if you can't ever compare it with what's in AWS, then it makes no sense whatsoever to store it in the statefile at all, and then all the madness goes away.

Thanks for your help in the meantime, I have a workaround which is, as you say, to use an initial stupid password, put ignore_changes on the master_password attribute, and then once it's created, change it through the console.

This could go in the documentation perhaps
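A minimal sketch of that workaround, written for the RDS resource discussed earlier, where the attribute is called password rather than Redshift's master_password. The resource name and all argument values are assumptions made for illustration, and the placeholder password is the one mentioned further up the thread.

```hcl
# Hypothetical example: resource name and values are placeholders.
resource "aws_db_instance" "example" {
  allocated_storage = 10
  engine            = "postgres"
  instance_class    = "db.t2.micro"
  username          = "dbadmin"

  # Throwaway value. This placeholder is what lands in terraform.tfstate;
  # the real password is set afterwards, outside Terraform.
  password = "changeme1234"

  lifecycle {
    # Tells Terraform to disregard differences in this attribute when
    # planning. Quoted here to match the syntax used in this thread;
    # newer Terraform releases write the attribute name unquoted.
    ignore_changes = ["password"]
  }
}
```

Since the provider does not read the password back from AWS, changing it in the console produces no diff. The trade-off is the one noted above: the password is no longer managed by Terraform.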

apparentlymart commented 8 years ago

The reason to include it in the state file is to allow it to be interpolated into other values. For example, to create an RDS instance and then use the Postgres or MySQL provider to set up users and databases.

Currently Terraform depends on the state file for interpolations like this because the database resource (for example) might not be created in the same run as the RDS instance.

Of course there are other ways to write the config such that you wouldn't depend on the state file to get the password, so it could well be interesting to have Terraform recognize such situations and only store the password when it is necessary. The UX for that would need to be figured out: it wouldn't do to silently add a password to the state file as a side effect of other config changes.
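The kind of interpolation described in this comment might look like the following sketch, in which the PostgreSQL provider is configured from the attributes of an RDS instance. The variable, resource, and database names are assumptions made for illustration, as are the argument values.

```hcl
# Hypothetical example: all names and values are placeholders.
variable "db_password" {}

resource "aws_db_instance" "example" {
  allocated_storage = 10
  engine            = "postgres"
  instance_class    = "db.t2.micro"
  username          = "dbadmin"
  password          = "${var.db_password}"
}

# These references are resolved from the state file when the instance
# was created in an earlier run, so the password has to be stored there.
provider "postgresql" {
  host     = "${aws_db_instance.example.address}"
  port     = "${aws_db_instance.example.port}"
  username = "${aws_db_instance.example.username}"
  password = "${aws_db_instance.example.password}"
}

resource "postgresql_database" "app" {
  name = "app"
}
```

Pointing the provider's password at var.db_password directly is one of the other ways to write the config mentioned above, since it no longer depends on the state to recover the value.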

jonapich commented 8 years ago

Requiring a custom encryption key/password would be useful. For instance, a resource that contains only sensitive data requires the user to configure a key somewhere, just like his AWS credentials, so that Terraform can read and apply them.

Another idea would be to introduce a concept of 'volatile' variables that are used at runtime, but never make it to the state file. It's the responsibility of the caller to generate the proper variables before calling terraform. In my scenario, a custom python script fetches configuration data from DynamoDB and feeds it to a var file. And since most of the time you cannot "get" the password to an existing resource, you cannot compare it and fix it anyway. I think for these cases, it would be OK for terraform to ignore modifications to these values.

(there were a lot of comments above, sorry if I'm repeating stuff, I didn't read it all)

mattupstate commented 8 years ago

FWIW, I've devised a way to set the master password without keeping it in plain text. It's probably just considered clever, and not a really great, sustainable solution. It uses a combination of Ansible, a simple Python script, and a null_resource. https://gist.github.com/mattupstate/27f2bf26d3712b6b7973

bsiegel commented 8 years ago

I also want to vote for this issue. I'm using terraform to set up a VPN connection within an AWS VPC, and the aws_vpn_connection resource exports things like tunnelN_preshared_key and customer_gateway_configuration that I'd rather not end up in the .tfstate file. Unfortunately, this data gets refreshed from AWS during a refresh/plan/apply, so I can't work around the problem like I have with database passwords (as suggested above, I used a fake password in Terraform and then changed it later manually). I tried adding those fields to ignore_changes for the aws_vpn_connection resource, but that didn't prevent them from being fetched from AWS and added to the .tfstate file.

squaresurf commented 8 years ago

For what it's worth, I've been having a good experience using AGWA/git-crypt.

sheerun commented 8 years ago

I'd vote for considering the whole tfstate a secret file and storing everything in it, including initial variables. For now, resources like TLS store secret values such as private keys in the state anyway.

It would solve two issues as well: #5425 #5424

Everything about deployed architecture should be a secret, so there is no point considering tfstate public.

squaresurf commented 8 years ago

I think the main issue is that the terraform docs say that the tfstate file should be committed to your repo without mentioning anything about its sensitive nature. I also think that it should be considered a secret file going forward.

When I get some time, I'm hoping to work on a remote config backend for vault, which in my opinion would be a perfect place to store the tfstate files.

McCodeman commented 8 years ago

+1

nickithewatt commented 8 years ago

For those who may be interested, I have written a Go CLI, terrahelp, which aids with the encryption and decryption of terraform .tfstate files, including using Vault's transit aka "encryption as a service" functionality. Code can be found here

gtmtech commented 8 years ago

FWIW, here's what we do now, taking an RDS instance as an example:

resource "aws_db_instance" "rds_example" {

    ...
    rds details etc ...
    ...
    password = "temporarypasswordOverriddenBelow"
    ...

    # Replace the temporary password with a random one right after creation.
    # The new value only exists inside this shell command.
    provisioner "local-exec" {
        command = "bash -c 'DBPASS=$$(openssl rand -base64 16) ; aws rds modify-db-instance --db-instance-identifier ${self.id} --master-user-password $${DBPASS} --apply-immediately'"
    }
}

This will set an (unstored) new password as soon as the DB is set up. The command above can be extended to upload the new password straight to KMS or wherever you securely want to store it too.

ProTip commented 8 years ago

I'm concerned about aws_iam_server_certificate resources as well, along with any other resource that might save secrets to the state file.

acbox commented 7 years ago

+1

bernielomax commented 7 years ago

Is there any update on this issue? I would really like to see this as a built in feature. Being able to handle sensitive info these days is just a given in any automation tool.

bjtucker commented 7 years ago

...crickets...

wow.

sheerun commented 7 years ago

Is it possible to write remote state provider as a plugin? I could imagine one that writes encrypted tfstate instead.

apparentlymart commented 7 years ago

Remote state providers are not currently plugin-able, but they are modular and relatively easy to write.

In this case I'm not sure if an entirely new remote state provider is warranted, but rather a new idea of encrypting the state before writing it into whatever backend it's configured to write to. There are several different possibilities for doing the encryption part too, suggesting that this be a new setting on remote state that's orthogonal to storage:

The earlier suggestion of storing the remote state directly in Vault (presumably in the generic secret backend) was a good one too, and that would be a separate implementation, though that would only encrypt the state when writing to remote; the local file cached on local disk could not itself be encrypted under that approach.