Open caquino opened 6 years ago
Hi Cassiano, right now there's no built-in way to Atlantis to overcome that issue however there are some mechanisms within Terraform itself and some options I'd like to evaluate for atlantis:
As per https://github.com/hashicorp/terraform/issues/16554#issuecomment-341849007, output secrets can be hidden by setting sensitive = true
:
output "out" {
value = "$secret"
sensitive = true
}
Resulting in an apply
output of:
Outputs:
out = <sensitive>
Within resources, it's provider dependent to annotate arguments as secret. For example, the aws_db_instance
resource marks its password
argument as secret, so:
resource "aws_db_instance" "default" {
# ...
username = "foo"
password = "foobarbaz"
}
Results in:
Terraform will perform the following actions:
+ aws_db_instance.default
...
password: <sensitive>
username: "foo"
Plan: 1 to add, 0 to change, 0 to destroy.
Just for my knowledge is there a particular resource you're managing that you're worried about?
I'm going to try out https://github.com/opencredo/terrahelp which has a mask
function and I'll get back to you if that works and if so how to use it within Atlantis.
@lkysow Thanks for such fast and comprehensive answer!
First of all please read this with the "brainstorm hat", I'm not demanding or telling what should be done.
We are aware that this is not really an Atlantis issue, its a terraform issue, but it turned out to be even more of a nuisance in our case because of having this content on GH comments.
Sometimes, data like URIs, IP addresses and such, that are not really "sensitive data" gets exposed which gives more information about our infrastructure than needed, I can imagine some possible options to solve the issue:
1 - To have options on atlantis.yml to specify regular expressions to filter out the plan output before submitting it to GH, which IMHO is not the best option because it may mask data that is useful for the user to determine if the plan should/should not be applied.
2 - In conjunction to 1, instead of masking the data as sensitive let the regular expression specify filters to encrypt parts of the plan output, which lets the user decrypt the data if it's something necessary for them to analyze the plan output.
Having an output similar to :
Terraform will perform the following actions:
+ aws_db_instance.default
...
password: ENCRYPTED:Hd8ajksa89alashdhkasd==
username: "foo"
Plan: 1 to add, 0 to change, 0 to destroy.
Which would allow users if necessary to decrypt the data if they have the key and even if they want to be fancy to build a browser extension to do it automatically.
Lets imagine that I could do something like: atlantis plan pgp_encrypt:(my pgp key id)
3 - For the plan an option to have the plan to be hosted on the Atlantis web interface itself, which I also don't like as it makes Atlantis a lot more complex as it will need to start having state to be able to fulfill compliance auditing reports.
Being completely honest, after writing the options down, none feels like a good option they all make me feel bad for different reasons, but most of all none feels like a good user experience. (I suspect this is what makes things more secure?)
BTW the idea of the encryption on the state output was inspired by https://github.com/mozilla/sops
Yes security tends to do that :) but I think this is definitely something that Atlantis should aim to tackle since it's the one posting the plan output on github! I think building this solution is best tackled with a well-defined use-case that I can fully understand. If you'd be interested, I'd love to have a quick call with you to more fully understand your use-case and how you deal with secrets. You can email me at lkysow [at] gmail if you'd like to set that up or message me in the slack channel. I think it would be dangerous to build this in a vacuum without a real-world use case.
Also brainstorming hat on...
A fully authed and rich Atlantis UI is a direction I want to go but it's not a short-term solution. I could do something short-term where you can mark a plan output as sensitive, causing it to only be available on the atlantis server as just a hosted file and then rely on users to add their own auth system on top of the UI (see https://github.com/runatlantis/atlantis/issues/49#issuecomment-398265180). For this I have a couple questions:
atlantis plan --sensitive
Without the UI, I think the solution is some form of:
terraform plan
and masking themI think 1 and 2 are easier than 3. And in terms of order of operations, I might want to implement 1 and 2, say for 3 that you just need to run plan locally (for now) and then build out 3 later. I know codecov.io has a simple UI-based secret mechanism where you paste in your string and it gives you the encrypted version back, Atlantis could have the same thing but the opposite. Questions I have about that:
I've got Atlantis working with terrahelp mask
which masks sensitive variables from a plan. For example:
terraform plan | terrahelp mask
terrahelp
figures out which variables are sensitive simply by assuming that all variables in a terraform.tfvars
file (which Terraform includes automatically) are sensitive. Obviously if you put sensitive variables into that file and then checked it in, it would have no purpose so you would have to create that file at runtime, maybe by downloading from Vault?
Either way, maybe this helps. You can see the Atlantis config here: https://github.com/runatlantis/atlantis/issues/46#issuecomment-403841683 and you'll need the terrahelp
binary to be available where Atlantis is running.
You can read more about terrahelp here: https://github.com/opencredo/terrahelp
If #121 were implemented (store plan output on server) then this would also be a possible solution. I don't think you'd need to keep the plan
output around for auditing reasons because it's not the plan
that matters but the apply
log. If the apply
log could still be printed to the pull request (in which any sensitive outputs could be marked as such) then this would be another solution.
Unfortunately, sensitive
outputs is not covering all cases.
For example, if you use random_string
to generate passwords, the ID of the random_string
is... the content of the string generated. So in plan
or apply
, you would see random_string.database_password (ID:<the password>)...
And relying on terrahelp or string regexs is tricky, as a single change in output format in Terraform could break it and leak your secrets in Github comments.
Is there a way do mark variables as sensitive so that they don't show up in the plan output?
For example
source
resource "aws_sns_topic_subscription" "subscription" {
endpoint = "${var.url}${var.token}"
output
resource "aws_sns_topic_subscription" "subscription" {
endpoint = "https://www.url-example.com/path?access_token=SENSITIVE_DATA"
Not built in to Atlantis
Is there a way do mark variables as sensitive so that they don't show up in the plan output?
For example
source
resource "aws_sns_topic_subscription" "subscription" { endpoint = "${var.url}${var.token}"
output
resource "aws_sns_topic_subscription" "subscription" { endpoint = "https://www.url-example.com/path?access_token=SENSITIVE_DATA"
I know this issue is old, but I am running into this exact situation (API key in aws_sns_topic_subscription resource) being exposed by TF. Could we perhaps have a way to mark certain attributes like this as sensitive so they get masked by atlantis during plan/apply?
I know the issue is in the provider, but as mentioned above, it is Atlantis that makes this an issue since it posts comments to git. Disabling auto-planning and requiring people to look at the Atlantis server kind of defeats the purpose of Atlantis imo...
@h3lo I am not sure if it will be addressed in Atlantis.
You can try Terraform sensitive
output
output "db_password" {
value = aws_db_instance.db.password
description = "The password for logging in to the database."
sensitive = true
}
That attribute is only for outputs. I'm talking about the aws_sns_topic_subscription
resource, and more specifically, the endpoint
attribute, which is likely to be sensitive for anyone using some 3rd party's API over HTTPS.
I even tried creating a module with nothing in it but my sensitive data, and outputting it with the sensitive
attribute set, then using it in my aws_sns_topic_subscription
resource's endpoint
attribute by reference, but TF still outputs its interpolated value.
This isn't really a big risk when doing terraform from the command line, it's specifically a function of Atlantis making the plan and apply output semi-public that causes risk. Of course, this aspect of Atlantis is also one of the main reasons to use it (tracking of what's been done).
Atlantis generally works great, thank you for your hard work. I hope my feedback can help improvements be made that will be useful to everyone.
I think
TF still outputs its interpolated value
is the key bit here, and would encourage opening up an issue on the provider, since I think that's the right place to mark sensitivity rather than having special cases in various providers inside Atlantis. The same problem would happen in eg. Terraform Cloud, and can be avoided by setting the sensitive
flag in the provider schema.
Heck, it's probably a one-line PR, so you could be a Terraform contributor! 😄
(FWIW, I agree that endpoint is sensitive for the reasons you described and would be happy to upvote the issue there.)
@wlonkly I feel quite strongly that this doesn't belong in the provider. The provider outputting some sensitive data isn't inherently wrong because from TF perspective, the host you run plans and applies on is secure. It is Atlantis that makes this output public and therefore it is specifically Atlantis that should handle masking it.
@h3lo I think it would be nice if the provider provided a way for you to "mark" that something is sensitive, or even denoted sensitive data in some standardized way. Then, Atlantis could use that standard accordingly, instead of a potentially non-deterministic regex. This would still allow TF to spit out the real info, that you want to see when you run locally, but also allow Atlantis to scrub it when posting it in a PR.
Another thought..... after the PR merges, could / should the Atlantis comments be deleted? Just getting started with TF and Atlantis, so I'm no expert....but wouldn't that at least limit exposure. I know our TF PRs don't generally stay open for very long. By the time a dev is ready to PR, they should have done the plan/apply and even a destroy in a lower env.
I don't have enough history to know if I'd ever need to look back on those plan / apply PRs later to debug something.
For example, if you use random_string to generate passwords, the ID of the random_string is... the content of the string generated. So in plan or apply, you would see random_string.database_password (ID:
)...
@sterfield I know this is an old issue. I have a small addition here which may not have been available at the time.
https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/password
Identical to random_string with the exception that the result is treated as sensitive and, thus, not displayed in console output. Read more about sensitive data handling in the Terraform documentation.
We're having a similar issue with this masking values on the plan output. Due to some architecture decisions, we have some variables that are considered secrets, created on terraform and passed into templates variables to templatefile functions. Since they are passed as text, terraform loses the context of them being sensitive and outputs them as clear text on the plan. Before having automation on a PR this was not a concern, as the terraform executor would also have access to the terraform state file, so it would be acceptable to have it on an output, but now the value is available to anyone with access to the repository where Atlantis comments.
I understand this might be considered a hack, since the real solution is to address the way the secret is passed, but as we all know, sometimes we don't have the cycles to fix this when it's already working across multiple production environments and the business has other requirements.
My proposal is to add a feature to allow masking the plan output based on a configurable regex. The regex argument can be a list to make it easier to achieve multiple masks without increasing the complexity of the regex pattern. I've tested this locally with a simple pipe that sends terraform plan output to sed and it works as long as we can figure out a valid regex for the desired outcome, I believe this would be somewhat easy to achieve with Atlantis. Afaik Atlantis doesn't really interfere with terraform, just calls the binary and interprets results so I can't think of another way to achieve this.
I'm happy to develop this feature. Thoughts?
PS: Another option is to allow a custom post processing command on https://github.com/runatlantis/atlantis/blob/57aef30416f3e6e10d8c87222952f92deef61f3b/server/core/config/valid/repo_cfg.go#L176-L183
Hi,
We are evaluating adoption of Atlantis and one of our concerns regarding security is that while doing 'atlantis plan' secret data can be displayed on the plan output.
Is there any recommendation on how to overcome this issue?
Thanks