open-policy-agent / gatekeeper

🐊 Gatekeeper - Policy Controller for Kubernetes
https://open-policy-agent.github.io/gatekeeper/
Apache License 2.0
3.61k stars 740 forks source link

Using Secrets in Constraints #664

Closed theMagicalKarp closed 1 year ago

theMagicalKarp commented 4 years ago

What

It'd be nice to provide guidance or features for using API keys in the constraints.

For example I'd like the ability to contact another service within the cluster VIA HTTP with an API key.

My hope was that I could mount the secret API key as an env var in gatekeeper and reference the env var in my policy. However opa.runtime is empty.

violation[{"msg": msg}] {
    env := opa.runtime().env
    env.FOO_API_TOKEN

    response := http.send({
        "method": "get",
        "url": "https://blahblah.com",
        "headers": {"Authorization": env.FOO_API_TOKEN},
    })

    response.status_code == 200
    # use the response for determining assertions
}

Is this a sane thing to do? Should I being handling this another way?

maxsmythe commented 4 years ago

We are not currently populating OPA with the contents of ENV. I wonder at the security consequences of doing so, though I'm guessing if the variables are only read at startup it should be fairly safe.

An alternative could be to replicate secrets into Gatekeeper, but that would mean all secrets are now cached in Gatekeeper, at least until we have per-namespace replication enabled as an option.

JPLachance commented 3 years ago

Hello @maxsmythe,

I have the exact same use case. I'm writing a policy that checks whether or not a container image was approved by our deployment pipeline. That policy needs to contact an API and that API is secured.

How can we use secrets inside a Gatekeeper ConstraintTemplate without hardcoding those secrets in the Rego code?

Thanks in advance!

maxsmythe commented 3 years ago

If you are comfortable allowing Gatekeeper to keep a copy of your secrets in the Gatekeeper pod's memory and the potential that other constraint templates will be able to access the secret's contents, you can set up caching for secrets using the instructions linked below:

https://github.com/open-policy-agent/gatekeeper#replicating-data

JPLachance commented 3 years ago

Hi @maxsmythe,

I'm confortable with a Kubernetes Secret being used to inject an environment variable into the OPA Gatekeeper pod.

Using environment variables would simplify my life a lot, as it is standard between opa test and the actual Gatekeeper work. I'm writing UTs for each constraint.

At the moment, I don't feel like data replication is a solution because it uses a Kubernetes ConfigMap, not a Kubernetes Secret, which does not help on the security side. Also, I would have to mock the data replication object in UTs, which adds complexity.

I think the ideal solution is to add a feature flag on the Gatekeeper (an environment variable?) than enables populating OPA with the contents of ENV. That way, Gatekeeper users can chose.

What do you think?

maxsmythe commented 3 years ago

A flag that defaults to off certainly improves things. I'm still reluctant to include a behavior that has a long association with security vulnerabilities into Gatekeeper (even defaulted to disabled), unless we can convince ourselves that the feature is generally safe.

I'm not sure what you mean by sync uses ConfigMaps and not secrets? You can sync resources of any kind into Gatekeeper.

Is mocking a secret value more effort than maintaining an environment variable? I would also worry about flaky unit tests as they would pass/fail on a per-environment basis.

If you wanted to use caching for k8s but environment variables for the unit test, you could add something like:

key() = ret {
   runtime := opa.runtime()
   ret = runtime.env.MY_KEY_VAR
}

test_something {
    violation with data.inventory.namespace["my-namespace"]["v1"]["Secret"]["my-api-key"].data.key as key()
}

to your unit tests, which should take your environment variable and use it as the cached value for the API key.

FWIW I agree it should be easier to emulate caching in constraint unit tests... that's something we should look at for our library authorship tools.

The factors I'm weighing here are:

  1. Is it possible to achieve the same behavior without this feature? (maybe... it depends on whether we need to scope down syncs more)
  2. How much extra effort is it to have the same behavior without this feature? (maybe not much extra, per above)
  3. How risky is this feature? (unknown, but similar features have a history of security issues)
  4. How hard would it be to improve (1) vs (3) in dev hours/complexity?

Defaulting to off mitigates some of the risk. If we can convince ourselves that the risk is worth the benefit after (4) is taken into account, I'm open.

@shomron @brycecr @ritazh @sozercan I'm curious what your thoughts are on the potential risk of exposing env vars to Rego.

shomron commented 3 years ago

As we discussed in the community meeting today, this feature request requires further analysis. Some high level considerations:

shomron commented 3 years ago

Another thought - environment variables as an interface introduces challenges when updating tokens - requiring restarting of all GK pods to take effect. This has implications on service availability and the ability for heterogeneous GK instances to make policy decisions that depend on an up-to-date secret.

JPLachance commented 3 years ago

Hello!

I did not consider a multi-tenant environment in my request. My context is a Kubernetes cluster where I deployed Gatekeeper and where I control the access roles of this Kubernetes cluster users. I control who can create/modify/delete a ConstraintTemplate.

In a different context, yes, anyone with access to create or modify a ConstraintTemplate could use that access to exfiltrate sensitive data.

That being said, if someone has access to create or modify a ConstraintTemplate, what is preventing that someone from creating a ConstraintTemplate with input.review.object.kind == "Secret" that would do an http.send to post that Secret data to an endpoint of that someone's choice? As far as I know, the Gatekeeper ClusterRole is currently allowed to get secrets from Kubernetes and in most clusters, Kubernetes stores secrets base64 encoded.

About the environment variable requiring the Pod restart, this is true! The ideal solution is to fetch the secret dynamically from a vault, but we still need to authenticate with that vault. I use AWS Secrets Manager a lot, but, I don't know how we could fetch a secret from AWS Secrets Manager using Rego. We normally use this to access AWS resources from a Pod, but we need a proper AWS SDK that supports it.

Thanks a lot for all the time you are investing on this question by the way! 😄

jdoylei commented 2 years ago

Hi there,

Like @theMagicalKarp and @JPLachance, I'm interested in OPA/Rego policies that access external data on-demand via http.send(), from an endpoint requiring authentication. My first thought was also opa.runtime().env, which seems to come up in discussions about credentials in OPA/Rego in general (aside from Kubernetes and Gatekeeper).

@maxsmythe, your suggestion about using the "Replicating Data" Gatekeeper feature and K8s Secrets sounds workable also, but can you please let me know whether I'm understanding it correctly?

  1. Assume my policy wants to authenticate to https://foo.com with basic auth username/password gatekeeper_user/User123!
  2. I'd create a generic Secret named "foo-gatekeeper-user" in namespace "gatekeeper-system" with data like username: Z2F0ZWtlZXBlcl91c2Vy and password: VXNlcjEyMyE=
  3. Would there need to be any special role bindings for the Gatekeeper pods to read this new Secret?
  4. Does it make sense to use namespace "gatekeeper-system", since the secret will be available for any constraint in the cluster to use?
  5. I'd update Gatekeeper's Config resource with syncOnly that includes kind "Secret".
  6. Is there any more to the syncOnly criteria than "kind"? Like namespace or a selector for specific resources? (Is there a doc with the full API/CRD for things like Config, beyond the examples in the main Gatekeeper doc?)
  7. My Rego policy would access using username := data.inventory.namespace["gatekeeper-system"]["v1"]["Secret"]["foo-gatekeeper-user"].data.username (and the same for password).
  8. Would the syncing for Secrets and the population of the data document give me back "User123!" for password, or "VXNlcjEyMyE=", requiring a base64 decode in Rego?

Thanks!

maxsmythe commented 2 years ago

Would there need to be any special role bindings for the Gatekeeper pods to read this new Secret?

Not with the default configuration, but you'd need to configure G8r to cache the secret.

Does it make sense to use namespace "gatekeeper-system", since the secret will be available for any constraint in the cluster to use?

Hard to say. gatekeeper-system certainly seems like a good place to store gatekeeper-specific config, but maybe you'd want to use a different namespace to give yourself room to wipe/reinstall G8r with minimal fuss? I think either should be fine, with the caveat that a different namespace makes name collisions less likely (not that they were likely to begin with).

I'd update Gatekeeper's Config resource with syncOnly that includes kind "Secret".

Yep

Is there any more to the syncOnly criteria than "kind"? Like namespace or a selector for specific resources? (Is there a doc with the full API/CRD for things like Config, beyond the examples in the main Gatekeeper doc?)

https://open-policy-agent.github.io/gatekeeper/website/docs/exempt-namespaces

should be useful for exempting namespaces from sync. I don't think we have a way to scope "for this kind, only sync these namespaces" currently, unfortunately.

My Rego policy would access using username := data.inventory.namespace["gatekeeper-system"]["v1"]["Secret"]["foo-gatekeeper-user"].data.username (and the same for password).

I think so? There may be a need to decode/parse the secret value, but that's the right general path

Would the syncing for Secrets and the population of the data document give me back "User123!" for password, or "VXNlcjEyMyE=", requiring a base64 decode in Rego?

That I can't say off the top of my head :/ Though I do believe Rego can decode base64 and parse json.

I'll be out for the rest of the year, hopefully this context is helpful!

jdoylei commented 2 years ago

Yes, thanks for your quick reply!

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.