hashicorp / vault-k8s

First-class support for Vault and Kubernetes.
Mozilla Public License 2.0

Tokens not revoked when using agent-revoke-on-shutdown if still in template phase #210

Open wernerb opened 3 years ago

wernerb commented 3 years ago

Describe the bug

A faulty Vault template (it reads a secret without the required permissions) keeps the main container from running, so the pod crashes and restarts. The faulty template was our own mistake, but it wasn't caught at first. The pod restarted all night (3000 times) and created a lot of Vault tokens. This can potentially swamp Vault.

The following annotation is enabled:

        vault.hashicorp.com/agent-revoke-on-shutdown: "true"

The Vault token can be retrieved, but one of the secrets cannot be rendered. Many tokens are created in quick succession, swamping Vault.

The agent gets a signal to be killed here:

2021/01/15 11:41:32.523517 [INFO] (runner) stopping
2021-01-15T11:41:32.526Z [INFO]  template.server: template server stopped
2021-01-15T11:41:32.526Z [INFO]  auth.handler: shutdown triggered, stopping lifetime watcher
2021-01-15T11:41:32.526Z [INFO]  auth.handler: auth handler stopped
2021-01-15T11:41:32.526Z [INFO]  sink.server: sink server stopped
2021-01-15T11:41:32.526Z [INFO]  sinks finished, exiting
2021-01-15T11:41:32.526Z [ERROR] runtime error encountered: error="template server: vault.read(x/database/elasticsearch/creds/y): vault.read(x/database/elasticsearch/creds/y): Error making API request.
URL: GET https://xxx/v1/xdatabase/elasticsearch/creds/y
Code: 403. Errors:
* 1 error occurred:
    * permission denied
"
Error encountered during run, refer to logs for more details.

The agent shuts down but does not execute any token revocation.

To Reproduce

Steps to reproduce the behavior:

  1. Deploy application annotated with

        vault.hashicorp.com/agent-revoke-on-shutdown: "true"

    Request a secret that does not exist or that you have no permission to read:

        vault.hashicorp.com/agent-inject-secret-elasticsearch.yml: "true"
        vault.hashicorp.com/agent-inject-template-elasticsearch.yml: |
          {{- with secret "bla/database/elasticsearch/creds/xyz" }}
          output.elasticsearch.username: {{ .Data.username }}
          output.elasticsearch.password: {{ .Data.password }}
          {{ end }}
  2. Observe that the pod crashes and that Vault tokens are not revoked when the agent shuts down

  3. See error (vault injector logs, vault-agent logs, etc.)


Expected behavior

The annotation is honored even if the requested leases or secrets are unavailable or the request is rejected with "permission denied". As long as a token was retrieved, it should be revoked on exit.

jasonodonnell commented 3 years ago

Hey @wernerb, thanks for the report.

I think this might be a problem with Kubernetes and lifecycle hooks, but I'm not 100% sure. The token revocation is done by a preStop hook, and I think that because the container is crashing, that lifecycle hook never gets activated.

I'm speaking internally with my team about possibly enhancing Vault Agent to do the revocation itself instead of relying on preStop.
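
To make the idea of in-process revocation concrete, here is a minimal Python sketch. It is not Vault Agent's actual code: `run_with_revocation`, `login`, `render`, and `revoke` are hypothetical stand-ins for the agent's auth, template, and revoke steps. It only illustrates the control flow, where revocation sits in a `finally` block and therefore still runs when rendering fails (for example on a 403) and the process exits on its own.

```python
from typing import Callable


def run_with_revocation(
    login: Callable[[], str],
    render: Callable[[str], None],
    revoke: Callable[[str], None],
) -> None:
    """Log in, render templates, and always revoke the token afterwards.

    All three callables are hypothetical placeholders, not Vault Agent APIs.
    """
    token = login()  # if login itself fails, there is no token to revoke
    try:
        render(token)  # may raise, e.g. "permission denied" on a secret read
    finally:
        revoke(token)  # runs on success and on failure alike


if __name__ == "__main__":
    revoked = []

    def failing_render(token: str) -> None:
        raise PermissionError("permission denied")

    try:
        run_with_revocation(lambda: "example-token", failing_render, revoked.append)
    except PermissionError:
        pass  # the render error still propagates to the caller
    print(revoked)
```

Because the cleanup lives in the process itself, it does not depend on Kubernetes invoking a lifecycle hook. It would still be skipped if the process is killed outright, for example by SIGKILL.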

andreizzu commented 1 year ago

I have the same issue with the vault.hashicorp.com/agent-revoke-on-shutdown: "true" annotation. I checked the events on the pod and I can see a 403 error message in the "vault-agent" pod when the hook is triggered. If I make the same call using curl and the token stored at /home/vault/.vault-token, it works correctly. Somehow, when the revoke call is made through the CLI, the proper token is not passed to the Vault CLI. Any ideas? Many thanks!
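
For anyone trying to reproduce the curl check described above, here is a rough Python equivalent using only the standard library. It reads the token from /home/vault/.vault-token (the path mentioned in the comment) and sends a POST to Vault's `auth/token/revoke-self` endpoint with the `X-Vault-Token` header. The function names and the `vault_addr` parameter are assumptions made for illustration; this is not the command the injected preStop hook runs.

```python
import urllib.request
from pathlib import Path


def read_token(path: str = "/home/vault/.vault-token") -> str:
    """Read the token file and strip surrounding whitespace/newlines."""
    return Path(path).read_text().strip()


def build_revoke_self_request(vault_addr: str, token: str) -> urllib.request.Request:
    """Build a POST to /v1/auth/token/revoke-self authenticated with `token`."""
    url = vault_addr.rstrip("/") + "/v1/auth/token/revoke-self"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Vault-Token": token},
    )


def revoke_self(vault_addr: str, token: str, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a 403
    like the one seen in the pod events would surface as an exception.
    """
    request = build_revoke_self_request(vault_addr, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example (the address is a placeholder; substitute your own Vault address):
# status = revoke_self("https://vault.example.com:8200", read_token())
```

If this succeeds while the hook's CLI call returns 403, that would point at the token the CLI picks up rather than at the token's permissions.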