hashicorp / vault-k8s

First-class support for Vault and Kubernetes.

Add an annotation to restart pod on secret update #196

Open · night-gold opened this issue 3 years ago

night-gold commented 3 years ago

Hello,

Is your feature request related to a problem? Please describe.

When using secrets as env variables, we have to load them in a non-conventional way (sourcing them in the container args), so the pod cannot pick up new secret values even with the sidecar present. Currently the sidecar only updates the secrets in the rendered file and nothing more.
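
For reference, the loading pattern in question looks roughly like this (file and entrypoint names are placeholders): the container's args source the rendered secrets file before starting the app, so the variables are only read once at startup.

command: ["/bin/sh"]
args: ["-c", "source /vault/secrets/config && exec /app/server"]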

Describe the solution you'd like

Using the Vault agent sidecar, add an annotation to force a restart of the pod when secrets change.

Describe alternatives you've considered

Maybe this issue could help: #14. I imagine that if the env variables were changed in the pod description, it would force a restart of the pod.
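
That intuition matches how Kubernetes behaves: any change to the pod template (env vars, annotations, ...) triggers a rolling update, which is what the common checksum-annotation trick relies on. A minimal sketch, where the annotation key and hash value are placeholders:

spec:
  template:
    metadata:
      annotations:
        # Changing this value rolls the pods; set it to a hash of the rendered secret at deploy time
        checksum/vault-secret: "<sha256-of-secret>"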

KaiReichart commented 3 years ago

+1, this would be super useful for our Java applications, which load the secrets on startup.

tvoran commented 3 years ago

Hi @night-gold, thanks for the suggestion! Have you tried the agent-inject-command annotation? https://www.vaultproject.io/docs/platform/k8s/injector/annotations#vault-hashicorp-com-agent-inject-command

That can be used to run kill against your app, after the secret is rendered. Jason's demo shows an example of this: https://github.com/jasonodonnell/vault-agent-demo/blob/master/examples/dynamic-secrets/patch-annotations.yaml#L15 Note that the app and injector need to be running with the same UID, and shareProcessNamespace: true should be set on the deployment.
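
A minimal sketch of what that can look like on the pod template (the process name, secret path, and use of pkill are assumptions; adjust for your app):

template:
  metadata:
    annotations:
      vault.hashicorp.com/agent-inject: "true"
      vault.hashicorp.com/role: "myapp"
      vault.hashicorp.com/agent-run-as-same-user: "true"
      vault.hashicorp.com/agent-inject-secret-config: "secret/data/myapp/config"
      # Signal the app whenever the template above is re-rendered
      vault.hashicorp.com/agent-inject-command-config: "sh -c 'pkill -HUP myapp'"
  spec:
    shareProcessNamespace: true
    containers:
      - name: myapp
        image: myorg/myapp:latest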

night-gold commented 3 years ago

Hello @tvoran, we will take a look at that method.

But I have a question: what happens if the PID doesn't exist, or the PID lookup returns an error? If I'm not mistaken that should happen on start, because the agent renders the template before the app container is up (so the secrets exist before it starts). So this approach deliberately generates an error at startup just to manage the restart of the pod.

I'm not really a big fan of the method. I think a better use of this would be to use that inject-command to run something that loads the env variables, in place of the method recommended here: vault k8s env variable. In that case we shouldn't get any error, as the containers would share the env variables via shareProcessNamespace.

Still, I will take a look at what can be done with that and let you know if it answers our issue.

KaiReichart commented 3 years ago

If you are on OpenShift, I have built a Vault image with the oc binary: https://hub.docker.com/repository/docker/kaireichart/vault

Then you can use the agent-inject-command to redeploy the deployment/statefulset/whatever:

vault.hashicorp.com/agent-inject-command-{{ .fileName }}: {{ ( print "sh -c 'if [ -f \"/etc/secrets/" .fileName ".check\" ]; then oc rollout restart deployment " $.Release.Name "; else touch /etc/secrets/" .fileName ".check; fi'" ) }}

Since the inject command also runs for the init container, you will want to create a file the first time the command runs and only do the redeployment when this file already exists (you can write it into the secrets mount path).
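
Rendered without Helm, the annotation comes out to something like this (the secret name "config" and deployment name "my-app" are assumptions):

vault.hashicorp.com/agent-inject-command-config: "sh -c 'if [ -f /etc/secrets/config.check ]; then oc rollout restart deployment my-app; else touch /etc/secrets/config.check; fi'"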

Also, your ServiceAccount needs to be elevated slightly (don't forget the RoleBinding):

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: redeployer
rules:
  - verbs:
      - patch
      - list
      - get
    apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - deployments/rollback
      - deployments/scale
      - replicasets
      - statefulsets
      - statefulsets/scale
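
And a matching RoleBinding for the pod's ServiceAccount could look like this (the ServiceAccount name is an assumption):

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: redeployer
subjects:
  - kind: ServiceAccount
    name: my-app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: redeployer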

This also means you don't need to share the process namespace between the containers, which is a security benefit as well.

It should also work on vanilla Kubernetes (with a new Docker image), but I've only tried it on OpenShift. If there is demand I can build another image with kubectl.

night-gold commented 3 years ago

@tvoran So I tried to use your method for the restart but I'm having issues... I activated the inject command but it doesn't seem to launch. I added this to my annotations:

vault.hashicorp.com/agent-run-as-same-user: "true"
vault.hashicorp.com/agent-inject-command: "sh -c 'temp=$(ps -aux | grep -v grep | grep npm | awk \'{print $2}\') && kill -HUP $temp'"

I also added shareProcessNamespace on the pod, but the issue is that nothing happens when the template is updated. I don't have logs anywhere saying that there was a launch error, so I'm a bit lost. Where can I see the logs of the inject-command? In my case I can't find them in either the agent or the targeted container...

mouglou commented 3 years ago

Hello @night-gold !

Did you find a solution for that? I tried to run "kubectl rollout restart" via the agent-inject-command annotation, but the command runs in the vault-agent container, which doesn't have the kubectl binary. And it seems complicated to add it there...

Like @KaiReichart, we generate a secret file from the database engine for a Java application. But with max_ttl: 6h (for example), we need to roll out the deployment at around two-thirds of the TTL to stay safe.

Hope you find something ;)

night-gold commented 3 years ago

@mouglou We aren't on OpenShift, so that solution doesn't seem possible in our case; it would be better to have a broader solution already integrated.

For now we are doing it manually or using scripts but it's still not the proper way to do it...

mouglou commented 3 years ago

@night-gold thanks for your feedback. We are not on OC either; we run on GKE.

Hope the team can do something! Running the kubectl command within the vault-injector seems like a logical option for Kubernetes!

mouglou commented 3 years ago

Good morning !

Maybe @jasonodonnell has some tips about running the kubectl command in the vault-agent container!

KaiReichart commented 3 years ago

This is my Dockerfile for the oc client in vault-agent; it should be easy to modify it to install kubectl as well (a possible kubectl install step is sketched after the Dockerfile):

FROM frolvlad/alpine-glibc:alpine-3.10_glibc-2.30

# This is the release of Vault to pull in.
ARG VAULT_VERSION=1.5.4

# Create a vault user and group first so the IDs get set the same way,
# even as the rest of this may change over time.
RUN addgroup vault && \
    adduser -S -G vault vault

# Set up certificates, our base tools, and Vault.
RUN set -eux; \
    apk add --no-cache ca-certificates gnupg openssl libcap su-exec dumb-init tzdata && \
    apkArch="$(apk --print-arch)"; \
    case "$apkArch" in \
        armhf) ARCH='arm' ;; \
        aarch64) ARCH='arm64' ;; \
        x86_64) ARCH='amd64' ;; \
        x86) ARCH='386' ;; \
        *) echo >&2 "error: unsupported architecture: $apkArch"; exit 1 ;; \
    esac && \
    VAULT_GPGKEY=91A6E7F85D05C65630BEF18951852D87348FFC4C; \
    found=''; \
    for server in \
        hkp://p80.pool.sks-keyservers.net:80 \
        hkp://keyserver.ubuntu.com:80 \
        hkp://pgp.mit.edu:80 \
    ; do \
        echo "Fetching GPG key $VAULT_GPGKEY from $server"; \
        gpg --batch --keyserver "$server" --recv-keys "$VAULT_GPGKEY" && found=yes && break; \
    done; \
    test -z "$found" && echo >&2 "error: failed to fetch GPG key $VAULT_GPGKEY" && exit 1; \
    mkdir -p /tmp/build && \
    cd /tmp/build && \
    wget https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_linux_${ARCH}.zip && \
    wget https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_SHA256SUMS && \
    wget https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_SHA256SUMS.sig && \
    gpg --batch --verify vault_${VAULT_VERSION}_SHA256SUMS.sig vault_${VAULT_VERSION}_SHA256SUMS && \
    grep vault_${VAULT_VERSION}_linux_${ARCH}.zip vault_${VAULT_VERSION}_SHA256SUMS | sha256sum -c && \
    unzip -d /bin vault_${VAULT_VERSION}_linux_${ARCH}.zip && \
    cd /tmp && \
    rm -rf /tmp/build && \
    gpgconf --kill dirmngr && \
    gpgconf --kill gpg-agent && \
    apk del gnupg openssl && \
    rm -rf /root/.gnupg

# Install OC cli tool

RUN apk add --no-cache --virtual .build-deps \
        curl \
        tar \
    && curl --retry 7 -Lo /tmp/client-tools.tar.gz "https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/oc/4.6/linux/oc.tar.gz"
RUN tar zxf /tmp/client-tools.tar.gz -C /bin oc \
    && rm /tmp/client-tools.tar.gz \
    && apk del .build-deps

# /vault/logs is made available to use as a location to store audit logs, if
# desired; /vault/file is made available to use as a location with the file
# storage backend, if desired; the server will be started with /vault/config as
# the configuration directory so you can add additional config files in that
# location.
RUN mkdir -p /vault/logs && \
    mkdir -p /vault/file && \
    mkdir -p /vault/config && \
    chown -R vault:vault /vault

# Expose the logs directory as a volume since there's potentially long-running
# state in there
VOLUME /vault/logs

# Expose the file directory as a volume since there's potentially long-running
# state in there
VOLUME /vault/file

# 8200/tcp is the primary interface that applications use to interact with
# Vault.
EXPOSE 8200

# The entry point script uses dumb-init as the top-level process to reap any
# zombie processes created by Vault sub-processes.
#
# For production derivatives of this container, you shoud add the IPC_LOCK
# capability so that Vault can mlock memory.
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
ENTRYPOINT ["docker-entrypoint.sh"]

# By default you'll get a single-node development server that stores everything
# in RAM and bootstraps itself. Don't use this configuration for production.
CMD ["server", "-dev"]

mouglou commented 3 years ago

Hi @KaiReichart ,

OK, I just checked: it's mostly the default Vault image, and then you add what you need. I will try that with the kubectl command.

With this solution we have to update the tag variable when updates are available, but it can work.

Thanks for the idea.

KaiReichart commented 3 years ago

Yep, it's basically the default image, with the very important change of the base image, since the normal Alpine image doesn't come with glibc and oc won't run without it. Not sure if it's needed for kubectl too, though.

mouglou commented 3 years ago

@KaiReichart

I created a Docker image with a step to install the latest version of kubectl, and then added the command annotation.

I get this:

failed to execute command "kubectl rollout restart deployment account-storage" from "(dynamic)" => "/vault/secrets/db.properties": child: fork/exec /usr/local/bin/kubectl: permission denied

The first time, the patch permission was missing in the RBAC. Now the service account dedicated to this deployment has the right permissions, but I still get this error, without any other RBAC error. It's like the permission is OK now, but I still get this one... and for now I don't know where it comes from. If I run any other kubectl command like 'version' or even 'kubectl rollout information', it works.

Did you get something like that too with OC?

EDIT: to add, my kubectl binary has execute permission ;)

mouglou commented 3 years ago

Forget it! I had set the executable bit for the user only... I tried again with execute permission for everybody and it seems to work. I am testing right now with max_ttl set to 30 minutes to check if my rollout restarts work well on my app.
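
For anyone hitting the same fork/exec permission error: the fix boils down to making the binary world-executable in the image build, e.g. (the path matches the error above):

RUN chmod 755 /usr/local/bin/kubectl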

mouglou commented 3 years ago

It doesn't work well... The kubectl rollout restart runs in a loop: it creates a pod, this new pod creates new credentials, and that triggers the rollout again... So not really useful...

@KaiReichart What is your agent-inject-command?

KaiReichart commented 3 years ago

@mouglou I had the same problem: the init container was triggering the rollout.

vault.hashicorp.com/agent-inject-command-{{ .fileName }}: {{ ( print "sh -c 'if [ -f \"/etc/secrets/" .fileName ".check\" ]; then oc rollout restart deployment " $.Release.Name "; else touch /etc/secrets/" .fileName ".check; fi'" ) }}

On the first run of the inject-command, a file is created in the container. Only if this file is already present will the rollout be triggered. This guarantees that the first run (the init container) does not trigger a new rollout.

This is taken straight from our Helm chart, so if you don't use Helm you might want to replace .fileName and $.Release.Name with whatever is appropriate for you.

mattgialelis commented 3 years ago

@KaiReichart

I tried out the line you posted as:

  vault.hashicorp.com/agent-inject-command-config.yml: "if [ -f \"/vault/secrets/config.check\" ]; then kubectl rollout restart deployment vault-agent-example ; else touch \"/vault/secrets/config.check\"; fi "

Sadly it seems like the Vault agent cuts down the command for some odd reason. Any ideas?

[INFO] (runner) executing command "if [ -f \"/vault/secrets/config.check\" ]; then kubectl rollout restart deployment vault-agent-example ; else touch \"/vault/secrets/config.check\"; fi " from "(dynamic)" => "/vault/secrets/config.yml"
[INFO] (child) spawning: if [ -f /vault/secrets/config.check ]

I did put a bash script into the image and called that directly, and it works, but I don't really want to have to start adding scripts all over the place... An annotation to do the restart from the agent would be a lot cleaner.
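
For reference, the script-in-the-image variant is just the same check-file logic moved into a file, roughly like this (paths and the deployment name reuse the example above):

#!/bin/sh
# Called via vault.hashicorp.com/agent-inject-command-config.yml
if [ -f /vault/secrets/config.check ]; then
    kubectl rollout restart deployment vault-agent-example
else
    touch /vault/secrets/config.check
fi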

KaiReichart commented 3 years ago

@mattgialelis You're missing the sh -c 'XXX' part; for some reason the Vault injector only runs the first expression. Try this:

  vault.hashicorp.com/agent-inject-command-config.yml: "sh -c 'if [ -f \"/vault/secrets/config.check\" ]; then kubectl rollout restart deployment vault-agent-example ; else touch \"/vault/secrets/config.check\"; fi'"

mattgialelis commented 3 years ago

@KaiReichart Yeah, just tried; I had done it before too but needed the error again. I noticed that if I break down the command by removing the ;, it reads the whole line; it seems to treat the semicolons as a break or something.

[INFO] (child) spawning: sh -c if [ -f /vault/secrets/config.check ]
[: line 1: syntax error: unexpected end of file (expecting "then")

Even using the shorter sh version of

 "sh -c '[ -f /vault/secrets/config.check ] && kubectl rollout restart deployment vault-agent-example || touch /vault/secrets/config.check'"
[INFO] (runner) executing command "sh -c '[ -f /vault/secrets/config.check ] && kubectl rollout restart deployment vault-agent-example || touch /vault/secrets/config.check'" from "(dynamic)" => "/vault/secrets/config.yml"
[INFO] (child) spawning: sh -c [ -f /vault/secrets/config.check ]
sh: missing ]

Wondering what version of the Vault agent you are running? Maybe they changed something in 1.7.0?

KaiReichart commented 3 years ago

@mattgialelis That's weird... we are using Vault agent v1.5.4, so it's very possible that this has changed in recent versions.

mattgialelis commented 3 years ago

@KaiReichart Just did some testing: 1.5.4 and 1.6.3 work fine with the command.

1.7.0 breaks it somehow; I might open another issue on it. Seems like it could break a few people's setups.

https://github.com/hashicorp/vault-k8s/issues/246

rihosims commented 3 years ago

One way might be to add a liveness probe to the app container that checks for a file to be present.

At pod init, the file is touched on a shared Vault volume. The Vault agent container then has the inject-command annotation specified, and the command removes the control file from the shared volume. The liveness probe picks that up and restarts the app container.

deploy.yaml:

.....
vault.hashicorp.com/agent-inject-command-template-name: "rm /vault/secrets/control"
.....
image: jweissig/app:0.0.1
command: ["/bin/sh"]
args:
   ['-c', 'source /vault/secrets/database-config.txt && touch /vault/secrets/control && chown 100:1000 /vault/secrets/control && /app/web']
.....
livenessProbe:
  exec:
    command:
      - cat
      - /vault/secrets/control
  initialDelaySeconds: 5
  periodSeconds: 5
.....

EDIT:

* When touching the control file you need to account for the different UIDs between the containers.

* Agent-inject-command needs the template name inside the parameter name: agent-inject-command[-template-name]

mattgialelis commented 3 years ago

@rihosims I might actually try that out; it never even crossed my mind to do it like that. I guess wouldn't there be a risk of all the pods restarting at once and causing a small outage, depending on the setup?

I actually modified the command from earlier in this thread to add a bit of a delay. I noticed that when using dynamic secrets it would cause an infinite loop of rolling restarts, so I added a small backoff, which worked perfectly. You'll just need to modify the paths and the restart command to suit; it basically just stops the constant restarting by checking that the check file's modified time is more than 10 seconds old.

 vault.hashicorp.com/agent-inject-command-config.env: "sh -c '{ if [[ -f /vault/secrets/config.check ]]; then if [ $(( $(date +%s) - $(stat -c %Y /vault/secrets/config.check) )) -gt 10 ]; then kubectl rollout restart deployment vault-agent-example ;  fi else touch /vault/secrets/config.check; fi }'"
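
Spelled out, that one-liner is doing the following (same logic, just formatted as a script for readability):

if [ -f /vault/secrets/config.check ]; then
    # Only restart if the check file is older than 10 seconds,
    # which skips the renders done while the pod is starting up
    if [ $(( $(date +%s) - $(stat -c %Y /vault/secrets/config.check) )) -gt 10 ]; then
        kubectl rollout restart deployment vault-agent-example
    fi
else
    touch /vault/secrets/config.check
fi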

rihosims commented 3 years ago

@mattgialelis

Yes, thanks for pointing that out.

I should probably state that I haven't tried this approach in production, and I can't vouch for its safety regarding all containers restarting at once, etc. However, I'm currently also racking my brain over how to do this in the easiest and sanest way possible, and I thought this might be one way. I will update this thread once I have more results.

EDIT: It seems that by default the Vault agents have their own internal interval at which they check for new key versions. When I tested, the replicas didn't restart at the same time. However, I don't know if this is ideal anyway, because when you invalidate the password the application stops functioning, so you need to get the changes rolled out quickly rather than going one by one. I guess it comes down to the specific application, how your authentication system is set up, etc., and needs to be looked at on a case-by-case basis.

SomKen commented 3 years ago

(Quoting @rihosims' liveness-probe suggestion above.)

This works, but is pretty janky. I wish there was a better method.

lungdart commented 2 years ago

Everyone is giving great advice on how to simulate this feature, but I'm wondering if this feature request itself is being considered?

It would be nice to just give an annotation to a deployment to force a rolling update on secret change without any custom injector containers w/ kubectl, Roles, Rolebindings, and matching ServiceAccounts.

lungdart commented 2 years ago

In case anyone wants to implement this feature using the command annotation: I ran into a few differences from what others reported.

Using Vault 1.9.2, the agent's init container templates the secrets, but the sidecar ALSO templates the secret on startup, resulting in not 1 but 2 command invocations that need to be ignored. This may be a bug in how the sidecar is working.

Here is a script I used to determine the deployment from the pod and update an annotation to force the update strategy to trigger:

#!/bin/sh -ex

# Vault agent will call this script every time a secret is rendered into a
#  template. For some reason, the init container renders it first, and then the
#  sidecar also re-renders it. I think this is an error on Vault's side since it
#  contradicts the documentation and purpose of the init container. If you are
#  here to debug, this is likely the issue.
LOCK1="/vault/secrets/restart.1"
LOCK2="/vault/secrets/restart.2"
LOGFILE="/vault/secrets/restart.log"
if [[ ! -f "$LOCK2" ]]; then
    if [[ ! -f "$LOCK1" ]]; then
        touch "$LOCK1"

        # Generate logs
        touch "$LOGFILE"
        echo "=======================" >> "$LOGFILE"
        echo "$(date) Initialized (1)" >> "$LOGFILE"
        echo "=======================" >> "$LOGFILE"
    else
        touch "$LOCK2"
        echo "=======================" >> "$LOGFILE"
        echo "$(date) Initialized (2)" >> "$LOGFILE"
        echo "=======================" >> "$LOGFILE"
    fi

    # Force an exit to prevent re-deploy during bootup
    exit
fi

# LOGFILE management
echo "------------" >> "$LOGFILE"
echo "$(date) Triggered" >> "$LOGFILE"
echo "------------" >> "$LOGFILE"

# Setup paths and endpoints
CA="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
SERVER="https://kubernetes.default.svc.cluster.local"

# Ingest values about kubernetes
POD="$(hostname)"
NAMESPACE="$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)"
TOKEN="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"

# Discover the pod owner that we want to trigger a restart on
POD_DATA="$(curl -s -L --cacert "$CA" -H "Authorization: Bearer $TOKEN" "$SERVER/api/v1/namespaces/$NAMESPACE/pods/$POD")"
OWNER_KIND="$(echo "$POD_DATA" | jq -r .metadata.ownerReferences[0].kind)"
OWNER_NAME="$(echo "$POD_DATA" | jq -r .metadata.ownerReferences[0].name)"
if [[ "$OWNER_KIND" == "null" || "$OWNER_KIND" == "" || "$OWNER_NAME" == "null" || "$OWNER_NAME" == "" ]]; then
    echo "[!] Couldn't find pod owner" >> "$LOGFILE"
    echo "$POD_DATA" >> $LOGFILE
    exit
else
    echo "$OWNER_KIND - $OWNER_NAME" >> "$LOGFILE"
fi

# Go deeper to see if the replicaset is controlled by a deployment
if [[ "$OWNER_KIND" == "ReplicaSet" ]]; then
    echo "[*] ReplicaSet owner, checking for deployment" >> $LOGFILE
    OWNER_DATA="$(curl -s -L --cacert "$CA" -H "Authorization: Bearer $TOKEN" "$SERVER/apis/apps/v1/namespaces/$NAMESPACE/replicasets/$OWNER_NAME")"

    OLD_OWNER_KIND="$OWNER_KIND"
    OLD_OWNER_NAME="$OWNER_NAME"
    OWNER_KIND="$(echo "$OWNER_DATA" | jq -r .metadata.ownerReferences[0].kind)"
    OWNER_NAME="$(echo "$OWNER_DATA" | jq -r .metadata.ownerReferences[0].name)"
    if [[ "$OWNER_KIND" == "null" || "$OWNER_KIND" == "" || "$OWNER_NAME" == "null" || "$OWNER_NAME" == "" ]]; then
        echo "[~] Couldn't find replicaset owner." >> "$LOGFILE"
        echo "$OWNER_DATA" >> "$LOGFILE"
        OWNER_KIND="$OLD_OWNER_KIND"
        OWNER_NAME="$OLD_OWNER_NAME"
    fi
fi

# Universal patch to force the resource to update using its update strategy
PATCH_DATA="{
    \"spec\": {
        \"template\": {
            \"metadata\": {
                \"annotations\": {
                    \"talemhealth.ai/dynamicSecretRestartAt\": \"$(date)\"
                }
            }
        }
    }
}"

# Trigger the restart (Kind based)
if [[ "$OWNER_KIND" == "Deployment" ]]; then
    echo "[*] Restarting Deployment $OWNER_NAME"
    RESULTS=$(curl -s -L \
      --cacert "$CA" \
      --header "Authorization: Bearer $TOKEN" \
      --header "Content-Type: application/strategic-merge-patch+json" \
      --data-raw "$PATCH_DATA" \
      -X PATCH "$SERVER/apis/apps/v1/namespaces/$NAMESPACE/deployments/$OWNER_NAME?pretty=true")

    echo "[*] Update triggered on Deployment pod owner" >> "$LOGFILE"
    echo "$RESULTS" >> "$LOGFILE"

elif [[ "$OWNER_KIND" == "ReplicaSet" ]]; then
    echo "[*] Restarting Deployment $OWNER_NAME"
    RESULTS=$(curl -s -L \
      --cacert "$CA" \
      --header "Authorization: Bearer $TOKEN" \
      --header "Content-Type: application/strategic-merge-patch+json" \
      --data-raw "$PATCH_DATA" \
      -X PATCH "$SERVER/apis/apps/v1/namespaces/$NAMESPACE/replicasets/$OWNER_NAME?pretty=true")

    echo "[*] Update triggered on ReplicaSet pod owner" >> "$LOGFILE"
    echo "$RESULTS" >> "$LOGFILE"

elif [[ "$OWNER_KIND" == "StatefulSet" ]]; then
    echo "[*] Restarting Deployment $OWNER_NAME"
    RESULTS=$(curl -s -L \
      --cacert "$CA" \
      --header "Authorization: Bearer $TOKEN" \
      --header "Content-Type: application/strategic-merge-patch+json" \
      --data-raw "$PATCH_DATA" \
      -X PATCH "$SERVER/apis/apps/v1/namespaces/$NAMESPACE/statefulsets/$OWNER_NAME?pretty=true")

    echo "[*] Update triggered on StatefulSet pod owner" >> "$LOGFILE"
    echo "$RESULTS" >> "$LOGFILE"
else
    echo "[!] Invalid kind: $OWNER_KIND. Aborting..." >> "$LOGFILE"
    echo "$OWNER_KIND - $OWNER_NAME" >> "$LOGFILE"
    exit
fi

You'll also need to add a Role and a RoleBinding to the ServiceAccount for the pods that allows those API requests to Kubernetes:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: restarter-serviceaccount
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: restarter-role
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "deployments", "replicasets", "statefulsets"]
  verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: restarter-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: restarter-serviceaccount
  namespace: default
roleRef:
  kind: Role
  name: restarter-role
  apiGroup: rbac.authorization.k8s.io

You'll need to extend the Vault Docker image to include curl and jq, and host it somewhere your cluster can pull from:

FROM hashicorp/vault:1.9.2

# Install new dependencies
RUN set -eux \
&&  mkdir -p /custom && chown vault:vault /custom \
&&  apk update && apk add curl jq

# Install our custom tooling
COPY --chown=vault:vault restart.sh /custom/restart.sh
RUN chmod +x /custom/restart.sh

# Match configuration of original Dockerfile: https://github.com/hashicorp/vault/blob/main/Dockerfile
ARG BIN_NAME
ARG NAME=vault
ARG VERSION
ARG TARGETOS TARGETARCH
ENV NAME=$NAME
ENV VERSION=$VERSION
VOLUME /vault/logs
VOLUME /vault/file
EXPOSE 8200
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["server", "-dev"]

Finally you can annotate the deployment with the script, and you can access the logs within the /vault/secrets directory of your main container if you need to troubleshoot.

# Example using 2048 with a useless postgres dynamic secret attached
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "2048"
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: "2048"
  revisionHistoryLimit: 3
  replicas: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/name: "2048"
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "2048"
        vault.hashicorp.com/agent-inject-secret-db: "database/creds/2048"
        vault.hashicorp.com/agent-inject-template-db: |
          {{- with secret "database/creds/2048" -}}
          postgres://{{ .Data.username }}:{{ .Data.password }}@postgres:5432/mydb?sslmode=disable
          {{- end }}
        vault.hashicorp.com/agent-inject-command-db: "/custom/restart.sh"
    spec:
      serviceAccountName: restarter-serviceaccount
      containers:
      - name: "2048"
        image: alexwhen/docker-2048
        imagePullPolicy: IfNotPresent
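
To read the log the script writes when troubleshooting, something like this should work (the names follow the example above):

kubectl exec deploy/2048 -c 2048 -- cat /vault/secrets/restart.log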

udungg commented 2 years ago

+1000, waiting for a built-in feature in Vault.

IQNeoXen commented 1 year ago

Is there any news on this? :) Having a built-in feature to restart pods/containers without dirty workarounds would be really useful imho.