GalleyBytes / terraform-operator

A Kubernetes CRD to handle terraform operations
http://tf.galleybytes.com
Apache License 2.0

Enforce model defined in tfstate #97

Open o-orand opened 2 years ago

o-orand commented 2 years ago

As a terraform-operator user, in order to ensure the tfstate is always in sync with the underlying infrastructure, and to reduce manual operations, I need a mechanism to automatically and frequently execute the terraform workflow.

Use case samples:

isaaguilar commented 2 years ago

This boils down to an automatic reconciler which is mentioned in issue https://github.com/isaaguilar/terraform-operator/issues/84.

The idea of auto reconciliation sounds nice but I'd need to put some thought into how it would actually work.

I could think of a workaround that might help.

Here's how it works:

  1. Understand that the Terraform Operator starts the terraform workflow when changes are detected on the TFO resource
  2. On an interval, update the TFO resource, for example by setting an entry in spec.env to a revision number.
  3. Updating the resource will trigger a new build.

Personally, I have used an env like the following to force trigger builds:

kind: Terraform
metadata:
  name: my-tfo-resource
spec:
  ...
  env:
  - name: _REVISION
    value: "10" # a counter or random string would work

If you have a setup like above, you should be able to write a cron or an infinite loop to change the "_REVISION".

while true; do
  kubectl patch terraform my-tfo-resource --type json -p '[
    {
      "op": "replace",
      "path": "/spec/env/0",
      "value": {"name":"_REVISION","value":"'$RANDOM'"}
    }
  ]'
  sleep 600
done

Every 10 minutes, this script will update the Terraform resource, which will trigger a new reconcile.

o-orand commented 2 years ago

Thanks @isaaguilar for the workaround. I will give it a try.

o-orand commented 2 years ago

This workaround works quite well; I've used a CronJob instead of a Job. The main drawback is persistent volume consumption, as providers are downloaded on each run. So we have to use cleanupDisk, but we may lose the root cause of an error on failure. Another alternative is to implement a custom cleanup to remove unwanted data.
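
For reference, cleanupDisk is set on the Terraform spec itself; a minimal sketch, assuming it is a boolean field that sits alongside env as in the earlier example (check the CRD schema for the exact placement):

kind: Terraform
metadata:
  name: my-tfo-resource
spec:
  ...
  cleanupDisk: true # assumed placement; clears temporary run data from the disk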

Below are some extracts of my configuration:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: tfo-reconciliation-workaround-job
spec:
  # currently triggering manually
  schedule: "1/5 * * * *"
  failedJobsHistoryLimit: 5
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 86400 #keep the job for 24 hours to access its logs
      template:
        spec:
          serviceAccountName: my-service-account
          containers:
          - name: update-tfo-reconciliation-marker
            image: bitnami/kubectl:1.21
            securityContext:
              runAsUser: 0
            command:
              - '/bin/sh'
              - '-c'
              - '/scripts/workaround.sh' #This script is the kubectl patch command you provided previously
            volumeMounts:
              - name: script
                mountPath: "/scripts/workaround.sh"
                subPath: workaround.sh
          volumes:
          - name: script
            configMap:
              name: tfo-reconciliation-workaround
              defaultMode: 0777
          restartPolicy: Never
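
The ConfigMap holding workaround.sh is not shown above; a minimal sketch, assuming the script is just the single kubectl patch from the earlier comment (with a timestamp standing in for $RANDOM, and my-tfo-resource as the example target name):

apiVersion: v1
kind: ConfigMap
metadata:
  name: tfo-reconciliation-workaround
data:
  workaround.sh: |
    #!/bin/sh
    # one-shot patch; the CronJob schedule replaces the sleep loop
    kubectl patch terraform my-tfo-resource --type json -p '[
      {
        "op": "replace",
        "path": "/spec/env/0",
        "value": {"name":"_REVISION","value":"'"$(date +%s)"'"}
      }
    ]'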

It also requires specific roles to interact with the terraform operator:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tfo-workaround-role
  namespace: <my-namespace>
rules:
  - apiGroups: [""]
    resources:
      - secrets
      - configmaps
    verbs:
      - "*"
  - apiGroups:
      - tf.isaaguilar.com
    resources:
      - terraforms
    verbs:
      - list
      - get
      - patch
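
A RoleBinding is also needed to attach this Role to the CronJob's service account; a sketch reusing my-service-account and the namespace placeholder from the snippets above (the binding name is arbitrary):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tfo-workaround-rolebinding
  namespace: <my-namespace>
subjects:
  - kind: ServiceAccount
    name: my-service-account
    namespace: <my-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tfo-workaround-role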

o-orand commented 2 years ago

Hello @isaaguilar, after running this workaround for a few weeks, we've hit a limitation: new ConfigMaps and Secrets are generated on each run and kept forever. See the sample below:

kubectl get terraforms.tf.isaaguilar.com
tf-harbor-internet   19d
tf-harbor-intranet   19d

kubectl get configmaps,secrets|grep "tf-harbor"|cut -d'-' -f1-3|sort|uniq -c
   3210 configmap/tf-harbor-internet
   2598 configmap/tf-harbor-intranet
   3788 secret/tf-harbor-internet
   2858 secret/tf-harbor-intranet

Is it possible to keep only the last xxx executions?

isaaguilar commented 2 years ago

A few hours ago I released v0.8.2, which changes the behavior of keepLatestPodsOnly so that it does much better cleanup. https://github.com/isaaguilar/terraform-operator/releases/tag/v0.8.2

kind: Terraform
metadata:
  name: my-tfo-resource
spec:
  ...
  keepLatestPodsOnly: true

That should clear out old resources and keep only the latest. The ones that got created before will need to be manually cleared unfortunately.
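
For that one-time manual cleanup, something along these lines should work; the "tf-harbor" name prefix is taken from the listing above, and the dry-run flag lets the selection be reviewed first:

# also matches resources from the latest run, so review the output,
# then drop --dry-run=client to actually delete
kubectl get configmaps,secrets -o name | grep "tf-harbor" \
  | xargs -r kubectl delete --dry-run=client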

o-orand commented 2 years ago

Thanks! I've installed the latest version and it works better.

o-orand commented 2 years ago

I've noticed that the operator is killed due to Out Of Memory, but everything seems fine in the logs.

   - containerID: containerd://8ebf83d8d5d36bf0828c4f9262fe188d98a1356cffea470c920236a2428443d4                                                                                         
     image: docker.io/isaaguilar/terraform-operator:v0.8.2
     imageID: docker.io/isaaguilar/terraform-operator@sha256:319a86bad4bb657dc06f51f5f094639f37bceca2b0dd3255e5d1354d601270b2
     lastState:
       terminated:
         containerID: containerd://8ebf83d8d5d36bf0828c4f9262fe188d98a1356cffea470c920236a2428443d4
         exitCode: 137
         finishedAt: "2022-06-10T08:32:41Z"
         reason: OOMKilled
         startedAt: "2022-06-10T08:31:56Z"
     name: terraform-operator
     ready: false
     restartCount: 167
     started: false

I will try to increase the allocated memory and see :) Nevertheless, it seems related to the number of secrets and configmaps. I've deleted the old secrets and configmaps, but the OOM is still there:

kubectl get configmaps,secrets |grep "tf-harbor"|cut -d'-' -f1-3|sort|uniq -c
      1 configmap/tf-harbor-internet
      1 configmap/tf-harbor-intranet
      2 secret/tf-harbor-internet
      1 secret/tf-harbor-intranet

isaaguilar commented 2 years ago

I'd be interested in knowing how much memory was allocated and the total number of 'tf' resources.

# total tf
kubectl get tf --all-namespaces | wc -l

Maybe also some metrics on the total number of pods, since the operator has a watch on pod events as well.

o-orand commented 2 years ago

Allocated memory was the default value (128M). I've increased it to 256M, and now the tf operator seems fine. For tf resources it's easy, we only have 2...
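
A sketch of one way to bump the limit; the deployment name and namespace here are assumptions and depend on how the operator was installed (e.g. via the Helm chart's resource values):

# adjust -n and the deployment name to match the actual install
kubectl -n tf-system set resources deployment terraform-operator \
  --limits=memory=256Mi --requests=memory=128Mi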

For the number of pods:

kubectl get pods --all-namespaces --no-headers|wc -l
103

o-orand commented 2 years ago

I'm facing another issue that makes the workaround fail: the YAML associated with the terraform resource is too big, see the message below

 2022-06-20T08:21:16.456Z    DEBUG    terraform_controller    failed to update tf status: rpc error: code = ResourceExhausted desc = trying to send message larger than max  (2097279 vs. 2097152)    {"Terraform": "10-harbor-registry/tf-harbor-intranet", "id": "8564c063-e25b-410d-9b55-ca71b50627cf"}                                          

As all generations are kept forever, the YAML keeps increasing:

status:
  exported: "false"
  lastCompletedGeneration: 0
  phase: running
  podNamePrefix: tf-harbor-internet-46ja6izd
  stages:
  - generation: 1
    interruptible: false
    podType: setup
    reason: TF_RESOURCE_CREATED
    startTime: "2022-05-20T11:09:48Z"
    state: failed
    stopTime: "2022-05-20T11:09:51Z"
  ...
  ...
  ... 
  - generation: 9897
    interruptible: true
    podType: post
    reason: ""

isaaguilar commented 2 years ago

Thanks @o-orand. I knew this would soon be an issue, and I haven't thought of a good way to handle it yet. I figured using an existing option, like keepLatestPodsOnly, should clean up the status automatically. The downside is that pod status history is sort of lost... users who log k8s events can still see pod statuses.

Another idea, and possibly one I'll investigate (after the kids go back to school 😅), is using the PVC to store runner status and terraform logs. This data will be formatted to be fed into a tfo dashboard. More on this to come.

For an immediate fix, perhaps we should keep the status of only the last n generations, in case someone is using the generation status feature for some reason. I'll continue forming ideas.

infinitydon commented 3 months ago

@isaaguilar - Checking up on this thread, is the above workaround still the only way for periodic reconciliation?

Thanks