flux-iac / tofu-controller

A GitOps OpenTofu and Terraform controller for Flux
https://flux-iac.github.io/tofu-controller/
Apache License 2.0

tf-runner pod won't terminate #1436

Open cunningr opened 2 months ago

cunningr commented 2 months ago

I'm just trying out the tofu-controller and have installed the following alongside Flux v1.2.3:

tofu-controller:v0.16.0-rc.4
tf-runner:v0.16.0-rc.4

I am trying some very basic operations using the ready-to-use AWS package; in essence, I'm just trying to create an S3 bucket.

I can get the bucket to create OK, but after the apply the runner is stuck in Terminating:

NAME                                       READY   STATUS        RESTARTS   AGE
aws-s3-bucket-tf-runner                    0/1     Terminating   0          48m

It seems to stay like this indefinitely. Then, when I try to delete the Terraform resource (kubectl delete terraform my-s3-bucket), the tf-controller logs show that it is stuck waiting for the previous runner to terminate. If I delete the runner pod manually, it does eventually proceed (though I guess other bad things can be expected to happen after a manual deletion).

ilithanos commented 1 month ago

Thanks for reporting this issue. I'm not sure why it is happening, though.

Any chance you have some specific reproduction steps, so that we can debug this?

I haven't ever experienced the runners hanging in a Terminating state, so I would like to find the reason this is happening to you.
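
If it helps while collecting those details, here is a minimal sketch (not part of tofu-controller; the function name `summarize_terminating_pod` is made up for illustration) that pulls out the fields of a pod manifest that usually explain a hang in Terminating: the deletion deadline, any finalizers, and the current container states. It assumes the input is the dict you get from parsing `kubectl get pod <name> -o json`, and it only reads standard Pod fields.

```python
from datetime import datetime, timezone


def summarize_terminating_pod(pod, now=None):
    """Summarize the deletion-related fields of a parsed Pod manifest.

    `pod` is assumed to be the dict parsed from `kubectl get pod <name> -o json`.
    The function name and the keys of the returned dict are illustrative only.
    """
    now = now or datetime.now(timezone.utc)
    meta = pod.get("metadata", {})
    deadline = meta.get("deletionTimestamp")

    summary = {
        "name": meta.get("name"),
        # A pod shows as Terminating once deletionTimestamp is set.
        "terminating": deadline is not None,
        # Any finalizer left here blocks removal of the object.
        "finalizers": meta.get("finalizers", []),
        "grace_period_seconds": meta.get("deletionGracePeriodSeconds"),
        "seconds_past_deadline": None,
        "container_states": {},
    }

    if deadline:
        # deletionTimestamp is the time by which the pod should be gone
        # (delete request time plus grace period), e.g. 2024-01-01T12:00:00Z.
        ts = datetime.strptime(deadline, "%Y-%m-%dT%H:%M:%SZ")
        ts = ts.replace(tzinfo=timezone.utc)
        summary["seconds_past_deadline"] = int((now - ts).total_seconds())

    for cs in pod.get("status", {}).get("containerStatuses", []):
        # Each state object has a single key: waiting, running or terminated.
        state = next(iter(cs.get("state") or {}), "unknown")
        summary["container_states"][cs["name"]] = state

    return summary
```

For the pod in this report, the input would be the parsed output of `kubectl get pod aws-s3-bucket-tf-runner -o json`. A non-empty `finalizers` list, or a container still reported as `running` long past the deadline, would each point to a different cause, so posting that summary alongside the controller logs should narrow things down.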