@schlichtanders Before today, what version were you using? 2.6.1 has been unchanged for a week.
You want to make sure your project is clean before starting, especially if using the same cluster name. To make sure it's clean, just use our cleanupkh helper script; see the readme.
I was using version 2.2.0 some time before switching to master.
I have now deleted and cleaned up everything again; the described error is still there.
@schlichtanders Ok, thanks for the info, I will give it a shot ASAP then. Please have a look at the k3s logs, SSH into the failing host where the kustomization is running, and debug the issue if you can (see the debug section in the readme); that would help a lot.
Btw, the cleanupkh script is supposed to delete everything, including the LB. I have never seen it fail at that. But it must be run: at the point where you see the terraform destroy stalling, you open a new terminal tab and run it in parallel.
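A minimal sketch of that flow, assuming cleanupkh is the cleanup helper alias set up as described in the readme:

```sh
# Terminal 1: the destroy that stalls (typically on the LB)
terraform destroy

# Terminal 2: once the destroy stalls, run the cleanup helper in parallel
cleanupkh
```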
@schlichtanders I just deployed my test cluster with v2.6.1 without issues. I suggest you check the content of your config.yaml and try running the kustomization script manually. You will see the location in your failed logs; helpfully, terraform creates a tmp file and shows you where it lives, so you can invoke it manually after SSH'ing into the first control plane.
Then you can check the value of the config.yaml located in /etc/rancher/k3s, and also inspect the status and logs of the k3s service using systemctl and journalctl respectively.
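A rough sketch of those debug steps, assuming the first control plane is reachable as root and the service is named k3s; the IP and tmp script name below are placeholders, take the real path from your failed terraform logs:

```sh
# SSH into the first control plane node (IP is a placeholder)
ssh root@<control-plane-ip>

# Re-run the kustomization script manually; terraform uploads it as a
# tmp file and the failed logs show the real path (name illustrative)
sh /tmp/terraform_XXXXXX.sh

# Check the k3s config that was written
cat /etc/rancher/k3s/config.yaml

# Inspect the status and logs of the k3s service
systemctl status k3s
journalctl -u k3s --no-pager | tail -n 100
```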
So in other words, the issue is not with the branch, but probably with the new code. Keeping this open for now so we can debug more together.
Thank you so much for your support. Knowing that it works for someone else is always the most important first step.
I now got it to run all the way through. What I changed:
Very impressive that an older version of CSI, or whatever it was, led to this behaviour.
Closing this issue and working further on the k3s_token branch now :)
Btw, the cleanupkh script is supposed to delete everything, including the LB. I have never seen it fail at that. But it must be run: at the point where you see the terraform destroy stalling, you open a new terminal tab and run it in parallel.
I found out the reason: I created a test cluster for testing the backup restoration. The cleanup script demands that the right hcloud context is activated, which I forgot (it was still referring to the old hcloud context; luckily I had changed the cluster name, which prevented me from accidentally destroying my whole production cluster :smiling_face_with_tear:).
Actively verifying and displaying the context name, that's an important thing to add to that script in the future, thanks for sharing! 🙏
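For illustration, a minimal sketch of such a guard using the hcloud CLI; the prompt wording is hypothetical, not the script's actual code:

```sh
# Show the active hcloud context and require explicit confirmation
active_context=$(hcloud context active)
echo "Active hcloud context: ${active_context}"
read -r -p "Really clean up resources in '${active_context}'? [y/N] " answer
[ "$answer" = "y" ] || { echo "Aborting."; exit 1; }
```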
Description
Hello, I just tried the current master (which is also released as 2.6.1) and can no longer create a cluster; it outputs:
It has never taken this long to create the kustomization, so I guess something is wrong.
Kube.tf file
Screenshots
No response
Platform
Linux