Closed ggruening closed 3 months ago
@mysticaltech any idea where the issue is? updates in cloud-init? or Hetzner api? This is also affecting me currently
Will have a look today
Thanks for the debug @ggruening. Appreciate the details.
Folks, I tried on my end with cx21 and cpx11 with both an old image and a newly rebuilt one. In both instances everything works fine.
So please try to debug more, especially in your case @ggruening, for instance, SSH into the server and try running the failed script /tmp/terraform_1260129824.sh manually. Also inspect its content.
Last but not least, follow all the debug tips in the debug section in the readme, like fetching the journalctl error logs.
Response from gpt-4o, it might help.
Hello @ggruening and @maaft ,
I've been following this thread closely and noticed the issue with cloud-init failing to parse certain characters in your YAML configuration on openSUSE MicroOS. I'd like to suggest a few steps that might help in diagnosing and potentially resolving this issue:
The error related to an "unacceptable character" in your YAML file suggests a potential character encoding issue or a hidden/special character that isn't easily visible:
Since the environment is based on openSUSE MicroOS, you might need to be cautious about file immutability and system states:
sudo cloud-init clean
sudo cloud-init init
sudo cloud-init modules --mode config
sudo cloud-init modules --mode final
This can help refresh the configuration and apply the settings afresh.
sudo journalctl -u cloud-init
This command shows the logged output of the cloud-init unit, so you can see what cloud-init is doing and where it might be failing (add -f to follow it in real time).
Understanding what data cloud-init is handling could provide insights into potential misconfigurations:
sudo cloud-init query userdata
sudo cloud-init query --all
These steps should provide a comprehensive approach to troubleshooting the issue at hand. If the problem persists, it might be beneficial to isolate the configuration to a simpler setup to determine if the issue is systemic or configuration-specific.
Hope this helps! Looking forward to hearing how it goes.
If you can share the full cloud-init logs and especially the user-data (probably encoded in base64) it would be great.
Hey @mysticaltech (gpt-4o :) ),
So please try to debug more, especially in your case @ggruening, for instance, SSH into the server and try running the failed script /tmp/terraform_1260129824.sh manually. Also inspect its content.
There is no /tmp/terraform_1260129824.sh on the servers. Maybe things were cleaned up when terraform ended?
So please try to debug more
I will. I don't have much time right now, but I'll get back as soon as possible - probably in a few hours. Thanks for thinking along.
Gregor
@mysticaltech thank you for looking into this.
How can I view the logs when I cannot SSH into the node?
I tried with Hetzner rescue mode and mounting the MicroOS partition (/dev/sda3), but somehow the /var folder is completely empty (and therefore I cannot see any logs).
As this is also linked on forum.hetzner: I tried with Terraform 1.8.5 (which is out of date; at the time of writing, 1.9.3 is the latest) and the config provided by @ggruening, which I reformatted before testing, and it worked.
I also used Packer v1.11.0 and kubectl 1.30.3.
Another difference: @ggruening, afaik, used ssh-agent, whereas I provided a non-null private SSH key.
@maaft @ggruening What terraform / tofu version did you use? Does it help to paste your current config into ChatGPT and ask it to summarize or clean up the comments (a lazy way of getting rid of bad characters)?
@mysticaltech @maaft @cztk,
I'm one step further, I don't know how big it is yet.
I logged into the server, exported the config (cloud-init query --all > cloud-init.query.all) and then jumped to the faulty location 6908. There was a description (!) of an SSH key stored at Hetzner, which I referenced in the kube.tf using its id. In my case it was "Key f\u22r k8s stack", with a wrongly encoded German umlaut "ü". That must have thrown it off. @maaft, check the descriptions (!) of your SSH keys if necessary.
Unfortunately, terraform keeps crashing, though. This time with:
╷
│ Error: remote-exec provisioner error
│
│ with module.kube-hetzner.null_resource.first_control_plane,
│ on .terraform/modules/kube-hetzner/init.tf line 70, in resource "null_resource" "first_control_plane":
│ 70: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1986346640.sh": Process exited with status 127
Something completely different. Any ideas? I'm still looking!
Best regards Gregor
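A note on reading that status code: in POSIX shells, exit status 127 conventionally means "command not found", so one plausible reading is that the generated script called a binary that was not on the PATH of the remote session. The small sketch below only demonstrates the convention; `exit_status_of` is a hypothetical helper, not part of Terraform or kube-hetzner.

```bash
#!/usr/bin/env bash

# Run a command line in a child shell and print only its exit status.
exit_status_of() {
  local rc=0
  sh -c "$1" >/dev/null 2>&1 || rc=$?
  echo "$rc"
}

exit_status_of 'true'                      # prints 0
exit_status_of 'exit 1'                    # prints 1
exit_status_of 'no_such_command_for_demo'  # prints 127 (command not found)
```

Running the failing script by hand with `bash -x` on the node, as suggested earlier in the thread, should show which command triggers the 127.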
I upgraded my hcloud cli to the latest version and did a new test run with the config @ggruening provided. I also tested with an old RSA key; it worked.
I was not able to reproduce your issue :(
I'm using OpenTofu v1.8.1. Same issue with Terraform v1.9.2.
But I used these versions 3 weeks ago, exact same kube.tf config and everything worked. Something must have changed in the meantime.
Just retried with an RSA key (instead of ed25519) for SSH and an updated hcloud cli. Same issue: I cannot SSH into the nodes, although they are online.
What I still find very weird: when I enable rescue mode and mount the MicroOS partition, /var is completely empty. Should this be the case?! I currently have no way to check any logs, which obviously makes debugging very hard. Would be nice if anyone has an idea here so I can be more productive wrt solving the issue :)
Hmm, now at least sshd is no longer crashing for me, so I can log in to the servers. What I changed:
will try to narrow it down
Also, the rest of the cluster now bootstrapped normally but with one exception:
Somehow one of the agent nodes went straight into emergency mode. I have observed this also multiple times before already when bootstrapping a cluster! Reboot of that node helps.
Now I get a different (but similar) error that also @ggruening has reported:
╷
│ Error: remote-exec provisioner error
│
│ with module.kube-hetzner.null_resource.first_control_plane,
│ on .terraform/modules/kube-hetzner/init.tf line 70, in resource "null_resource" "first_control_plane":
│ 70: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_459919302.sh": Process exited with status 1
Since I can now SSH into the nodes, I'm going to dig deeper and hopefully find something.
@mysticaltech here is what I get when executing the script in question manually:
control-plane-fsn1-pwv:/tmp # bash terraform_707218234.sh
+ /etc/cloud/rename_interface.sh
+ mkdir -p /etc/rancher/k3s
+ '[' -f /tmp/config.yaml ']'
+ chmod 0600 /etc/rancher/k3s/config.yaml
+ '[' -e /etc/rancher/k3s/k3s.yaml ']'
+ cat
+ set -a
+ source /etc/environment
+ set +a
+ cat
+ cat
+ timeout 180s /bin/sh -c 'while ! ping -c 1 1.1.1.1 >/dev/null 2>&1; do echo "Ready for k3s installation, waiting for a successful connection to the internet..."; sleep 5; done; echo Connected'
Connected
+ curl -sfL https://get.k3s.io
+ INSTALL_K3S_SKIP_START=true
+ INSTALL_K3S_SKIP_SELINUX_RPM=true
+ INSTALL_K3S_CHANNEL=v1.29
+ INSTALL_K3S_EXEC='server '
+ sh -
[INFO] Finding release for channel v1.29
[INFO] Using v1.29.7+k3s1 as release
[INFO] Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.29.7+k3s1/sha256sum-amd64.txt
[INFO] Skipping binary downloaded, installed k3s matches hash
[INFO] Skipping installation of SELinux RPM
[INFO] Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO] Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO] Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink '/etc/systemd/system/multi-user.target.wants/k3s.service' → '/etc/systemd/system/k3s.service'.
+ /sbin/semodule -v -i /usr/share/selinux/packages/k3s.pp
Attempting to install module '/usr/share/selinux/packages/k3s.pp':
libsemanage.map_compressed_file: Unable to open /usr/share/selinux/packages/k3s.pp
(No such file or directory).
libsemanage.semanage_direct_install_file: Unable to read file /usr/share/selinux/packages/k3s.pp
(No such file or directory).
/sbin/semodule: Failed on /usr/share/selinux/packages/k3s.pp!
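The trace stops because `/sbin/semodule` cannot open the policy file. A quick way to confirm that on a node is to test for the file directly. This is a minimal diagnostic sketch only: `check_policy_file` is a hypothetical helper, and the default path is simply the one from the log above.

```bash
#!/usr/bin/env bash

# Report whether the SELinux policy module expected by the install step is readable.
# Usage: check_policy_file [path]   (defaults to the path from the log output)
check_policy_file() {
  local pp="${1:-/usr/share/selinux/packages/k3s.pp}"
  if [ -r "$pp" ]; then
    echo "present: $pp"
  else
    echo "missing: $pp"
    return 1
  fi
}
```

If it reports the file as missing, `rpm -q k3s-selinux` may show whether the package that normally provides it is installed in the image. That package name is an assumption here, not something confirmed in this thread.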
Heureka! In the end it works now!
I discovered two (?) more errors:
In my private SSH key I also had a comment with a German umlaut. I corrected that (but didn't test it again straight away, so I'm not sure if that really was a problem, too).
I destroyed the entire stack several times and rebuilt it completely (terraform destroy and terraform apply -auto-approve). I had hoped that this would give me a clean, clear test situation. But it turns out that Hetzner sometimes just gives you the same IPs (IPv4) again. The fingerprint of the new server is of course different from the one in my local known_hosts file. This prevents automatic login for fear of a "man-in-the-middle" attack. So I cleaned up with ssh-keygen -f "/home/ggruening/.ssh/known_hosts" -R "XXX.XXX.XXX.XXX", successfully, for all the IPs that I had before.
And lo and behold: the script now actually runs through (@cztk I continue to use the ssh-agent, by the way).
[!IMPORTANT] Summary: 1) Avoid comments in SSH keys with non-ASCII characters (both at Hetzner and locally). 2) Always delete the already known IPs from known_hosts when the servers are newly set up. (I'm not quite sure how to find them safely, though: how am I supposed to know whether I've ever had this or that IP before, especially since it only becomes known during the setup process? Does anyone have an idea?)
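One possible answer to the known_hosts question: there is no need to remember which IPs were seen before, because `ssh-keygen -R` can simply be run for every IP the new servers received. The sketch below wraps that in a loop. `forget_hosts` is a hypothetical helper name, and how the IP list is obtained is left to the caller.

```bash
#!/usr/bin/env bash

# Remove stale host keys for the given IPs from a known_hosts file.
# Usage: forget_hosts <known_hosts file> <ip> [<ip> ...]
forget_hosts() {
  local known_hosts="$1"
  shift
  local ip
  for ip in "$@"; do
    # -R deletes every key recorded for this host, including hashed entries;
    # ssh-keygen keeps the previous contents next to the file as <file>.old.
    ssh-keygen -f "$known_hosts" -R "$ip" >/dev/null 2>&1 \
      || echo "could not clean $ip in $known_hosts" >&2
  done
}
```

A possible call, assuming the hcloud CLI is installed and configured for the project, is `forget_hosts ~/.ssh/known_hosts $(hcloud server list -o noheader -o columns=ipv4)`. Asking to remove an IP that was never recorded should leave the file untouched, so running it for all current IPs after each rebuild is a reasonably safe habit.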
Phew, that was exhausting. Thank you all (@maaft @mysticaltech @cztk) very much for thinking along with me!
@maaft, please tell us here whether this thought also solves your problem. I'll leave the issue open, feel free to close it.
Gregor
@ggruening unfortunately this did not help.
/sbin/semodule: Failed on /usr/share/selinux/packages/k3s.pp errors on first control-plane. Will wait for @mysticaltech to come back, I'm out of ideas.
1. different ssh port than 22 still completely wrecks everything
@maaft Have you thought about creating a firewall rule for your particular ssh port?
Something like that in your kube.tf might help:
extra_firewall_rules = [
  {
    description     = "For special ssh port"
    direction       = "in"
    protocol        = "tcp"
    port            = "<whatever>"
    source_ips      = ["0.0.0.0/0", "::/0"]
    destination_ips = [] # Won't be used for this rule
  }
]
The port that you specify in kube.tf will be automatically configured as open inside the Hetzner firewall (yes, I checked that).
When specifying a port other than 22, sshd will fail to start, so no SSH access is possible. But this was not the case 3 weeks ago, and I have 4 clusters in total with a non-22 port that are running successfully.
So I stay with my assumption that something must have changed in this time that is not in our control.
@maaft
Maybe it is exactly that: the firewall rule is no longer created automatically. I just ran my stack with my original kube.tf and it also works with a different ssh port, but I opened it manually as described above...
No, I meant that I checked whether the configured port xxxx was present in Hetzner's firewall (via console.hetzner.cloud), and it was. So I'm pretty sure I don't have any FW issues.
Anyway, different error with port 22, so I'll focus on that one first.
Okay, finally got it.
Maybe it was the same ssh known_hosts issue that you experienced.
What I finally did was:
Maybe it was the same ssh known_hosts issue that you experienced.
@maaft Fine. I have highlighted my "solution" in color above. HTH.
Great work people! Thanks for the hand @cztk.
Description
Hello everyone, First of all, I would like to thank you for the fantastic work. That's not just a polite saying, this project is really great!
But I have the problem that the cloud-init.service won't start. The error in journalctl is:
Digging a little deeper you find:
This seems to be the reason why terraform is dying like this:
I followed the instructions in the README and only made the necessary adjustments in the kube.tf (ssh keys, reduced the number of servers), see kube.tf.
What can I do? Where does this "unacceptable character #xdcfc" (see above) come from?
Best wishes Gregor
Kube.tf file
kube.tf
```terraform
locals {
  # You have the choice of setting your Hetzner API token here or define the TF_VAR_hcloud_token env
  # within your shell, such as: export TF_VAR_hcloud_token=xxxxxxxxxxx
  # If you choose to define it in the shell, this can be left as is.

  # Your Hetzner token can be found in your Project > Security > API Token (Read & Write is required).
  hcloud_token = "xxxxxxxxxxx"
}

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token

  # Then fill or edit the below values. Only the first values starting with a * are obligatory; the rest can remain with their default values, or you
  # could adapt them to your needs.

  # * For local dev, path to the git repo
  # source = "../../kube-hetzner/"
  # If you want to use the latest master branch
  # source = "github.com/kube-hetzner/terraform-hcloud-kube-hetzner"
  # For normal use, this is the path to the terraform registry
  source = "kube-hetzner/kube-hetzner/hcloud"

  # You can optionally specify a version number
  # version = "1.2.0"

  # Note that some values, notably "location" and "public_key" have no effect after initializing the cluster.
  # This is to keep Terraform from re-provisioning all nodes at once, which would lose data. If you want to update
  # those, you should instead change the value here and manually re-provision each node. Grep for "lifecycle".

  # Customize the SSH port (by default 22)
  # ssh_port = 2222

  # * Your ssh public key
  ssh_public_key = file("/home/ggruening/.ssh/ggruening+2023-12@hetzner-k3s001.pub")
  # * Your private key must be "ssh_private_key = null" when you want to use ssh-agent for a Yubikey-like device authentication or an SSH key-pair with a passphrase.
  # For more details on SSH see https://github.com/kube-hetzner/kube-hetzner/blob/master/docs/ssh.md
  # ssh_private_key = file("~/.ssh/id_ed25519")
  ssh_private_key = null
  # You can add additional SSH public Keys to grant other team members root access to your cluster nodes.
  # ssh_additional_public_keys = []

  # You can also add additional SSH public Keys which are saved in the hetzner cloud by a label.
  # See https://docs.hetzner.cloud/#label-selector
  # ssh_hcloud_key_label = "role=admin"

  # If you use SSH agent and have issues with SSH connecting to your nodes, you can increase the number of auth tries (default is 2)
  # ssh_max_auth_tries = 10

  # If you want to use an ssh key that is already registered within hetzner cloud, you can pass its id.
  # If no id is passed, a new ssh key will be registered within hetzner cloud.
  # It is important that exactly this key is passed via `ssh_public_key` & `ssh_private_key` variables.
  hcloud_ssh_key_id = "22401398"

  # These can be customized, or left with the default values
  # * For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
  network_region = "eu-central" # change to `us-east` if location is ash

  # If you want to create the private network before calling this module,
  # you can do so and pass its id here. For example if you want to use a proxy
  # which only listens on your private network. Advanced use case.
  #
  # NOTE1: make sure to adapt network_ipv4_cidr, cluster_ipv4_cidr, and service_ipv4_cidr accordingly.
  # If your network is created with 10.0.0.0/8, and you use subnet 10.128.0.0/9 for your
  # non-k3s business, then adapting `network_ipv4_cidr = "10.0.0.0/9"` should be all you need.
  #
  # NOTE2: square brackets! This must be a list of length 1.
  #
  # existing_network_id = [hcloud_network.your_network.id]

  # If you must change the network CIDR you can do so below, but it is highly advised against.
  # network_ipv4_cidr = "10.0.0.0/8"

  # Using the default configuration you can only create a maximum of 42 agent-nodepools.
  # This is due to the creation of a subnet for each nodepool with CIDRs being in the shape of 10.[nodepool-index].0.0/16 which collides with k3s' cluster and service IP ranges (defaults below).
  # Furthermore the maximum number of nodepools (controlplane and agent) is 50, due to a hard limit of 50 subnets per network, see https://docs.hetzner.com/cloud/networks/faq/.
  # So to be able to create a maximum of 50 nodepools in total, the values below have to be changed to something outside that range, e.g. `10.200.0.0/16` and `10.201.0.0/16` for cluster and service respectively.

  # If you must change the cluster CIDR you can do so below, but it is highly advised against.
  # Never change this value after you already initialized a cluster. Complete cluster redeploy needed!
  # The cluster CIDR must be a part of the network CIDR!
  # cluster_ipv4_cidr = "10.42.0.0/16"

  # If you must change the service CIDR you can do so below, but it is highly advised against.
  # Never change this value after you already initialized a cluster. Complete cluster redeploy needed!
  # The service CIDR must be a part of the network CIDR!
  # service_ipv4_cidr = "10.43.0.0/16"

  # If you must change the service IPv4 address of core-dns you can do so below, but it is highly advisd against.
  # Never change this value after you already initialized a cluster. Complete cluster redeploy needed!
  # The service IPv4 address must be part of the service CIDR!
  # cluster_dns_ipv4 = "10.43.0.10"

  # For the control planes, at least three nodes are the minimum for HA. Otherwise, you need to turn off the automatic upgrades (see README).
  # **It must always be an ODD number, never even!** Search the internet for "split-brain problem with etcd" or see https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
  # For instance, one is ok (non-HA), two is not ok, and three is ok (becomes HA). It does not matter if they are in the same nodepool or not! So they can be in different locations and of various types.

  # Of course, you can choose any number of nodepools you want, with the location you want. The only constraint on the location is that you need to stay in the same network region, Europe, or the US.
  # For the server type, the minimum instance supported is cx22. The cax11 provides even better value for money if your applications are compatible with arm64; see https://www.hetzner.com/cloud.

  # IMPORTANT: Before you create your cluster, you can do anything you want with the nodepools, but you need at least one of each, control plane and agent.
  # Once the cluster is up and running, you can change nodepool count and even set it to 0 (in the case of the first control-plane nodepool, the minimum is 1).
  # You can also rename it (if the count is 0), but do not remove a nodepool from the list.

  # You can safely add or remove nodepools at the end of each list. That is due to how subnets and IPs get allocated (FILO).
  # The maximum number of nodepools you can create combined for both lists is 50 (see above).
  # Also, before decreasing the count of any nodepools to 0, it's essential to drain and cordon the nodes in question. Otherwise, it will leave your cluster in a bad state.

  # Before initializing the cluster, you can change all parameters and add or remove any nodepools. You need at least one nodepool of each kind, control plane, and agent.
  # ⚠️ The nodepool names are entirely arbitrary, but all lowercase, no special characters or underscore (dashes are allowed), and they must be unique.

  # If you want to have a single node cluster, have one control plane nodepools with a count of 1, and one agent nodepool with a count of 0.

  # Please note that changing labels and taints after the first run will have no effect. If needed, you can do that through Kubernetes directly.

  # Multi-architecture clusters are OK for most use cases, as container underlying images tend to be multi-architecture too.

  # * Example below:

  control_plane_nodepools = [
    {
      name        = "control-plane-fsn1",
      server_type = "cx22",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1
      # swap_size   = "2G" # remember to add the suffix, examples: 512M, 1G
      # zram_size   = "2G" # remember to add the suffix, examples: 512M, 1G
      # kubelet_args = ["kube-reserved=cpu=250m,memory=1500Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]

      # Fine-grained control over placement groups (nodes in the same group are spread over different physical servers, 10 nodes per placement group max):
      # placement_group = "default"

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "control-plane-nbg1",
      server_type = "cx22",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 1

      # Fine-grained control over placement groups (nodes in the same group are spread over different physical servers, 10 nodes per placement group max):
      # placement_group = "default"

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "control-plane-hel1",
      server_type = "cx22",
      location    = "hel1",
      labels      = [],
      taints      = [],
      count       = 1

      # Fine-grained control over placement groups (nodes in the same group are spread over different physical servers, 10 nodes per placement group max):
      # placement_group = "default"

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    }
  ]

  agent_nodepools = [
    {
      name        = "agent-small",
      server_type = "cx22",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 0
      # swap_size   = "2G" # remember to add the suffix, examples: 512M, 1G
      # zram_size   = "2G" # remember to add the suffix, examples: 512M, 1G
      # kubelet_args = ["kube-reserved=cpu=50m,memory=300Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]

      # Fine-grained control over placement groups (nodes in the same group are spread over different physical servers, 10 nodes per placement group max):
      # placement_group = "default"

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "agent-large",
      server_type = "cx32",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 0

      # Fine-grained control over placement groups (nodes in the same group are spread over different physical servers, 10 nodes per placement group max):
      # placement_group = "default"

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "storage",
      server_type = "cx32",
      location    = "fsn1",
      # Fully optional, just a demo.
      labels = [
        "node.kubernetes.io/server-usage=storage"
      ],
      taints = [],
      count  = 0

      # In the case of using Longhorn, you can use Hetzner volumes instead of using the node's own storage by specifying a value from 10 to 10000 (in GB)
      # It will create one volume per node in the nodepool, and configure Longhorn to use them.
      # Something worth noting is that Volume storage is slower than node storage, which is achieved by not mentioning longhorn_volume_size or setting it to 0.
      # So for something like DBs, you definitely want node storage, for other things like backups, volume storage is fine, and cheaper.
      # longhorn_volume_size = 20

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    # Egress nodepool useful to route egress traffic using Hetzner Floating IPs (https://docs.hetzner.com/cloud/floating-ips)
    # used with Cilium's Egress Gateway feature https://docs.cilium.io/en/stable/gettingstarted/egress-gateway/
    # See the https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#examples for an example use case.
    {
      name        = "egress",
      server_type = "cx22",
      location    = "fsn1",
      labels = [
        "node.kubernetes.io/role=egress"
      ],
      taints = [
        "node.kubernetes.io/role=egress:NoSchedule"
      ],
      floating_ip = true
      count       = 0
    },
    # Arm based nodes
    {
      name        = "agent-arm-small",
      server_type = "cax11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 0
    },
    # For fine-grained control over the nodes in a node pool, replace the count variable with a nodes map.
    # In this case, the node-pool variables are defaults which can be overridden on a per-node basis.
    # Each key in the nodes map refers to a single node and must be an integer string ("1", "123", ...).
    # {
    #   name        = "agent-arm-small",
    #   server_type = "cax11",
    #   location    = "fsn1",
    #   labels      = [],
    #   taints      = [],
    #   nodes = {
    #     "1" : {
    #       location = "nbg1"
    #       labels = [
    #         "testing-labels=a1",
    #       ]
    #     },
    #     "20" : {
    #       labels = [
    #         "testing-labels=b1",
    #       ]
    #     }
    #   }
    # },
  ]

  # Add custom control plane configuration options here.
  # E.g to enable monitoring for etcd, proxy etc:
  # control_planes_custom_config = {
  #   etcd-expose-metrics = true,
  #   kube-controller-manager-arg = "bind-address=0.0.0.0",
  #   kube-proxy-arg ="metrics-bind-address=0.0.0.0",
  #   kube-scheduler-arg = "bind-address=0.0.0.0",
  # }

  # You can enable encrypted wireguard for the CNI by setting this to "true". Default is "false".
  # FYI, Hetzner says "Traffic between cloud servers inside a Network is private and isolated, but not automatically encrypted."
  # Source: https://docs.hetzner.com/cloud/networks/faq/#is-traffic-inside-hetzner-cloud-networks-encrypted
  # It works with all CNIs that we support.
  # Just note, that if Cilium with cilium_values, the responsibility of enabling of disabling Wireguard falls on you.
  # enable_wireguard = true

  # * LB location and type, the latter will depend on how much load you want it to handle, see https://www.hetzner.com/cloud/load-balancer
  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"

  # Disable IPv6 for the load balancer, the default is false.
  # load_balancer_disable_ipv6 = true

  # Disables the public network of the load balancer. (default: false).
  # load_balancer_disable_public_network = true

  # Specifies the algorithm type of the load balancer. (default: round_robin).
  # load_balancer_algorithm_type = "least_connections"

  # Specifies the interval at which a health check is performed. Minimum is 3s (default: 15s).
  # load_balancer_health_check_interval = "5s"

  # Specifies the timeout of a single health check. Must not be greater than the health check interval. Minimum is 1s (default: 10s).
  # load_balancer_health_check_timeout = "3s"

  # Specifies the number of times a health check is retried before a target is marked as unhealthy. (default: 3)
  # load_balancer_health_check_retries = 3

  ### The following values are entirely optional (and can be removed from this if unused)

  # You can refine a base domain name to be use in this form of nodename.base_domain for setting the reserve dns inside Hetzner
  # base_domain = "mycluster.example.com"

  # Cluster Autoscaler
  # Providing at least one map for the array enables the cluster autoscaler feature, default is disabled
  # By default we set a compatible version with the default initial_k3s_channel, to set another one,
  # have a look at the tag value in https://github.com/kubernetes/autoscaler/blob/master/charts/cluster-autoscaler/values.yaml
  # ⚠️ Based on how the autoscaler works with this project, you can only choose either x86 instances or ARM server types for ALL autoscaler nodepools.
  # If you are curious, it's ok to have a multi-architecture cluster, as most underlying container images are multi-architecture too.
  #
  # ⚠️ Setting labels and taints will only work on cluster-autoscaler images versions released after > 20 October 2023. Or images built from master after that date.
  #
  # * Example below:
  # autoscaler_nodepools = [
  #   {
  #     name        = "autoscaled-small"
  #     server_type = "cx32"
  #     location    = "fsn1"
  #     min_nodes   = 0
  #     max_nodes   = 5
  #     labels      = {
  #       "node.kubernetes.io/role": "peak-workloads"
  #     }
  #     taints      =
  #     [{
  #       key: "node.kubernetes.io/role"
  #       value: "peak-workloads"
  #       effect: "NoExecute"
  #     }]
  #
  #     kubelet_args = ["kube-reserved=cpu=250m,memory=1500Mi,ephemeral-storage=1Gi", "system-reserved=cpu=250m,memory=300Mi"]
  #   }
  # ]

  # ⚠️ Deprecated, will be removed after a new Cluster Autoscaler version has been released which support the new way of setting labels and taints. See above.
  # Add extra labels on nodes started by the Cluster Autoscaler
  # This argument is not used if autoscaler_nodepools is not set, because the Cluster Autoscaler is installed only if autoscaler_nodepools is set
  # autoscaler_labels = [
  #   "node.kubernetes.io/role=peak-workloads"
  # ]

  # Add extra taints on nodes started by the Cluster Autoscaler
  # This argument is not used if autoscaler_nodepools is not set, because the Cluster Autoscaler is installed only if autoscaler_nodepools is set
  # autoscaler_taints = [
  #   "node.kubernetes.io/role=specific-workloads:NoExecute"
  # ]

  # Configuration of the Cluster Autoscaler binary
  #
  # These arguments and variables are not used if autoscaler_nodepools is not set, because the Cluster Autoscaler is installed only if autoscaler_nodepools is set.
  #
  # Image and version of Kubernetes Cluster Autoscaler for Hetzner Cloud:
  #   - cluster_autoscaler_image: Image of Kubernetes Cluster Autoscaler for Hetzner Cloud to be used.
  #   - cluster_autoscaler_version: Version of Kubernetes Cluster Autoscaler for Hetzner Cloud. Should be aligned with Kubernetes version.
  #
  # Logging related arguments are managed using separate variables:
  #   - cluster_autoscaler_log_level: Controls the verbosity of logs (--v), the value is from 0 to 5, default is 4, for max debug info set it to 5.
  #   - cluster_autoscaler_log_to_stderr: Determines whether to log to stderr (--logtostderr).
  #   - cluster_autoscaler_stderr_threshold: Sets the threshold for logs that go to stderr (--stderrthreshold).
  #
  # Server/node creation timeout variable:
  #   - cluster_autoscaler_server_creation_timeout: Sets the timeout (in minutes) until which a newly created server/node has to become available before giving up and destroying it (defaults to 15, unit is minutes)
  #
  # Example:
  #
  # cluster_autoscaler_image = "registry.k8s.io/autoscaling/cluster-autoscaler"
  # cluster_autoscaler_version = "v1.30.1"
  # cluster_autoscaler_log_level = 4
  # cluster_autoscaler_log_to_stderr = true
  # cluster_autoscaler_stderr_threshold = "INFO"
  # cluster_autoscaler_server_creation_timeout = 15

  # Additional Cluster Autoscaler binary configuration
  #
  # cluster_autoscaler_extra_args can be used for additional arguments. The default is an empty array.
  #
  # Please note that following arguments are managed by terraform-hcloud-kube-hetzner or the variables above and should not be set manually:
  #   - --v=${var.cluster_autoscaler_log_level}
  #   - --logtostderr=${var.cluster_autoscaler_log_to_stderr}
  #   - --stderrthreshold=${var.cluster_autoscaler_stderr_threshold}
  #   - --cloud-provider=hetzner
  #   - --nodes ...
  #
  # See the Cluster Autoscaler FAQ for the full list of arguments: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca
  #
  # Example:
  #
  # cluster_autoscaler_extra_args = [
  #   "--ignore-daemonsets-utilization=true",
  #   "--enforce-node-group-min-size=true",
  # ]

  # Enable delete protection on compatible resources to prevent accidental deletion from the Hetzner Cloud Console.
  # This does not protect deletion from Terraform itself.
  # enable_delete_protection = {
  #   floating_ip   = true
  #   load_balancer = true
  #   volume        = true
  # }

  # Enable etcd snapshot backups to S3 storage.
  # Just provide a map with the needed settings (according to your S3 storage provider) and backups to S3 will
  # be enabled (with the default settings for etcd snapshots).
  # Cloudflare's R2 offers 10GB, 10 million reads and 1 million writes per month for free.
  # For proper context, have a look at https://docs.k3s.io/datastore/backup-restore.
  # You also can use additional parameters from https://docs.k3s.io/cli/etcd-snapshot, such as `etc-s3-folder`
  # etcd_s3_backup = {
  #   etcd-s3-endpoint        = "xxxx.r2.cloudflarestorage.com"
  #   etcd-s3-access-key      = "
```
Screenshots
No response
Platform
Linux