Open · enter-marlah opened 1 month ago
Hi, can you share your full config file (minus the token)?
---
cluster_name: kube-prod
kubeconfig_path: "./kubeconfig"
k3s_version: v1.29.3+k3s1
include_instance_type_in_instance_name: true
networking:
  ssh:
    port: 22
    use_agent: false
    public_key_path: "./id_rsa_hetzner_prod.pub"
    private_key_path: "./id_rsa_hetzner_prod"
  allowed_networks:
    ssh:
      - 0.0.0.0/0
    api:
      - 0.0.0.0/0
  public_network:
    ipv4: false
    ipv6: false
  private_network:
    enabled: true
    subnet: 10.0.0.0/16
    existing_network_name: "KubeNet"
  cni:
    enabled: true
    encryption: false
    mode: flannel

manifests:
  cloud_controller_manager_manifest_url: "https://github.com/hetznercloud/hcloud-cloud-controller-manager/releases/download/v1.20.0/ccm-networks.yaml"
  csi_driver_manifest_url: "https://raw.githubusercontent.com/hetznercloud/csi-driver/v2.8.0/deploy/kubernetes/hcloud-csi.yml"
  system_upgrade_controller_deployment_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/system-upgrade-controller.yaml"
  system_upgrade_controller_crd_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/crd.yaml"
  cluster_autoscaler_manifest_url: "https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/hetzner/examples/cluster-autoscaler-run-on-master.yaml"

datastore:
  mode: etcd
  external_datastore_endpoint: postgres://....

schedule_workloads_on_masters: false

image: ubuntu-22.04

masters_pool:
  instance_type: cpx11
  instance_count: 3
  location: hel1

worker_node_pools:
  - name: med-static
    instance_type: cpx31
    instance_count: 3
    location: hel1
    autoscaling:
      enabled: true
      min_instances: 0
      max_instances: 6

embedded_registry_mirror:
  enabled: true

post_create_commands: # these rewrite the netplan config; the resulting file is shown below
  - echo 'network:\n version:\ 2\n ethernets:\n enp7s0:\n critical:\ true\n nameservers:\n addresses:\ [10.0.0.2]\n routes:\n - on-link:\ true\n to:\ 0.0.0.0/0\n via:\ 10.0.0.1' > /etc/netplan/50-cloud-init.yaml
  - sed -i 's/\\//g' /etc/netplan/50-cloud-init.yaml
  - sed -i 's/^nameserver.*/nameserver 10.0.0.2/' /etc/resolv.conf
  - netplan apply
  - apt update
  - apt upgrade -y
  - apt autoremove -y
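(For anyone puzzling over the escaped echo above: once the sed strips the backslashes, the resulting /etc/netplan/50-cloud-init.yaml should look roughly like this. This is a sketch with the indentation restored for readability, using the 10.0.0.1 gateway and 10.0.0.2 resolver from the commands above.)

network:
  version: 2
  ethernets:
    enp7s0:
      critical: true
      nameservers:
        addresses: [10.0.0.2]
      routes:
        - on-link: true
          to: 0.0.0.0/0
          via: 10.0.0.1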
Hi, this has been reported a couple of times before, but I haven't had a chance to try to reproduce the problem yet. Can you share more details on how you have configured the network in Hetzner? The more details the better, as they might help me understand where the problem might be.
Hi, I have the same problem. Is there any known working config? My network config is the same as for the static workers, but after autoscaling and waiting a few minutes, I cannot SSH into the autoscaled worker and it does not join the cluster.
- name: medium-autoscaled
  instance_type: cpx31
  instance_count: 1
  location: hel1
  autoscaling:
    enabled: true
    min_instances: 0
    max_instances: 4
  image: debian-12
  additional_packages:
    - ifupdown
  post_create_commands: # see the verification sketch below
    - ip route add default via 10.100.0.1 # Adapt this to your gateway IP
    - echo "nameserver 185.12.64.1" > /etc/resolv.conf
The problem is, the few reports I've come across about these issues all involve some custom commands to tweak the network settings, and that's something I haven't checked yet. So far, with the default network configuration, I haven't been able to recreate any of those problems.
So it might have to do with the "private network only" setup? Because that's my only real difference: I'm using a NAT routing VM for internet access from inside the cluster.
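(For context, such a NAT routing VM usually only needs IP forwarding and masquerading. A rough sketch, assuming the gateway's public interface is eth0 and the cluster subnet is 10.0.0.0/16; both are assumptions, not details taken from this thread.)

# On the NAT/router VM, not on the cluster nodes
sysctl -w net.ipv4.ip_forward=1                                        # enable IPv4 forwarding (persist it in /etc/sysctl.conf)
iptables -t nat -A POSTROUTING -s 10.0.0.0/16 -o eth0 -j MASQUERADE    # masquerade outbound traffic from the private subnet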
I can't be sure because I haven't had a chance to verify this, but that's my suspicion at the moment.
Hello!
We are running Hetzner-k3s version 2.0.8 with the following worker pool config:
The nodes are created in Hetzner after autoscaling is triggered by stressing the cluster, but they never join the cluster afterwards. We can SSH into the machines, but they don't have, for example, the SSH keys set or anything related to k3s installed. For the static nodes the SSH keys are set correctly.
We think this has something to do with the cloud-init wait problem previously reported in issue https://github.com/vitobotta/hetzner-k3s/issues/379
If we read the code correctly, the cloud_init_wait.sh script is not called when creating an autoscaled node?
We are running a private-network-only cluster. With regard to PR https://github.com/vitobotta/hetzner-k3s/pull/458: our cloud-init takes several minutes on both static and autoscaled nodes.
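(If it helps to narrow this down, cloud-init progress on a node that fails to join can be checked directly. This is a sketch using standard cloud-init tooling, not hetzner-k3s-specific commands.)

# On the affected node (via the Hetzner web console if SSH is not available)
cloud-init status --wait                     # blocks until cloud-init finishes, then prints "status: done" or "status: error"
tail -n 50 /var/log/cloud-init-output.log    # shows whether the user-data (k3s setup / post-create commands) ever ran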