vitobotta / hetzner-k3s

The easiest and quickest way to create and manage Kubernetes clusters in Hetzner Cloud using the lightweight distribution k3s by Rancher.
MIT License
1.56k stars, 107 forks

Hangs on Validating Configuration #320

Open psavva opened 4 months ago

psavva commented 4 months ago

Hi @vitobotta

I'm having an issue with my setup: it just hangs on "Validating configuration".

Is there any way to increase logging verbosity to understand any underlying issues?

Both the public and private keys were generated fine and are in the ssh folder.


---
hetzner_token: tokenhasbeenremovedforsecuirtypurposes
cluster_name: Groovit
kubeconfig_path: "./kubeconfig"
k3s_version: v1.29.0+k3s1
public_ssh_key_path: "./ssh/id_ed25519.pub"
private_ssh_key_path: "./ssh/id_ed25519"
use_ssh_agent: false
ssh_allowed_networks:
  - 0.0.0.0/0 # ensure your current IP is included in the range
api_allowed_networks:
  - 0.0.0.0/0
schedule_workloads_on_masters: true
private_network_subnet: 10.0.0.0/16
enable_encryption: true
disable_flannel: false
cloud_controller_manager_manifest_url: "https://raw.githubusercontent.com/hetznercloud/hcloud-cloud-controller-manager/v1.18.0/deploy/ccm-networks.yaml"
csi_driver_manifest_url: "https://raw.githubusercontent.com/hetznercloud/csi-driver/v2.5.1/deploy/kubernetes/hcloud-csi.yml"
system_upgrade_controller_manifest_url: "https://raw.githubusercontent.com/rancher/system-upgrade-controller/master/manifests/system-upgrade-controller.yaml"
masters_pool:
  instance_type: cpx11
  location: nbg1
  instance_count: 3
worker_node_pools:
- name: autoscaled-nbg1-cx11
  instance_type: cpx11
  location: nbg1
  instance_count: 0
  autoscaling:
    enabled: true
    min_instances: 0
    max_instances: 2
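Since the config above points at key files under `./ssh/`, one cheap thing to rule out before a run is a missing, unreadable, or overly permissive key file. The following is a minimal sketch, not part of hetzner-k3s: the function name `check_key_files` is hypothetical, and the permission rule mirrors what the OpenSSH client enforces, which may or may not matter to the tool itself.

```python
import os
import stat
from pathlib import Path


def check_key_files(public_path: str, private_path: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for label, raw in (("public", public_path), ("private", private_path)):
        path = Path(raw).expanduser()
        if not path.is_file():
            problems.append(f"{label} key not found: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"{label} key is not readable: {path}")

    private = Path(private_path).expanduser()
    if private.is_file():
        mode = stat.S_IMODE(private.stat().st_mode)
        # Assumption: apply the OpenSSH client rule that a private key
        # must not be accessible by group or others.
        if mode & 0o077:
            problems.append(f"private key permissions too open: {oct(mode)}")
    return problems


if __name__ == "__main__":
    # Paths taken from the config above.
    for problem in check_key_files("./ssh/id_ed25519.pub", "./ssh/id_ed25519"):
        print(problem)
```

If this prints nothing, the key files themselves are unlikely to be what the validation step is waiting on.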

psavva commented 4 months ago

The issue seems to have resolved itself; however, I think it would still be a good improvement if we could implement a verbosity option.

psavva commented 2 months ago

@vitobotta I'm reopening this issue, as I'm facing the problem again where the script hangs while validating resources.

Please could you help me debug this issue?

psavva commented 2 months ago

@vitobotta I can now confirm that it did eventually continue, but the validation step alone took about 10 minutes. I think we need to look at the logging verbosity, and perhaps understand why it takes so long.

vitobotta commented 2 months ago

Hi, have you tried enabling the SSH agent as described in the README?

psavva commented 2 months ago

Hi,

I'm not using the ssh agent at all.

vitobotta commented 2 months ago

> Hi,
>
> I'm not using the ssh agent at all.

And I am suggesting you try with the agent, in case the problem is related to SSH auth hanging. I will try to enable verbose output in the next release.
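Before switching `use_ssh_agent` to `true` in the config, it can help to confirm that an agent is reachable from the shell that launches the tool. A minimal sketch, assuming the conventional `SSH_AUTH_SOCK` environment variable is how the agent is located (the helper name `ssh_agent_socket` is hypothetical, and this only checks that the socket file exists, not that the agent holds the right key):

```python
import os
import stat


def ssh_agent_socket(env=None):
    """Return the path in SSH_AUTH_SOCK if it is an existing Unix socket, else None."""
    env = os.environ if env is None else env
    sock = env.get("SSH_AUTH_SOCK")
    if not sock:
        return None
    try:
        mode = os.stat(sock).st_mode
    except OSError:
        # The variable is set but the path is gone (stale session, for example).
        return None
    return sock if stat.S_ISSOCK(mode) else None


if __name__ == "__main__":
    found = ssh_agent_socket()
    print("agent socket:", found if found else "none detected")
```

If no socket is detected, starting an agent and adding the key with `ssh-add` would be the usual next step before retrying.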

vitobotta commented 2 months ago

Is there a passphrase on the key?

psavva commented 2 months ago

There is no passphrase on the key.

It progresses, but takes a while.

I think it would be great if there were a way to enable verbosity and understand where the slowdown originates.

vitobotta commented 2 months ago

> There is no passphrase on the key. It progresses, takes a while. I think it would be great if there is a way to enable verbosity and understand where the slowdown originates

I'll try to add it to the next release if I can. So it did continue? How long was it kinda "paused"? Sometimes it can also be a slowdown with the Hetzner API. The new version, which is not yet released, already has improved output that tells you more clearly what each output line is about.
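Until verbose output is available, the general idea of finding where a pause comes from can be sketched by wrapping each step in a timer. This is only an illustration of the technique: the `timed` helper and the step names are hypothetical and do not reflect how hetzner-k3s is structured internally.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log=print):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log(f"[{elapsed:8.3f}s] {label}")


if __name__ == "__main__":
    with timed("read config"):
        time.sleep(0.05)  # stand-in for a real step
    with timed("call remote API"):
        time.sleep(0.10)  # stand-in for a real step
```

With one line per step, a ten-minute pause would show up against a single label instead of a silent "Validating configuration".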

psavva commented 2 months ago

Thank you for this. It took over 10 minutes.

vitobotta commented 2 months ago

> Thank you for this. It took over 10 minutes

Interesting. Which region are you using? Can you reproduce the same slowness in another region? I have been creating large clusters often these past days and it hasn't happened to me. I even created a 200-node cluster in 4 minutes with the new version :)

psavva commented 2 months ago

I'll give it a try and report back.

The cluster is in Nuremberg, with only 3 master nodes.

I'll try from different workstations too. Maybe it's tied to my workstation in some way...
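When comparing workstations, a rough baseline is how long each one needs to resolve and open a TCP connection to the API host. A minimal sketch, assuming the public Hetzner Cloud API host `api.hetzner.cloud` on port 443 is the relevant endpoint; the helper name `connect_latency` is hypothetical, and it measures only DNS lookup plus TCP connect, not TLS or any API request:

```python
import socket
import time


def connect_latency(host, port=443, timeout=5.0):
    """Return seconds spent resolving the host and opening a TCP connection."""
    start = time.perf_counter()
    # create_connection resolves the name and connects, so slow DNS counts too.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    try:
        seconds = connect_latency("api.hetzner.cloud")
        print(f"connected in {seconds:.3f}s")
    except OSError as exc:
        print(f"connection failed: {exc}")
```

A large difference between two machines here would point at local DNS or network setup instead of the tool or the region.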

vitobotta commented 2 months ago

Thanks! Looking forward to hearing how it goes. I just started creating a 300-node cluster :D