xetys / hetzner-kube

A CLI tool for provisioning kubernetes clusters on Hetzner Cloud
Apache License 2.0

Failing to add additional workers after initial cluster creation #119

Open UlliBe opened 6 years ago

UlliBe commented 6 years ago

Hi,

Cluster creation worked perfectly, but adding cluster workers failed with the error below. Yes, I'm using Windows for installing ;-) Not sure if that's the cause; I haven't gotten around to testing on Linux yet.

Regards

Torgon

C:\Users\username\go\bin\hetzner-kube --name clustername cluster add-worker --nodes 1 --worker-server-type cx21 --datacenters nbg1-dc3
2018/05/07 14:30:29 creating server 'clustername-worker-02'...
  --- [======================================] 100%
2018/05/07 14:30:44 Created node 'clustername-worker-02' with IP 195.201.36.59
panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/xetys/hetzner-kube/pkg/clustermanager.PrivateIPPrefix(0x0, 0x0, 0xc042457680, 0x2)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/pkg/clustermanager/wireguard.go:61 +0xb4
github.com/xetys/hetzner-kube/pkg/hetzner.(*Provider).CreateNodes(0xc042088680, 0x8fdb8a, 0x6, 0x0, 0x0, 0xc042020150, 0x4, 0x0, 0x0, 0x0, ...)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/pkg/hetzner/hetzner_provider.go:104 +0x671
github.com/xetys/hetzner-kube/pkg/hetzner.(*Provider).CreateWorkerNodes(0xc042088680, 0xc04216f6f0, 0xd, 0xc042020150, 0x4, 0xc0420575c0, 0x1, 0x1, 0x1, 0x1, ...)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/pkg/hetzner/hetzner_provider.go:140 +0x152
github.com/xetys/hetzner-kube/cmd.glob..func5(0xbb3880, 0xc042088600, 0x0, 0x8)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/cmd/cluster_add_worker.go:100 +0x5f5
github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra.(*Command).execute(0xbb3880, 0xc042088580, 0x8, 0x8, 0xbb3880, 0xc042088580)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra/command.go:702 +0x2cd
github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xbb5a80, 0xb6d840, 0x0, 0x0)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra/command.go:783 +0x2eb
github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra.(*Command).Execute(0xbb5a80, 0x8fc6c2, 0x5)
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/vendor/github.com/spf13/cobra/command.go:736 +0x32
github.com/xetys/hetzner-kube/cmd.Execute()
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/cmd/root.go:51 +0x85
main.main()
    C:/Users/username/go/src/github.com/xetys/hetzner-kube/main.go:20 +0x27

xetys commented 6 years ago

Congrats, you are the first user who tried the cross-platform version.

Are you also using Go for development? Do you feel confident enough to PR a fix for that Windows-related bug?

UlliBe commented 6 years ago

Well, I'm usually not a Go dev, but I might have a look and see if I can figure out what's happening here. It's creating the server, but it looks as if it's not able to parse the return values.

UlliBe commented 6 years ago

OK, so far I was able to trace the problem back to an empty node_cidr value in the hetzner-kube config.json. After filling it with the (presumably) correct value for the internal IP subnet, 10.x.x.x, the process gets beyond the previous breaking point, but now it seems to be stuck at "wireguard configured" on the existing nodes for quite a while. This might be related to #99.
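
To illustrate how an empty node_cidr could lead to this kind of panic, here is a minimal Go sketch. It assumes the prefix is derived by splitting the address on dots and keeping the first three octets; naivePrefix and safePrefix are hypothetical names and not the project's actual code in wireguard.go. The two leading 0x0 arguments to PrivateIPPrefix in the trace above are consistent with an empty string being passed in.

```go
package main

import (
	"fmt"
	"strings"
)

// naivePrefix is a hypothetical stand-in for a helper that derives the
// first three octets from a node CIDR base address such as "10.0.1.0".
// It is an assumption for illustration, not hetzner-kube's real code.
func naivePrefix(nodeCIDR string) string {
	parts := strings.Split(nodeCIDR, ".")
	// For an empty string, parts is [""] (length 1, capacity 1), so
	// slicing up to index 3 panics with "slice bounds out of range".
	return strings.Join(parts[:3], ".")
}

// safePrefix checks the shape of the input first and returns an error
// instead of panicking when the value is empty or malformed.
func safePrefix(nodeCIDR string) (string, error) {
	parts := strings.Split(nodeCIDR, ".")
	if len(parts) != 4 {
		return "", fmt.Errorf("invalid node_cidr %q: expected a dotted IPv4 address", nodeCIDR)
	}
	return strings.Join(parts[:3], "."), nil
}

func main() {
	if prefix, err := safePrefix("10.0.1.0"); err == nil {
		fmt.Println(prefix)
	}
	if _, err := safePrefix(""); err != nil {
		fmt.Println("error:", err)
	}
}
```

With a guard like this, a missing node_cidr would surface as a readable error message rather than a runtime panic.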

xetys commented 6 years ago

But this looks like a general problem, not a Windows-specific one.

UlliBe commented 6 years ago

Well, I'm still on Windows (though I get the same problem using Ubuntu Bash on Windows). I tried add-external-host with an IP too and got the same node_cidr error. What is the expected value there? This seems to be related to your last change from about a week ago.

xetys commented 6 years ago

The node CIDR is basically meant for you to be able to specify other internal subnets than 10.0.1.0/24. I believe it is not properly persisted in the config.json, so this might be the issue.

UlliBe commented 6 years ago

So you expect 10.0.1.0/24 in that json field as a default?

UlliBe commented 6 years ago

Because, after inserting 10.0.1.1 into that field, I got to this point:

xxx@dell:~$ /mnt/d/Dev/go/bin/hetzner-kube cluster add-worker --datacenters nbg1-dc3 --name clustername --nodes 1 --worker-server-type cx21
2018/05/09 11:28:17 creating server 'clustername-worker-02'...
  --- [======================================] 100%
2018/05/09 11:28:35 Created node 'clustername-worker-02' with IP 195.201.36.59
2018/05/09 11:28:35 sleep for 30s...
...stername-master-0 : wireguard configured                7.4% [--------------]
...stername-master-0 : wireguard configured                8.3% [--------------]
...stername-master-0 : wireguard configured                8.3% [--------------]
...stername-worker-0 : wireguard configured               13.3% [>-------------]

No further change for the last 15 minutes though....

xetys commented 6 years ago

Yes. It should be saved into the config.json, but it looks like it's not.

UlliBe commented 6 years ago

Thanks for all your help, btw, your tool was a great find for me and very helpful in setting the cluster up!

xetys commented 6 years ago

It's always a pleasure when people find my work useful :)

quorak commented 6 years ago

Can confirm the same problem with linux.

loxy commented 6 years ago

Same for me. Hanging on wireguard configured...

2018/06/07 22:54:16 creating server 'bettles-worker-04'...
  --- [======================================] 100%
2018/06/07 22:54:31 Created node 'bettles-worker-04' with IP 159.69.19.51
2018/06/07 22:54:31 sleep for 30s...
bettles-master-01    : wireguard configured                7.4% [--------------]
bettles-master-02    : wireguard configured                8.3% [--------------]
bettles-master-03    : wireguard configured                8.3% [--------------]
bettles-worker-01    : wireguard configured               13.3% [>-------------]
bettles-worker-02    : wireguard configured               13.3% [>-------------]
bettles-worker-03    : wireguard configured               13.3% [>-------------]

Does anyone know when this occurs?

vschwaberow commented 6 years ago

Problem also occurs on OSX.

trippinCode commented 6 years ago

I have the same problem on Linux.

root@xxxx:/some/path/# hetzner-kube cluster add-worker --name test-cluster --nodes 1
Enter passphrase for SSH key /root/.ssh/id_rsa: 
2018/06/29 11:27:18 creating server 'test-cluster-worker-02'...
  --- [======================================] 100%
2018/06/29 11:27:43 Created node 'test-cluster-worker-02' with IP 159.69.48.138
2018/06/29 11:27:43 sleep for 30s...
...-cluster-master-0 : wireguard configured                8.7% [--------------]
...-cluster-worker-0 : wireguard configured               14.3% [>-------------]
^C <- left it for 25-30 min no changes, cpu load between 0% - 5%  
root@xxxx:/some/path/# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04 LTS
Release:    18.04
Codename:   bionic

JohnnyQQQQ commented 6 years ago

It looks like your worker is running Ubuntu 18.04, which is not supported for now.

trippinCode commented 6 years ago

@JohnnyQQQQ That is not the worker; that is my PC, where the hetzner-kube CLI is running.

JohnnyQQQQ commented 6 years ago

What version are you using? On master this bug should be fixed.

trippinCode commented 6 years ago

I'm using version 0.3. Do I need to build from source?

JohnnyQQQQ commented 6 years ago

Yes, it's a bug in 0.3.

trippinCode commented 6 years ago

Ok thanks :dog2:

quorak commented 6 years ago

It still does not work with 0.3.1 stable.

mavimo commented 6 years ago

This issue is related to the following commit, which fixes it in master: https://github.com/xetys/hetzner-kube/commit/b1e900c1a5718525b66fb8e4445f9bed08e55dfe

Users who need to migrate an existing cluster should fix ~/.hetzner-kube/config.json by adding the entry "node_cidr": "10.0.1.0" to the cluster definition, like this:

 {
   "active_context_name": "your-context-name",
   "contexts": [ /* */ ],
   "ssh_keys": [ /* */ ],
   "clusters": [
     {
        "name": "your-cluster-name",
        /* */
+       "node_cidr": "10.0.1.0"
     }
   ]
 }