darxkies / k8s-tew

Kubernetes - The Easier Way
GNU General Public License v3.0
307 stars 38 forks

error="current host not found in the list of nodes" when trying local installation #4

Closed: zjevik closed this issue 6 years ago

zjevik commented 6 years ago

Hi, I'm trying to install locally on Ubuntu 18.04 and I get the following error: ERRO[0001] Failed to run error="current host not found in the list of nodes". I ran these commands:

  1. k8s-tew initialize
  2. k8s-tew configure --email joe.doe@gmail.com --ingress-domain domain.com
  3. k8s-tew node-add -s
  4. k8s-tew generate --parallel
  5. sudo k8s-tew run

My IP address is in 192.168.1.0/24 but in node-list I see a node with IP address 192.168.100.50.

I tried to go over the documentation but I couldn't figure out what's wrong. Thanks!

darxkies commented 6 years ago

Please post the output of the following commands here:

k8s-tew node-list
hostname
ifconfig

Did you run k8s-tew directly on your computer? Or in another environment?

Additionally, while using -s with node-add you can still use -n and -i to specify the hostname and the IP address, in case k8s-tew picked the wrong settings.
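
For example, something like this (the hostname and IP here are placeholders for your actual values):

    # -n and -i override what -s auto-detects
    k8s-tew node-add -s -n myhost -i 192.168.1.100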

zjevik commented 6 years ago

The system (a fresh installation of Ubuntu Server 18.04) is in a Hyper-V 2nd gen virtual machine.

I tried to remove the node from the node-list and add it using k8s-tew node-add -n webwork -i 192.168.1.32 -x 1 -l "controller worker", but then k8s-tew generate displays the following error: ERRO[0003] Generate failed error="No API Server IP found"

Thanks for your help!

Here are the outputs if I use node-add -s:

user@webwork:~$ k8s-tew node-list
INFO[0003] Node                                          index=0 ip=192.168.100.50 labels="[controller worker]" name=single-node

user@webwork:~$ hostname
webwork

user@webwork:~$ ifconfig 
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.32  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::215:5dff:fe01:1008  prefixlen 64  scopeid 0x20<link>
        ether 00:15:5d:01:10:08  txqueuelen 1000  (Ethernet)
        RX packets 643463  bytes 901968641 (901.9 MB)
        RX errors 0  dropped 829  overruns 0  frame 0
        TX packets 309355  bytes 24257994 (24.2 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 284  bytes 23370 (23.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 284  bytes 23370 (23.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
darxkies commented 6 years ago

Due to some recent changes -s is broken.

Until it is fixed, you have to use the following command:

k8s-tew node-add -n webwork -i 192.168.1.32 -l bootstrapper,controller,worker

Thank you for reporting it.

zjevik commented 6 years ago

It helped, but now when I run sudo k8s-tew run, it gets stuck in an infinite loop of restarting servers.

(screenshot: output of sudo k8s-tew run showing the servers restarting in a loop)
darxkies commented 6 years ago

The log files for the components (etcd, containerd, kube-apiserver and so on) are in assets/var/log/k8s-tew. The files should help in finding the reason for the restarts. If you run assets/opt/k8s-tew/bin/etcd/etcd, what is the output?
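
A quick way to skim them, for example (assuming the assets directory is relative to where you ran k8s-tew):

    # list the per-component log files
    ls -l assets/var/log/k8s-tew
    # print the last lines of each one to spot the component that keeps failing
    tail -n 50 assets/var/log/k8s-tew/*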

In addition to the previous command, you also need to execute this one:

k8s-tew configure --public-network=192.168.1.0/24
zjevik commented 6 years ago

It seems that kubectl won't start. When I interrupt sudo k8s-tew run I get the following error:

INFO[0020] Stopped all servers
INFO[0021] Cleaning up children
ERRO[0021] Command failed command="/home/ondra/assets/opt/k8s-tew/bin/k8s/kubectl --request-timeout 30s --kubeconfig /home/ondra/assets/etc/k8s-tew/k8s/kubeconfig/admin.kubeconfig apply -f /home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml" error="Command '/home/ondra/assets/opt/k8s-tew/bin/k8s/kubectl --request-timeout 30s --kubeconfig /home/ondra/assets/etc/k8s-tew/k8s/kubeconfig/admin.kubeconfig apply -f /home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml' failed with error 'exit status 1' (Output: unable to recognize \"/home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml\": Get https://192.168.1.32:16443/api?timeout=30s: dial tcp 192.168.1.32:16443: connect: connection refused\nunable to recognize \"/home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml\": Get https://192.168.1.32:16443/api?timeout=30s: dial tcp 192.168.1.32:16443: connect: connection refused\n)" name=kubelet-setup
FATA[0021] Cluster setup failed error="Command '/home/ondra/assets/opt/k8s-tew/bin/k8s/kubectl --request-timeout 30s --kubeconfig /home/ondra/assets/etc/k8s-tew/k8s/kubeconfig/admin.kubeconfig apply -f /home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml' failed with error 'exit status 1' (Output: unable to recognize \"/home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml\": Get https://192.168.1.32:16443/api?timeout=30s: dial tcp 192.168.1.32:16443: connect: connection refused\nunable to recognize \"/home/ondra/assets/etc/k8s-tew/k8s/setup/kubelet-setup.yaml\": Get https://192.168.1.32:16443/api?timeout=30s: dial tcp 192.168.1.32:16443: connect: connection refused\n)"

The log folder is unfortunately empty and the etcd output is below:

ondra@webwork:~/assets/opt/k8s-tew/bin/etcd$ ./etcd
2018-09-21 23:08:41.648027 I | etcdmain: etcd Version: 3.3.9
2018-09-21 23:08:41.648215 I | etcdmain: Git SHA: fca8add78
2018-09-21 23:08:41.648233 I | etcdmain: Go Version: go1.10.3
2018-09-21 23:08:41.648260 I | etcdmain: Go OS/Arch: linux/amd64
2018-09-21 23:08:41.648275 I | etcdmain: setting maximum number of CPUs to 23, total number of available CPUs is 23
2018-09-21 23:08:41.648298 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2018-09-21 23:08:41.662572 I | embed: listening for peers on http://localhost:2380
2018-09-21 23:08:41.662918 I | embed: listening for client requests on localhost:2379
2018-09-21 23:08:41.789449 I | etcdserver: name = default
2018-09-21 23:08:41.789535 I | etcdserver: data dir = default.etcd
2018-09-21 23:08:41.789554 I | etcdserver: member dir = default.etcd/member
2018-09-21 23:08:41.789567 I | etcdserver: heartbeat = 100ms
2018-09-21 23:08:41.789578 I | etcdserver: election = 1000ms
2018-09-21 23:08:41.789590 I | etcdserver: snapshot count = 100000
2018-09-21 23:08:41.789613 I | etcdserver: advertise client URLs = http://localhost:2379
2018-09-21 23:08:41.789628 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-09-21 23:08:41.789708 I | etcdserver: initial cluster = default=http://localhost:2380
2018-09-21 23:08:41.982196 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-09-21 23:08:41.982355 I | raft: 8e9e05c52164694d became follower at term 0
2018-09-21 23:08:41.982518 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-09-21 23:08:41.982557 I | raft: 8e9e05c52164694d became follower at term 1
2018-09-21 23:08:42.104308 W | auth: simple token is not cryptographically signed
2018-09-21 23:08:42.172609 I | etcdserver: starting server... [version: 3.3.9, cluster version: to_be_decided]
2018-09-21 23:08:42.175219 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-09-21 23:08:42.181975 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-09-21 23:08:42.585893 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-09-21 23:08:42.586002 I | raft: 8e9e05c52164694d became candidate at term 2
2018-09-21 23:08:42.586050 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-09-21 23:08:42.586084 I | raft: 8e9e05c52164694d became leader at term 2
2018-09-21 23:08:42.586155 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-09-21 23:08:42.586715 I | etcdserver: published {Name:default ClientURLs:[http://localhost:2379]} to cluster cdf818194e3a8c32
2018-09-21 23:08:42.586954 I | embed: ready to serve client requests
2018-09-21 23:08:42.587249 I | etcdserver: setting up the initial cluster version to 3.3
2018-09-21 23:08:42.587409 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-09-21 23:08:42.611478 N | etcdserver/membership: set the initial cluster version to 3.3
2018-09-21 23:08:42.632486 N | embed: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
2018-09-21 23:08:42.688852 I | etcdserver/api: enabled capabilities for version 3.3

Thanks for your feedback!

darxkies commented 6 years ago

Can you please check whether the other files in assets/opt/... are all executable? I suspect that some of the files were not downloaded properly and thus cannot be executed.
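
Something like this should list the suspects (a sketch using GNU find, run from the directory that contains assets):

    # regular files under assets/opt that are missing the owner-executable bit
    find assets/opt -type f ! -perm -u+x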

darxkies commented 6 years ago

How much RAM does the VM have? It needs at least 8GB.
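
You can check from inside the VM, for example:

    # -h prints the totals in human-readable units
    free -h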

darxkies commented 6 years ago

I think I figured out why it did not work for you. You need to remove the assets folder and run the following commands as root:

k8s-tew initialize
k8s-tew configure --resolv-conf=/run/systemd/resolve/resolv.conf
k8s-tew configure --deployment-directory=$(pwd)/assets
k8s-tew configure --public-network=192.168.1.0/24
k8s-tew node-add -n webwork -i 192.168.1.32 -l bootstrapper,controller,worker
k8s-tew generate
k8s-tew run
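
Once k8s-tew run reports the cluster setup as finished, a quick sanity check could look like this (a sketch reusing the kubectl binary and the admin kubeconfig from your earlier output, run from the same directory as the commands above):

    assets/opt/k8s-tew/bin/k8s/kubectl \
      --kubeconfig assets/etc/k8s-tew/k8s/kubeconfig/admin.kubeconfig \
      get nodes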
zjevik commented 6 years ago

It seems that the only problem was that I forgot k8s-tew configure --deployment-directory=$(pwd)/assets. Running the commands you wrote under root seems to be working. I even added the line with deployment-directory to my original sequence and it seems fine. Well, this is what I get for not using the k8s-tew/setup/local script you made.

One last question: when k8s-tew run finishes, it displays

INFO[1197] Cluster setup finished

and doesn't exit. It seems that I have to add k8s-tew run to cron manually, or is there another way to run k8s-tew as a service?

Thanks for your help!

darxkies commented 6 years ago

There is a new release (2.1.0-beta.4) that simplifies the local cluster setup. Take a look at setup/local/Makefile.

The local setup is actually mainly for development purposes: it is easy to get started with and easy to tear down once it is no longer needed. What you really need is one of the other setups, where a bootstrapper machine is used to set up the nodes.

Nevertheless, you can still have the local cluster setup start automatically when the VM boots up.

You need the following line executed before invoking any k8s-tew commands:

export K8S_TEW_BASE_DIRECTORY=/

Instead of calling k8s-tew run use the following commands:

systemctl daemon-reload
systemctl enable k8s-tew
systemctl start k8s-tew

All these commands are executed as part of the deployment process. With journalctl -fu k8s-tew you can follow the output of k8s-tew.

Another solution would be to use the same virtual machine as bootstrapper and node. The files are copied over SSH (using k8s-tew deploy) to the same virtual machine, and the systemctl commands above are not needed anymore. The node should be added without the -s parameter, just like in setup/ubuntu-single-node/Makefile. I haven't tried this approach, but it should work; roughly, it would look like the sketch below.
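
An untested sketch (it assumes the VM can reach itself over SSH and reuses the values from above):

    # on the bootstrapper, which here is also the node
    k8s-tew node-add -n webwork -i 192.168.1.32 -l controller,worker  # adjust labels to your setup
    k8s-tew generate
    k8s-tew deploy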

zjevik commented 6 years ago

I see. Since I was deploying it on just one VM (to start with), I thought I could install it locally and it would be the same as if I used a different machine and deployed it from there.

Thanks a lot for your help!