gravitl / netmaker

Netmaker makes networks with WireGuard. Netmaker automates fast, secure, and distributed virtual networks.
https://netmaker.io

'no such network interface' - Route Networking issue during installation #494

Closed cmpatel closed 2 years ago

cmpatel commented 2 years ago

Setup

  1. MasterNode - Azure Ubuntu 20.04 LTS
  2. Worker Node1 - AWS EC2 Ubuntu 20.04 LTS
  3. Worker Node2 - AWS EC2 Ubuntu 20.04 LTS
  4. NetMaker Node- AWS EC2 Ubuntu 20.04 LTS

Following the guide from here: https://itnext.io/how-to-deploy-a-single-kubernetes-cluster-across-multiple-clouds-using-k3s-and-wireguard-a5ae176a6e81

Part 1 - Netmaker installed successfully; I can reach the Netmaker UI and see all the nodes.
Part 2 - Ran the K3s install command on the master node first.

  1. On the master node, 'ip a' output:

     1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
         link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
         inet 127.0.0.1/8 scope host lo
            valid_lft forever preferred_lft forever
         inet6 ::1/128 scope host
            valid_lft forever preferred_lft forever
     2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
         link/ether 00:22:48:2d:ad:6e brd ff:ff:ff:ff:ff:ff
         inet 10.0.0.4/24 brd 10.0.0.255 scope global eth0
            valid_lft forever preferred_lft forever
         inet6 fe80::222:48ff:fe2d:ad6e/64 scope link
            valid_lft forever preferred_lft forever
     3: enP36360s1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP group default qlen 1000
         link/ether 00:22:48:2d:ad:6e brd ff:ff:ff:ff:ff:ff
         altname enP36360p0s2
     6: mpqemubr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
         link/ether 52:54:00:b1:76:7a brd ff:ff:ff:ff:ff:ff
         inet 10.243.24.1/24 brd 10.243.24.255 scope global mpqemubr0
            valid_lft forever preferred_lft forever
     10: nm-default: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1280 qdisc noqueue state UNKNOWN group default qlen 1000
         link/none
         inet 10.101.0.2/24 scope global nm-default
            valid_lft forever preferred_lft forever
     11: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
         link/ether 02:42:33:ae:eb:45 brd ff:ff:ff:ff:ff:ff
         inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
            valid_lft forever preferred_lft forever

  2. Output of 'wg show':

     interface: nm-default
       public key: g46wXs9RzLyZsVGBEEiXIQQWakhuGYqL6rDpKQ/QnV4=
       private key: (hidden)
       listening port: 60978

     peer: usCAM4n1auayL1JaQD5n/6pDUwHtKhRpjNpYNqHNYDQ=
       endpoint: xxx.xxx.xxx.xx:51821
       allowed ips: 10.101.0.1/32
       transfer: 0 B received, 1.80 MiB sent
       persistent keepalive: every 20 seconds

     peer: sLABl+sWuRC0P2oV46buikSU7BPBfwaKAFau7bt3zwM=
       endpoint: xxx.xxx.xxx.xx:51821
       allowed ips: 10.101.0.3/32
       transfer: 0 B received, 1.80 MiB sent
       persistent keepalive: every 20 seconds

     peer: o70Wjxk66AUgEKhneAFzQh89av357mur3+/9QTaGQ2s=
       endpoint: xxx.xxx.xxx.xx:51821
       allowed ips: 10.101.0.4/32
       transfer: 0 B received, 1.80 MiB sent
       persistent keepalive: every 20 seconds

  3. Output of 'systemctl status k3s':

     k3s.service - Lightweight Kubernetes
          Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
          Active: active (running) since Fri 2021-11-19 09:26:58 UTC; 1 day 18h ago
            Docs: https://k3s.io
        Main PID: 17098 (k3s-server)
           Tasks: 20
          Memory: 581.6M
          CGroup: /system.slice/k3s.service
                  └─17098 /usr/local/bin/k3s server

     Nov 21 03:38:55 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:55.166223007Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:38:56 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:56.166464363Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:38:57 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:57.167113444Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:38:58 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:58.167567911Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:38:59 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:59.168027279Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:38:59 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:59.562104367Z" level=info msg="Cluster-Http-Server 2021/11/21 03:38:59 http: TLS handshake error from 127.0.0.1:51254: remote error: tls:>
     Nov 21 03:38:59 NetMakerMaster k3s[17098]: time="2021-11-21T03:38:59.568690998Z" level=error msg="Failed to configure agent: unable to find interface: route ip+net: no such network interface"
     Nov 21 03:39:00 NetMakerMaster k3s[17098]: time="2021-11-21T03:39:00.168450245Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:39:01 NetMakerMaster k3s[17098]: time="2021-11-21T03:39:01.169041621Z" level=info msg="Waiting for control-plane node agent startup"
     Nov 21 03:39:02 NetMakerMaster k3s[17098]: time="2021-11-21T03:39:02.169633798Z" level=info msg="Waiting for control-plane node agent startup"

  4. Pinging another node over the WireGuard network:

     ping -I 10.101.0.2 10.101.0.1
     PING 10.101.0.1 (10.101.0.1) from 10.101.0.2 : 56(84) bytes of data.
     ^C
     --- 10.101.0.1 ping statistics ---
     10 packets transmitted, 0 received, 100% packet loss, time 9209ms

  5. Output of 'kubectl get pods --all-namespaces':

     NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
     kube-system   helm-install-traefik-cct85                0/1     Pending   0          42h
     kube-system   helm-install-traefik-crd-zb784            0/1     Pending   0          42h
     kube-system   metrics-server-86cbb8457f-c6jrb           0/1     Pending   0          42h
     kube-system   local-path-provisioner-5ff76fc89d-dlpl7   0/1     Pending   0          42h
     kube-system   coredns-7448499f4d-rb4hm                  0/1     Pending   0          42h

  6. Issues/Questions

    1. How do I get past the 'Failed to configure agent' error?
    2. How do I ensure connectivity with the other nodes over WireGuard?
    3. How do I get all the pods running again?

Could you please advise on how to fix this, or tell me whether I missed any steps in the guide?
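For the 'no such network interface' error, one quick check is whether the interface name k3s is told to use actually exists on the host. The log does not show which name k3s looked up, so the following is only a minimal sketch under that assumption. `interface_exists` and `LISTED` are illustrative names, and `nm-k3s` is a hypothetical interface name used purely as an example of one that is absent from the 'ip a' output above.

```python
import socket


def interface_exists(name, names=None):
    """Return True if `name` is one of the given (or the host's) interface names."""
    if names is None:
        # socket.if_nameindex() returns (index, name) pairs on Unix-like systems.
        names = [ifname for _, ifname in socket.if_nameindex()]
    return name in names


# Interface names copied from the 'ip a' output in this issue.
LISTED = ["lo", "eth0", "enP36360s1", "mpqemubr0", "nm-default", "docker0"]

if __name__ == "__main__":
    # "nm-k3s" is a hypothetical name, not taken from the log.
    for candidate in ("nm-default", "nm-k3s"):
        print(candidate, interface_exists(candidate, LISTED))
```

Calling `interface_exists(name)` without the second argument checks against the live host instead of the copied list.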

afeiszli commented 2 years ago

Your "wg show" output displays "transfer: 0 B received, 1.80 MiB sent" for every peer. This means no connection was ever established: the node is sending packets but getting nothing back. Did you confirm the nodes could reach each other before installing k3s? That's the first step.
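A rough way to run this check automatically is to parse the "wg show" text and list the peers that have received nothing. This is a minimal sketch, assuming the output has the "key: value" layout shown in this issue; `parse_peers`, `silent_peers`, and `SAMPLE` are illustrative names, not part of WireGuard or Netmaker.

```python
SAMPLE = """\
interface: nm-default
  public key: g46wXs9RzLyZsVGBEEiXIQQWakhuGYqL6rDpKQ/QnV4=
  private key: (hidden)
  listening port: 60978

peer: usCAM4n1auayL1JaQD5n/6pDUwHtKhRpjNpYNqHNYDQ=
  endpoint: xxx.xxx.xxx.xx:51821
  allowed ips: 10.101.0.1/32
  transfer: 0 B received, 1.80 MiB sent
  persistent keepalive: every 20 seconds

peer: sLABl+sWuRC0P2oV46buikSU7BPBfwaKAFau7bt3zwM=
  endpoint: xxx.xxx.xxx.xx:51821
  allowed ips: 10.101.0.3/32
  transfer: 0 B received, 1.80 MiB sent
  persistent keepalive: every 20 seconds

peer: o70Wjxk66AUgEKhneAFzQh89av357mur3+/9QTaGQ2s=
  endpoint: xxx.xxx.xxx.xx:51821
  allowed ips: 10.101.0.4/32
  transfer: 0 B received, 1.80 MiB sent
  persistent keepalive: every 20 seconds
"""


def parse_peers(text):
    """Split 'wg show' text into one dict of fields per peer."""
    peers = []
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(": ")
        if not sep:
            continue  # blank line or a line without "key: value"
        if key == "peer":
            current = {"peer": value}
            peers.append(current)
        elif key == "interface":
            current = None  # interface fields are not peer fields
        elif current is not None:
            current[key] = value
    return peers


def silent_peers(text):
    """Return public keys of peers with no received bytes."""
    silent = []
    for peer in parse_peers(text):
        transfer = peer.get("transfer", "")
        # Treat a missing transfer line as "nothing received".
        received = transfer.split(" received")[0] if " received" in transfer else "0 B"
        if received == "0 B":
            silent.append(peer["peer"])
    return silent


if __name__ == "__main__":
    for key in silent_peers(SAMPLE):
        print("no data received from", key)
```

On a node, the text could come from `subprocess.run(["wg", "show"], capture_output=True, text=True).stdout` (usually as root). For the output posted in this issue, all three peers are flagged, and none of them shows a "latest handshake" line either.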

afeiszli commented 2 years ago

looking for follow up here or will be closing the issue

afeiszli commented 2 years ago

closing due to lack of response