gravitl / netmaker

Netmaker makes networks with WireGuard. Netmaker automates fast, secure, and distributed virtual networks.
https://netmaker.io

[Bug]: All peers unable to connect to netmaker-1 default node #1705

Open PyMichaelB opened 1 year ago

PyMichaelB commented 1 year ago

What happened?

Hi, I am having an issue where all my peers (network 10.20.30.1/24) are able to handshake and ping each other successfully, but none of them can reach the default netmaker-1 node.

I deployed via the Quick Install guide.

My Netmaker server runs on an AWS EC2 instance, and I have enabled the security rules: (image)

I also have ufw enabled, and it allows the same ports and protocols. I exec'd into the netmaker container and saw that none of the peers in the WireGuard config (wg) had endpoints set, only a public key and allowed IPs. Is this normal?

My network setup is a simple, straightforward mesh network with no ingress, egress, etc. I have UDP hole punching enabled because some of my nodes are behind my home router, which uses NAT, though I don't know much about how this works.

As said above, all my other peers are handshaking frequently (including others in AWS EC2 instances with the same firewall rules) and I can communicate successfully between them using their netmaker IP addresses.

Version

v0.16.1

What OS are you using?

Linux Ubuntu 22.04 Server

Relevant log output

--- netmaker ---
root@**********:~# docker logs netmaker

 __   __     ______     ______   __    __     ______     __  __     ______     ______    
/\ "-.\ \   /\  ___\   /\__  _\ /\ "-./  \   /\  __ \   /\ \/ /    /\  ___\   /\  == \   
\ \ \-.  \  \ \  __\   \/_/\ \/ \ \ \-./\ \  \ \  __ \  \ \  _"-.  \ \  __\   \ \  __<   
 \ \_\\"\_\  \ \_____\    \ \_\  \ \_\ \ \_\  \ \_\ \_\  \ \_\ \_\  \ \_____\  \ \_\ \_\ 
  \/_/ \/_/   \/_____/     \/_/   \/_/  \/_/   \/_/\/_/   \/_/\/_/   \/_____/   \/_/ /_/ 

[netmaker] 2022-11-01 11:20:48 connecting to sqlite 
[netmaker] 2022-11-01 11:20:48 database successfully connected 
[netmaker] 2022-11-01 11:20:48 no OAuth provider found or not configured, continuing without OAuth 
[netmaker] 2022-11-01 11:20:53 MQ Is Already Configured, Skipping... 
[netmaker] 2022-11-01 11:20:53 REST Server successfully started on port  8081  (REST) 
[netmaker] 2022-11-01 11:20:53 connecting to mq broker at mq:1883 with TLS? false

--- docker ps ---
root@**********:~# docker ps
CONTAINER ID   IMAGE                              COMMAND                  CREATED              STATUS              PORTS                                                                             NAMES
855000d2af0c   gravitl/netmaker-ui:v0.16.1        "/docker-entrypoint.…"   About a minute ago   Up About a minute   80/tcp                                                                            netmaker-ui
70260ebfa9de   coredns/coredns                    "/coredns -conf /roo…"   About a minute ago   Up About a minute   53/tcp, 53/udp                                                                    coredns
f02c9a1a3966   eclipse-mosquitto:2.0.11-openssl   "/docker-entrypoint.…"   About a minute ago   Up About a minute   1883/tcp, 8883/tcp                                                                mq
13961901e982   traefik:v2.6                       "/entrypoint.sh --ce…"   About a minute ago   Up About a minute   80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp                                     traefik
c171109ff60a   gravitl/netmaker:v0.16.1           "./netmaker"             About a minute ago   Up About a minute   8081/tcp, 0.0.0.0:51821-51830->51821-51830/udp, :::51821-51830->51821-51830/udp   netmaker

mattkasun commented 1 year ago

Do the firewalls permit ICMP?

PyMichaelB commented 1 year ago

> Do the firewalls permit ICMP?

Hi Matt, I've tested with ICMP allowed. I can ping the public IP of my cloud instance, but I cannot ping the Netmaker-assigned IP.

mattkasun commented 1 year ago

Are you saying that the public endpoint of the netmaker-1 node is different from the public IP of your server?

PyMichaelB commented 1 year ago

The server which runs netmaker, the MQ, traefik containers etc. has an accessible public IP.

Within the netmaker container, there is a netmaker-1 WireGuard interface associated with my Netmaker network (set up in the Netmaker UI), and I am unable to ping its IP address (10.20.30.254) from other servers running netclient that are joined to the same Netmaker network, e.g. from 10.20.30.2.

mattkasun commented 1 year ago

Is the endpoint of the netmaker-1 node the same as your server's public IP? (image)

PyMichaelB commented 1 year ago

> Is the endpoint of the netmaker-1 node the same as your server's public IP? (image)

Ah yes, I understand. It is the same.

mattkasun commented 1 year ago

Can you test whether the UDP connection is working? On another computer, run: netcat -u <server ip> <listen port>

Hit enter. Do you get read(net): Connection refused, or does the cursor just move to the next line? (If the latter, press Ctrl-C to exit.)
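
For anyone who would rather script the netcat check above, here is a minimal Python sketch of the same idea. The function name `probe_udp` and its return labels are made up for illustration and are not part of Netmaker or netcat. It sends one datagram and reports whether the kernel saw an ICMP "port unreachable" (the equivalent of netcat's "Connection refused") or simply heard nothing. Because UDP is connectionless, silence on its own cannot distinguish an open port from a firewall that drops packets without answering.

```python
import socket


def probe_udp(host, port, timeout=2.0, payload=b"\n"):
    """Send one UDP datagram and classify what comes back.

    Returns one of three illustrative labels:
      "refused"  - an ICMP port-unreachable arrived (nothing is listening)
      "no-reply" - nothing came back before the timeout (open OR filtered)
      "reply"    - the remote side answered with a datagram
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket only fixes the destination address;
        # it is what lets the kernel report ICMP errors back to us.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(1024)
        except ConnectionRefusedError:
            return "refused"
        except socket.timeout:
            return "no-reply"
        return "reply"
```

Calling `probe_udp("<server ip>", 51822)` from another machine mirrors the manual test. Running it once with the server up and once with it stopped is more informative than a single run, since "no-reply" in both cases suggests the packets are being dropped somewhere before they reach the container.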

PyMichaelB commented 1 year ago

The cursor just moves to the next line

mattkasun commented 1 year ago

> The cursor just moves to the next line unfortunately

Actually, that is good. It means that the netmaker-1 endpoint is reachable over UDP.

What value did you set for persistentkeepalive?

PyMichaelB commented 1 year ago

Well, trying netcat again with the server down, I see the same result, so I'm not sure that proves it's reachable over UDP. Persistent keepalive is 20 seconds. With ufw temporarily disabled I see the same issue, so I think it's not related to that.

I can see the Docker ports open for UDP on 51821-51830 (lsof -i -P -n).

afeiszli commented 1 year ago

To confirm it's not an MTU issue, do you see a handshake with the Netmaker node in "wg show"?
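
A handshake check like this can also be scripted. The sketch below parses the tab-separated text printed by `wg show <interface> dump` and lists peers that have never completed a handshake or have no endpoint set. The class and function names and the sample data are hypothetical, invented only for this illustration; the field order follows the wg(8) manual page, where a latest-handshake of 0 means no handshake yet and an unset endpoint is printed as `(none)`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PeerStatus:
    public_key: str
    endpoint: Optional[str]   # None when wg prints "(none)"
    allowed_ips: str
    latest_handshake: int     # Unix time; 0 means no handshake yet


def parse_wg_dump(text: str) -> List[PeerStatus]:
    """Parse the output of `wg show <interface> dump` into peer records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    peers = []
    # The first line describes the interface itself; peers follow.
    for line in lines[1:]:
        fields = line.split("\t")
        if len(fields) != 8:
            continue  # skip anything that is not a well-formed peer line
        public_key, _psk, endpoint, allowed_ips, handshake = fields[:5]
        peers.append(PeerStatus(
            public_key=public_key,
            endpoint=None if endpoint == "(none)" else endpoint,
            allowed_ips=allowed_ips,
            latest_handshake=int(handshake),
        ))
    return peers


def peers_without_handshake(peers: List[PeerStatus]) -> List[PeerStatus]:
    """Return the peers that have never completed a handshake."""
    return [p for p in peers if p.latest_handshake == 0]


# Made-up sample in the dump format (keys shortened to placeholders).
SAMPLE_DUMP = (
    "PRIVKEY\tPUBKEY\t51821\toff\n"
    "peerA\t(none)\t(none)\t10.20.30.2/32\t0\t0\t0\t20\n"
    "peerB\t(none)\t203.0.113.5:51821\t10.20.30.3/32\t1667301648\t100\t200\t20\n"
)
```

With the sample above, `peers_without_handshake(parse_wg_dump(SAMPLE_DUMP))` returns only the `peerA` record, which also has no endpoint.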

mattkasun commented 1 year ago

> Well trying netcat again with the server down I see the same result

If the server is down and netcat still gets a response, that indicates something else is listening on that IP/port. Is your server's public IP a load balancer?

PyMichaelB commented 1 year ago

> To confirm it's not an MTU issue, do you see a handshake with the Netmaker node in "wg show"?

No, I don't see handshakes.

PyMichaelB commented 1 year ago

> Well trying netcat again with the server down I see the same result
>
> If the server is down and netcat still gets a response, that indicates something else is listening on that IP/port. Is your server's public IP a load balancer?

OK, trying again: when I type "netcat -u < server IP > 51822" and hit enter twice, the connection ends.

I tried changing the "dynamic port" setting to OFF (only for the netmaker-1 node), and now the connection is available (on the same port) and I can ping from other nodes.