newsnowlabs / docker-ingress-routing-daemon

Docker swarm daemon that modifies ingress mesh routing to expose true client IPs to service containers
MIT License

Error: Nexthop has invalid gateway. #32

Closed josimarjmv closed 1 year ago

josimarjmv commented 1 year ago

Hello Team,

First and foremost, I'd like to thank you for this solution. It has the potential to address a significant challenge I've been facing. However, when trying to implement it, I encountered an error that I'm struggling to understand. I've checked the network configurations, and everything seems to be in order.

My question is: Do I need to use the host network in swarm for this script to work correctly? Below is the error I'm receiving:

Error: Nexthop has invalid gateway

I would greatly appreciate any guidance or assistance you can provide on this matter. Best regards

struanb commented 1 year ago

Hi @josimarmachado. Thanks for trying DIRD.

No, you should not use host networking with DIRD. You should publish ports for your public-facing Docker services in the usual way.

Assuming the issue is repeatable for you, please provide the command line you're launching DIRD with, the command for launching one of your Docker services, and the DIRD logs.

josimarjmv commented 1 year ago

Hi @struanb, I would like to thank you for your initiative in developing this solution and also for your attention in responding to me :)

I use Ubuntu 20 and HAProxy with Docker Swarm. I tried scaling the service to 0 replicas and then to 1, and I also tried removing the container; every case returned the error.

I tried this command:

./docker-ingress-routing-daemon --install --ingress-gateway-ips my-public-ip --services sandbox_cdn_haproxy --tcp-ports 1925

and

./docker-ingress-routing-daemon --install --ingress-gateway-ips my-public-ip --services sandbox_cdn_haproxy

In all cases, the log is:

root@sandbox:/home/josimar/docker-ingress-routing-daemon# ./docker-ingress-routing-daemon --install --ingress-gateway-ips [mypublicip] --services sandbox_cdn_haproxy
2023-08-20.18:34:22.113409|sandbox|665881| Docker Ingress Routing Daemon 4.1.1 starting ...
2023-08-20.18:34:22.158263|sandbox|665881| Detecting ingress network and node IP:
2023-08-20.18:34:22.164065|sandbox|665881| - Ingress subnet: 10.0.0.0/24
2023-08-20.18:34:22.169687|sandbox|665881| - This node's IP: 10.0.0.4
2023-08-20.18:34:22.176565|sandbox|665881| Cleaning up any stale load-balancer rules ...
2023-08-20.18:34:22.200485|sandbox|665881| Enumerating load balancers from --ingress-gateway-ips [mypublicip]
2023-08-20.18:34:22.213240|sandbox|665881| - Load balancer [mypublicip] will have ID 137
2023-08-20.18:34:22.221050|sandbox|665881| This node is not a specified load balancer; so skipping installing ingress namespace iptables rules
2023-08-20.18:34:22.227976|sandbox|665881| Setting ingress_sbox namespace sysctl variables:
2023-08-20.18:34:22.234846|sandbox|665881| - Setting net.ipv4.vs.conn_reuse_mode=0 net.ipv4.vs.expire_nodest_conn=1 net.ipv4.vs.expire_quiescent_template=1
2023-08-20.18:34:22.245200|sandbox|665881| Launching docker event watcher to monitor for container launches (pgroup 665881) ...
2023-08-20.18:35:54.123885|sandbox|665881| Detected container launch for service 'sandbox_cdn_haproxy', with ID 'a389f2877d44e8a4c9df4fcd4032a68b04f9a73171145afbacc12f39a8be3528' and NID '670580': ingress network interface eth0 found, so applying policy routing/firewall rules:
2023-08-20.18:35:54.131276|sandbox|665881| - Adding container mangle table iptables rules
2023-08-20.18:35:54.151173|sandbox|665881| - Setting container sysctl net.ipv4.conf.all.rp_filter=2 net.ipv4.conf.eth0.rp_filter=2
2023-08-20.18:35:54.164820|sandbox|665881| - Adding container policy routing/firewall rules for load-balancer #137 with IP [mypublicip]
Error: Nexthop has invalid gateway.
2023-08-20.18:35:54.187309|sandbox|665881| - Finished configuring launched container

I noticed this line: sysctl net.ipv4.conf.all.rp_filter=2 net.ipv4.conf.eth0.rp_filter=2. Inside the container the network interface is eth0, which is fine, but on my server the interface does not have this name; it has a different one.

I don't know if that would affect it. Another thing I tested: inside the container, there are networks using my public IP.

My HAProxy compose.yml exposes port 1925, and my HAProxy uses a global bind *:1925.

struanb commented 1 year ago

You're masking my-public-ip in your comment, which suggests a problem with how you're using DIRD.

The --ingress-gateway-ips IP list should be the (comma-separated) ingress network IPs of those Docker swarm nodes receiving incoming public Internet traffic, not the actual public IPs of those nodes.

So for the node from which you've shared logs, this value should be 10.0.0.4 (as shown in the logs under "This node's IP").

(That will be enough if that's the only node receiving public traffic. If other nodes also receive traffic, you must include their ingress network IPs too, or you can simply list the ingress IPs of all your nodes. In any case, every IP in this value must come from the ingress subnet, which is 10.0.0.0/24 in your logs.)

Once you have identified the correct value for --ingress-gateway-ips you must also be sure to run the same command on each and every node in your swarm.
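As a rough illustration of the "must be from the ingress subnet" rule, here is a minimal POSIX shell sketch that checks candidate --ingress-gateway-ips values against the subnet reported in the DIRD log. The helper names ip_to_int and ip_in_subnet are hypothetical and not part of DIRD, and 203.0.113.10 is only a documentation placeholder standing in for a public IP.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DIRD): sanity-check that each candidate
# --ingress-gateway-ips value lies inside the ingress subnet from the log.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit 0) if address $1 lies inside CIDR subnet $2, e.g. 10.0.0.0/24.
ip_in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask: the top $bits bits set, truncated to 32 bits.
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

subnet=10.0.0.0/24    # "Ingress subnet" value taken from the DIRD log
for ip in 10.0.0.4 203.0.113.10; do
    if ip_in_subnet "$ip" "$subnet"; then
        echo "$ip: ok"
    else
        echo "$ip: outside $subnet"
    fi
done
```

With the values from this thread, 10.0.0.4 passes and the placeholder public IP is rejected, so the first command tried above would become: ./docker-ingress-routing-daemon --install --ingress-gateway-ips 10.0.0.4 --services sandbox_cdn_haproxy --tcp-ports 1925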

I hope that all makes sense and allows you to get DIRD working.

josimarjmv commented 1 year ago

Wow, I don't know if I'm sad or happy: happy because it's working perfectly, or sad because I wasted your time with something so stupid. Thank you so much.

struanb commented 1 year ago

Ha, don’t worry, you’re welcome.

Glad you now have it working.