How to be compatible with the original service

Jasonlxl commented 3 years ago

Dear struanb, I tried to follow your instructions in our Docker Swarm cluster, but did not get the expected results.

Our cluster has 7 nodes and the ingress network is 10.255.0.0/16. The service for which we want to obtain the real client IP is Nginx, which is deployed with multiple replicas in the cluster. I scaled the Nginx service down to 0 replicas, ran docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install on each node, and then scaled the Nginx service back up to its original number of replicas.

After these steps, the published ports of all services deployed in the cluster became unreachable. For example, Portainer's port 9000 could not be accessed.

To keep the test as small as possible, we then chose a single node, node A, to act as both the load-balancer node and the node hosting the service replica. Following your instructions, we ran docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install on node A only. This time the port of the Nginx service was reachable and the log showed the real IP, but all the reverse proxy rules configured in Nginx behaved abnormally. The Nginx HTTP log shows a return code of 499 for requests passed by the reverse proxy to other ports in the cluster. Requests to the ports of other services deployed on this node still fail, whereas services on nodes where the docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install command was not run behave normally. For example, requesting port 9000 on node A fails, but requesting port 9000 on any other node reaches Portainer normally.

Have we misunderstood the usage you described? Or is there something wrong with how we ran it? We are very eager to use your daemon. Thank you!

Environment: SUSE 12 SP3, kernel version 4.4.82-6.3-default, Docker version 18.09.9

struanb commented 3 years ago

Hi @Jasonlxl. Thanks for posting this as a new issue. I've been giving it some thought.

The first thing that strikes me is your ingress network. As the README states:

As, currently, the unique NODE_ID is determined from the load-balancer node's ingress network IP, the ingress network cannot be larger than a /24.

As such, with a /16 network the DIRD (docker-ingress-routing-daemon) approach will fail as soon as two nodes have the same final IP address byte on this network. It might work if the final IP address bytes are all different, which should always be true if you are testing with the load balancer and the service container running on the same node (your node A). N.B. a /16 network is an unsupported configuration that I have not tested.
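As a rough illustration of the constraint above (this is not part of DIRD), the following Python sketch checks whether any two node ingress IPs share a final byte. The function name and the example addresses are hypothetical; substitute the ingress IPs of your own nodes.

```python
from collections import defaultdict
from ipaddress import ip_address


def final_byte_collisions(ingress_ips):
    """Group IPs by their final byte; return only groups with more than one IP."""
    by_final_byte = defaultdict(list)
    for ip in ingress_ips:
        # The low 8 bits of an IPv4 address are its final byte.
        final_byte = int(ip_address(ip)) & 0xFF
        by_final_byte[final_byte].append(ip)
    return {b: ips for b, ips in by_final_byte.items() if len(ips) > 1}


# Hypothetical node ingress IPs on a 10.255.0.0/16 ingress network.
node_ips = ["10.255.0.2", "10.255.0.3", "10.255.1.2"]
print(final_byte_collisions(node_ips))  # {2: ['10.255.0.2', '10.255.1.2']}
```

An empty result means every node has a distinct final byte. That is the only case in which a network larger than a /24 might work, and it remains unsupported.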

Secondly, if you have multiple services running and you only want DIRD to apply to some of them, you must run with --services and --tcp-ports (giving the TCP ports that correspond to the listed services). Otherwise DIRD will apply to containers launched for all services, and will intercept traffic for all service TCP ports. Can you re-run with these options on your single node A?
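For illustration only, here is a small Python sketch that assembles such a restricted invocation from the options named in this thread. The helper name, the service name nginx, the IP address and the port numbers are hypothetical, and the comma-separated value format is an assumption that should be checked against the README before use.

```python
import shlex


def build_dird_command(gateway_ips, services, tcp_ports):
    """Assemble a DIRD command line limited to the given services and TCP ports."""
    # ASSUMPTION: each option takes a comma-separated list; verify in the README.
    return [
        "docker-ingress-routing-daemon",
        "--ingress-gateway-ips", ",".join(gateway_ips),
        "--services", ",".join(services),
        "--tcp-ports", ",".join(str(port) for port in tcp_ports),
        "--install",
    ]


# Hypothetical values: one load-balancer node, one service, two ports.
cmd = build_dird_command(["10.255.0.2"], ["nginx"], [80, 443])
print(shlex.join(cmd))
```

The sketch only prints the command. Run the printed line on the node yourself once the values match your cluster.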

Jasonlxl commented 3 years ago

Thanks, @struanb. It all works like a charm! I can't find words to describe how grateful we are for your solution and for solving our problem.

Following your suggestion, we added --tcp-ports <ports> to the command, so that the real IP is obtained only on the ports where it is needed and the existing services are not affected.

The /16 subnet mask case does need to be kept in mind. Currently, no error occurs as long as the final IP address bytes of the nodes are all different.

Thank you for your detailed analysis and plan, thank you very much! I think we can safely close this issue.

newsnowlabs / docker-ingress-routing-daemon

How to be compatible with the original service #7