newsnowlabs / docker-ingress-routing-daemon

Docker swarm daemon that modifies ingress mesh routing to expose true client IPs to service containers
MIT License

Locked by xtables.lock #6

Closed: salexer closed this issue 3 years ago

salexer commented 3 years ago

Hi @struanb.

I'm trying to run your script, but it doesn't work.

./docker-ingress-routing-daemon --install --ingress-gateway-ips 10.0.0.2 --services dev_gate
2021-04-01.21:50:56.171823|dev|1715414| Docker Ingress Routing Daemon 3.1.0 starting ...
2021-04-01.21:50:56.217189|dev|1715414| Detected ingress subnet: 10.0.0.0/24
2021-04-01.21:50:56.222036|dev|1715414| This node's ingress network IP: 10.0.0.2
2021-04-01.21:50:56.238838|dev|1715414| Running with --ingress-gateway-ips 10.0.0.2
2021-04-01.21:50:56.245795|dev|1715414| This node's ID is: 2
2021-04-01.21:50:56.250709|dev|1715414| Adding ingress_sbox iptables nat rule: iptables -t nat -I POSTROUTING -d 10.0.0.0/24 -m ipvs --ipvs -j ACCEPT
2021-04-01.21:50:56.283621|dev|1715414| Adding ingress_sbox iptables mangle rule: iptables -t mangle -A POSTROUTING -d 10.0.0.0/24 -j TOS --set-tos 2/0xff
2021-04-01.21:50:56.315362|dev|1715414| Adding ingress_sbox connection tracking disable rule: iptables -t raw -I PREROUTING -j CT --notrack
2021-04-01.21:50:56.355834|dev|1715414| Setting ingress_sbox namespace sysctl variables net.ipv4.vs.conn_reuse_mode=0 net.ipv4.vs.expire_nodest_conn=1 net.ipv4.vs.expire_quiescent_template=1
net.ipv4.vs.conn_reuse_mode = 0
net.ipv4.vs.expire_nodest_conn = 1
net.ipv4.vs.expire_quiescent_template = 1
2021-04-01.21:50:56.363160|dev|1715414| Setting ingress_sbox namespace sysctl conntrack variables from /etc/sysctl.d/conntrack.conf
2021-04-01.21:50:56.367601|dev|1715414| Setting ingress_sbox namespace sysctl ipvs variables from /etc/sysctl.d/ipvs.conf
2021-04-01.21:50:56.371820|dev|1715414| Docker Ingress Routing Daemon launching docker event watcher in pgroup 1715414 ...
2021-04-01.21:51:24.497593|dev|1715414| Container SERVICE=dev_gate, ID=eb8796258552d77066dd4acb5765f4e5b7242a96c5a35bfaba8e05effc0fb251, NID=1715828 launched: ingress network interface eth1 found, so applying policy routes.
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2
2021-04-01.21:57:53.416856|dev|1715414| Container SERVICE=dev_gate, ID=c6e55c6072e92af156a40d7ae1b8d2e724957877374d55499d95ed70fe682ccf, NID=1719661 launched: ingress network interface eth1 found, so applying policy routes.
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2

Before each of 2021-04-01.21:51:24.497593 and 2021-04-01.21:57:53.416856, I scaled the service from 0 to 1...
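That is, each of those log entries followed a scale change along these lines (commands shown for illustration, using the dev_gate service from the log above):

# Scale the service down and back up to trigger a new container
docker service scale dev_gate=0
docker service scale dev_gate=1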

Ubuntu 20.04.1 LTS; Docker version 19.03.13, build 4484c46d9d

Thank you!

struanb commented 3 years ago

Hi @salexer. Thanks for your patience while I got back to you.

I haven't experienced this issue myself, but the message - "Another app is currently holding the xtables lock" - suggests precisely that you have another process running that is holding the xtables lock. iptables needs to obtain this lock in order to modify iptables (xtables) rules, and docker-ingress-routing-daemon needs to call iptables - hence the problem.

Can you first check whether you still get this error message and, if so, inspect your process table to find out which process could be holding the xtables lock? If in doubt, please share any candidate processes, or a full process list. I am not familiar with which processes do this, but I wonder whether some kind of firewall rule management daemon may be responsible.
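For example (assuming the lock file is at its usual /run/xtables.lock path), something like this might reveal a candidate process:

# Show any process that currently has the xtables lock file open
sudo fuser -v /run/xtables.lock

# Look for firewall/iptables management daemons that could be competing for the lock
ps aux | grep -E 'iptables|firewalld|ufw|nft' | grep -v grep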

N.B. I don't believe adding the -w option to iptables would help, as I suspect it would just cause iptables to block.
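For reference, -w only tells iptables to wait for the lock rather than fail immediately, e.g.:

# Wait up to 5 seconds for the xtables lock before giving up
iptables -w 5 -t nat -L -n

so it would mask the symptom rather than remove whatever is holding the lock.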

Jasonlxl commented 3 years ago

Dear struanb, I tried to follow the instructions you provided in our Docker Swarm cluster, but we did not obtain the expected results.

Our cluster has 7 nodes and the ingress network is 10.255.0.0/16. The service for which we want to obtain the real client IP is an Nginx service deployed in the cluster with multiple replicas. I scaled the Nginx service down to 0 replicas, then ran docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install on each node, and then scaled the Nginx service back up to its original replica count.
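Concretely, the sequence we ran looked roughly like this (service name is illustrative; the IP list is as in your documentation):

# Scale the Nginx service down to zero replicas
docker service scale nginx=0

# On every node, install the daemon for our ingress gateway IPs
docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install

# Scale back up to the original replica count
docker service scale nginx=7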

These operations caused all the published ports of the services deployed in the cluster to become unreachable; for example, Portainer's port 9000 could not be accessed.

To keep things minimal, we then chose a single node, node A, to serve as both the load-balancer node and the node hosting the service replicas. Following your instructions, we ran docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install on node A only. This time the Nginx service's port could be reached and its log showed the real client IP, but the reverse-proxy rules configured in Nginx all misbehaved: the Nginx HTTP log showed return code 499 for requests proxied to other ports in the cluster. Meanwhile, other services published on node A still failed, while the same services remained normal on nodes where docker-ingress-routing-daemon --ingress-gateway-ips <Node Ingress IP List> --install had not been run. For example, requesting port 9000 on node A failed, but requesting port 9000 on other nodes reached Portainer normally.

Have we misunderstood the usage you described, or is there something wrong with our procedure? We are very eager to use your daemon. Thank you!

struanb commented 3 years ago

Hi @Jasonlxl. Thanks for trying DIRD. Would you mind copying your question into a separate issue? I think this is unrelated to xtables.lock. I will consider your question in the meantime, and will reply as soon as you have created the new issue.

Jasonlxl commented 3 years ago

Alright, I've created a new issue. Thank you!

struanb commented 3 years ago

Since the original error message - "Another app is currently holding the xtables lock" - indicates that another running process is holding the xtables lock, and this does not appear to be an issue with DIRD itself, I am closing this issue.