Open icymoon opened 6 years ago
We'll take a look.
Thanks for the feedback, @icymoon. We did find some bugs, and we will fix them in the next version.
@icymoon, this bug didn't appear in my environment, but there are really some bugs here. Please change code like this or just use https://github.com/mscbg/dpvs.git:

diff --git a/include/ipvs/dest.h b/include/ipvs/dest.h
index 8576cbe..0d52b3e 100644
--- a/include/ipvs/dest.h
+++ b/include/ipvs/dest.h
@@ -75,6 +75,7 @@ struct dp_vs_dest {
     uint32_t vfwmark; /* firewall mark of service */
     struct dp_vs_service *svc; /* service it belongs to */
     union inet_addr vaddr; /* virtual IP address */
+    unsigned conn_timeout; /* conn timeout copied from svc*/
 } __rte_cache_aligned;
 
 struct dp_vs_dest_conf {
diff --git a/src/Makefile b/src/Makefile
index 191d17f..3d7eb1f 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -19,7 +19,7 @@
 # Makefile for dpvs (DPVS main program).
 #
 
-#DEBUG := 1 # enable for debug
+DEBUG := 1 # enable for debug
 
 TARGET := dpvs
 
diff --git a/src/ipvs/ip_vs_dest.c b/src/ipvs/ip_vs_dest.c
index 0e5ab83..bfaeb83 100644
--- a/src/ipvs/ip_vs_dest.c
+++ b/src/ipvs/ip_vs_dest.c
@@ -223,6 +223,7 @@ int dp_vs_new_dest(struct dp_vs_service *svc, struct dp_vs_dest_conf *udest,
     dest->proto = svc->proto;
     dest->vaddr = svc->addr;
     dest->vport = svc->port;
+    dest->conn_timeout = svc->conn_timeout;
     dest->vfwmark = svc->fwmark;
     dest->addr = udest->addr;
     dest->port = udest->port;
diff --git a/src/ipvs/ip_vs_service.c b/src/ipvs/ip_vs_service.c
index 8d249d0..39e4139 100644
--- a/src/ipvs/ip_vs_service.c
+++ b/src/ipvs/ip_vs_service.c
@@ -664,10 +664,8 @@ out:
 unsigned dp_vs_get_conn_timeout(struct dp_vs_conn *conn)
 {
     unsigned conn_timeout;
-    if (conn->dest->svc) {
-        rte_atomic32_inc(&(conn->dest->svc->usecnt));
-        conn_timeout = conn->dest->svc->conn_timeout;
-        rte_atomic32_dec(&(conn->dest->svc->usecnt));
+    if (conn->dest) {
+        conn_timeout = conn->dest->conn_timeout;
         return conn_timeout;
     }
     return 90;
If the crash happens again, please show me the coredump opened in gdb (enable DEBUG in the Makefile first).
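The last hunk of the patch above stops reading the timeout through `conn->dest->svc` and reads a copy stored on the dest instead. A minimal sketch of that idea follows, assuming the motivation is that the service can be removed (for example by `ipvsadm -C`) while connections still point at the dest. The `*_sketch` structs and function names are hypothetical, heavily simplified stand-ins, not the real DPVS definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the DPVS structs in the patch. */
struct svc_sketch {
    unsigned conn_timeout;
};

struct dest_sketch {
    struct svc_sketch *svc;  /* assumed to become invalid once the service is deleted */
    unsigned conn_timeout;   /* copied from svc when the dest is created */
};

struct conn_sketch {
    struct dest_sketch *dest;
};

#define DEFAULT_CONN_TIMEOUT 90  /* same fallback value as in the patch */

/* Mirrors the dp_vs_new_dest() hunk: copy the timeout at creation time. */
static void dest_init(struct dest_sketch *dest, struct svc_sketch *svc)
{
    assert(dest != NULL && svc != NULL);
    dest->svc = svc;
    dest->conn_timeout = svc->conn_timeout;
}

/* Mirrors the dp_vs_get_conn_timeout() hunk: never dereference dest->svc. */
static unsigned get_conn_timeout(const struct conn_sketch *conn)
{
    if (conn->dest)
        return conn->dest->conn_timeout;
    return DEFAULT_CONN_TIMEOUT;
}
```

With this layout, a connection whose dest outlives its service still gets the timeout that was configured when the dest was created, and a connection without a dest falls back to 90.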
Updated to the latest version. New problem: many connections are not released after a long period of heavy load.
Debug is enabled:
CFLAGS += -D CONFIG_RECORD_BIG_LOOP
CFLAGS += -D CONFIG_DPVS_SAPOOL_DEBUG
CFLAGS += -D CONFIG_DPVS_IPVS_DEBUG
CFLAGS += -D CONFIG_SYNPROXY_DEBUG
CFLAGS += -D CONFIG_TIMER_MEASURE
# ./ipvsadm -ln && sleep 300 && ./ipvsadm -ln
IP Virtual Server version 0.0.0 (size=0)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
MATCH icmp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    0          39409
  -> 192.168.10.4:0               SNAT    100    0          43639
  -> 192.168.10.5:0               SNAT    100    0          56142
  -> 192.168.10.6:0               SNAT    100    0          918881
MATCH tcp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    1          6862
  -> 192.168.10.4:0               SNAT    100    1          6885
  -> 192.168.10.5:0               SNAT    100    2          6997
  -> 192.168.10.6:0               SNAT    100    4          7420
MATCH udp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    0          64074
  -> 192.168.10.4:0               SNAT    100    0          64065
  -> 192.168.10.5:0               SNAT    100    0          64069
  -> 192.168.10.6:0               SNAT    100    0          64019
IP Virtual Server version 0.0.0 (size=0)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
MATCH icmp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    0          39409
  -> 192.168.10.4:0               SNAT    100    0          43639
  -> 192.168.10.5:0               SNAT    100    0          56142
  -> 192.168.10.6:0               SNAT    100    0          918881
MATCH tcp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    1          6862
  -> 192.168.10.4:0               SNAT    100    1          6885
  -> 192.168.10.5:0               SNAT    100    2          6997
  -> 192.168.10.6:0               SNAT    100    4          7420
MATCH udp,from=172.16.0.0-172.16.255.254:0-0,oif=dpdk1 rr
  -> 192.168.10.2:0               SNAT    100    0          64074
  -> 192.168.10.4:0               SNAT    100    0          64065
  -> 192.168.10.5:0               SNAT    100    0          64069
  -> 192.168.10.6:0               SNAT    100    0          64019
Did you see any logs still being produced after the pressure stopped, such as 'connection is busy: conn->refcnt = XXX' or 'connection not hashed'?
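For readers wondering why that log line matters for a leak: the message suggests that a connection whose reference count is still elevated is not freed when its timer fires. The toy model below illustrates that idea only. It is an assumption about the general pattern, not DPVS's actual expire code, and every name in it (`conn_ref_sketch`, `conn_expire_sketch`, the `rearmed` counter) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a reference-counted connection entry. */
struct conn_ref_sketch {
    int  refcnt;   /* 1 means only the connection table holds a reference */
    bool hashed;   /* true while the entry is still in the connection table */
    int  rearmed;  /* how many times expiry was postponed */
};

/*
 * Called when the connection's timer fires.
 * Returns true if the connection was released, false if it was kept
 * because someone else still holds a reference (the "busy" case).
 */
static bool conn_expire_sketch(struct conn_ref_sketch *conn)
{
    assert(conn != NULL);
    if (conn->refcnt == 1 && conn->hashed) {
        conn->hashed = false;  /* remove from the table */
        conn->refcnt = 0;      /* drop the table's reference */
        return true;
    }
    conn->rearmed++;           /* busy: postpone and try again later */
    return false;
}
```

In this model, a reference that is taken but never dropped keeps the entry in the "busy" branch forever, which would show up as an InActConn count that never goes down and a busy message repeated on every expiry attempt.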
I ran a script under pressure for a whole night, and all connections were released. The script looks like this:
#!/bin/sh
while true
do
    ipvsadm -ln
    ipvsadm -C
    ipvsadm -A -s rr -H proto=tcp,src-range=X-X,oif=dpdk0
    ipvsadm -a -H proto=tcp,src-range=X-X,oif=dpdk0 -r X:0 -w 100 -J
done
I have also tested fullnat mode by restarting keepalived and by clearing and re-adding the rules with ipvsadm; connections are released after traffic stops. So, can you run tcpdump on the client to check whether some connections are really still there? And which tool did you use to generate traffic?
Flood ping, curl for HTTP, dig for DNS lookups.
And many kinds of TCP/UDP/ICMP packets built and sent by a scapy script:
1) ICMP echo-request & echo-reply
2) TCP syn, synack, ack, fin, finack, rst...
3) UDP..
with a large src/dst IP/port range
It can't be reproduced now. I will run a test over this weekend.
Thank you.
I think this issue can be closed now. I tested it over a weekend and no core file was generated. Thanks.
No session leak was found last night, thank you.
Excuse me, may I have a look at your test scripts? I'm new to dpvs and scapy, and I'm hoping to learn from the experts. Thanks a lot.
The dpvs rules and pressure look like this:
At the same time, run ipvsadm commands:
After several hours, dpvs crashed.