I've added the following patch on top of #49:

```diff
diff --git a/src/glb-redirect/ipt_GLBREDIRECT.c b/src/glb-redirect/ipt_GLBREDIRECT.c
index 62b9cf6..8493c07 100644
--- a/src/glb-redirect/ipt_GLBREDIRECT.c
+++ b/src/glb-redirect/ipt_GLBREDIRECT.c
@@ -41,7 +41,7 @@
 #define MAX_HOPS 256
-#ifdef DEBUG
+#if 1
 #define PRINT_DEBUG(...) printk(__VA_ARGS__)
 /* like WARN_ON(), except returns NF_DROP as well */
 #define DROP_ON(condition) do { \
@@ -509,6 +509,7 @@ static unsigned int is_valid_locally(struct net *net, struct sk_buff *skb, int i
 }
 if (nsk) {
+ PRINT_DEBUG(KERN_ERR "established with state %d (TCP_NEW_SYN_RECV = %d)\n", nsk->sk_state, TCP_NEW_SYN_RECV);
 sock_put(nsk);
 return 1;
 }
```
Running `TestGLBRedirectModuleV4OnV4.test_04_icmp_packet_too_big`, I get the following output on `proxy2`:

```
[14595.400949] IP<0532a8c0,0b32a8c0> GUE<> IPv4<0532a8c0,0a0a0a0a> TCP<49018,22 flags ..R..>
[14595.401971] -> no more alternative hops available, accept here regardless
[14595.750716] IP<0532a8c0,0b32a8c0> GUE<> IPv4<0532a8c0,0a0a0a0a> TCP<49018,22 flags S....>
[14595.751677] -> SYN packet, accepting locally
[14595.878116] IP<0a32a8c0,0b32a8c0> GUE<> IPv4<> ICMPv4<> IPv4<0532a8c0,0a0a0a0a> TCP<49018,22 flags .....>
[14595.879217] -> checking for local matching connection
[14595.879700] -> checking for established
[14595.880129] established with state 12 (TCP_NEW_SYN_RECV = 12)
[14595.880620] -> matched local flow, accepting
```
I think this validates my theory, at least for kernels > 4.9.0.
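For reference, the `state 12` in that log line matches the kernel's TCP state numbering. This is the enum from `include/net/tcp_states.h` as I remember it (abridged, worth double-checking against your tree):

```c
/* include/net/tcp_states.h (abridged, from memory) */
enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,
	TCP_SYN_RECV,
	TCP_FIN_WAIT1,
	TCP_FIN_WAIT2,
	TCP_TIME_WAIT,
	TCP_CLOSE,
	TCP_CLOSE_WAIT,
	TCP_LAST_ACK,
	TCP_LISTEN,
	TCP_CLOSING,
	TCP_NEW_SYN_RECV,	/* = 12: SYN-ACK sent, request sock hashed, 3WHS ACK pending */

	TCP_MAX_STATES	/* leave at the end! */
};
```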
With #52 merged I took some time to look at the metrics from our deployment, specifically `accepted_conntracked_packets` and `accepted_syn_packets`. SYN packets make up roughly 4% of all traffic, while packets accepted due to conntrack sit at 0 most of the time. If conntrack were still required for the 3WHS ACK, I'd expect these two counters to be roughly the same, since every handshake contributes one SYN and one final ACK.

I said conntrack-accepted packets are 0 most of the time: there are blips from time to time where about 0.001% of traffic is accepted due to conntrack. I still need to track down what is happening there, but I'm convinced that we don't need conntrack.
This is a follow-up to https://github.com/github/glb-director/issues/44#issuecomment-434981307
I dove into the 4.19 sources and found that new connections are created in `tcp_conn_request`:

- `inet_reqsk_alloc`, which creates a new `struct request_sock` with `sk_state` equal to `TCP_NEW_SYN_RECV`
- `inet_csk_reqsk_queue_hash_add`, which ends up adding the `struct request_sock` into `tcp_hashinfo` (see the sketch below)
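To make that chain concrete, here is a hand-condensed sketch of the SYN path as I read it in 4.19. This is my own paraphrase, not verbatim kernel source (the real code lives in `net/ipv4/tcp_input.c` and `net/ipv4/inet_connection_sock.c`), so treat the exact arguments as an approximation:

```c
#include <net/inet_connection_sock.h>
#include <net/request_sock.h>
#include <net/tcp.h>

/* Illustrative condensation of tcp_conn_request() for an incoming SYN on a
 * listener -- not the real kernel function, just the two steps that matter here. */
static void syn_path_sketch(struct sock *listener)
{
	/* Step 1: inet_reqsk_alloc() creates the request sock and sets its
	 * state (ireq_state, which aliases sk_state via struct sock_common)
	 * to TCP_NEW_SYN_RECV. */
	struct request_sock *req =
		inet_reqsk_alloc(&tcp_request_sock_ops, listener, true);

	if (!req)
		return;

	/* Step 2: inet_csk_reqsk_queue_hash_add() -> reqsk_queue_hash_req()
	 * -> inet_ehash_insert() puts the request sock into the established
	 * hash of tcp_hashinfo. From here on __inet_lookup_established() will
	 * return it, still in TCP_NEW_SYN_RECV, until the 3WHS ACK arrives
	 * and it is promoted to a full socket. */
	inet_csk_reqsk_queue_hash_add(listener, req, TCP_TIMEOUT_INIT);
}
```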
This would mean my conjecture in the original comment is almost correct. `inet_lookup_established` does return these sockets where the SYN-ACK is outstanding, but in state `TCP_NEW_SYN_RECV`. I think I've found the place where the `request_sock` is turned into a full socket: `tcp_v4_rcv`. Maybe the whole conntrack part of the code is not required anymore?
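If that holds, the established-hash lookup alone should match both full sockets and half-open connections. Below is a minimal sketch of what a conntrack-free check could look like on a 4.19-ish kernel; this is not the actual `is_valid_locally()` code, and the function name, parameters, and the 8-argument `__inet_lookup_established()` signature are assumptions to verify against the module and your kernel version:

```c
#include <linux/ip.h>
#include <linux/tcp.h>
#include <net/inet_hashtables.h>
#include <net/inet_timewait_sock.h>
#include <net/request_sock.h>
#include <net/tcp.h>

/* Hypothetical helper: accept the inner flow if its 4-tuple matches either an
 * established socket or a request sock (TCP_NEW_SYN_RECV) in tcp_hashinfo. */
static int inner_flow_matches_local_socket(struct net *net,
					   const struct iphdr *iph,
					   const struct tcphdr *th, int dif)
{
	struct sock *nsk;

	nsk = __inet_lookup_established(net, &tcp_hashinfo,
					iph->saddr, th->source,
					iph->daddr, ntohs(th->dest),
					dif, 0 /* sdif */);
	if (!nsk)
		return 0;

	/* Half-open connections show up here in TCP_NEW_SYN_RECV; they are
	 * request socks and are released with reqsk_put() rather than
	 * sock_put(). No conntrack lookup is needed to match them. */
	if (nsk->sk_state == TCP_NEW_SYN_RECV)
		reqsk_put(inet_reqsk(nsk));
	else if (nsk->sk_state == TCP_TIME_WAIT)
		inet_twsk_put(inet_twsk(nsk));
	else
		sock_put(nsk);

	return 1;
}
```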