Closed heimdull closed 3 years ago
I think that configuring the proxy to skip protocol detection for the ports used by memcached should also solve this issue. We may want to add port 11211 (the memcached registered port) to the list of ports that skip protocol detection by default, in case persistent connections are being used?
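Until a default like that exists, a per-workload workaround can be sketched as follows. It assumes the `config.linkerd.io/skip-outbound-ports` pod annotation is acceptable here; note that this annotation makes outbound traffic to the listed ports bypass the proxy entirely, which is stronger than only skipping protocol detection. The helper name `skip_outbound_ports_patch` is made up for illustration.

```python
import json


def skip_outbound_ports_patch(ports):
    """Build a merge-patch body that sets the Linkerd skip-outbound-ports
    annotation on a workload's pod template (hypothetical helper)."""
    value = ",".join(str(p) for p in ports)
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Outbound traffic to these ports is not routed
                        # through the Linkerd proxy at all.
                        "config.linkerd.io/skip-outbound-ports": value
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # 11211 is the memcached registered port mentioned above.
    print(json.dumps(skip_outbound_ports_patch([11211])))
```

The printed JSON could then be passed to something like `kubectl patch deployment <name> --patch '<json>'`; changing the pod template triggers a rollout, so the new pods pick up the annotation.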
This might get fixed as part of the upcoming TCP mTLS and server-speak-first work. Let’s check back in then!
Bug Report
What is the issue?
We have a few clusters running Linkerd 2.7.1. We tried to upgrade, but every version after 2.7.1 drops our memcached connections. Our Tomcat containers connect to memcached servers outside of the Kubernetes/Linkerd cluster, and after the upgrade these connections are dropped. Rolling back to 2.7.1 resolves the issue.
We tested upgrades to all available versions after 2.7.1, and all of them show the same dropped connections.
How can it be reproduced?
A Tomcat container that holds persistent connections to a memcached host outside the cluster should show the issue.
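For anyone without the Tomcat setup at hand, a rough approximation is a small client that keeps one TCP connection open and sends memcached text-protocol `set` commands with idle gaps in between. This is only a sketch, not the reporter's actual client (which is spymemcached, per the logs); the function names, key prefix, and timing parameters are assumptions.

```python
import socket
import time


def memcached_set(sock, key, value, exptime=0):
    """Send one text-protocol `set` on an open socket and return the reply line."""
    header = f"set {key} 0 {exptime} {len(value)}\r\n".encode("ascii")
    sock.sendall(header + value + b"\r\n")
    reply = b""
    while not reply.endswith(b"\r\n"):
        chunk = sock.recv(1024)
        if not chunk:
            # An empty read means the peer (or a proxy in between) closed us.
            raise ConnectionError("peer closed the connection")
        reply += chunk
    return reply.decode("ascii").strip()


def probe(host, port, rounds, idle_seconds):
    """Reuse a single connection for `rounds` sets, idling between them."""
    replies = []
    with socket.create_connection((host, port), timeout=5) as sock:
        for i in range(rounds):
            replies.append(memcached_set(sock, f"probe.{i}", b"0123456789abcdef"))
            time.sleep(idle_seconds)
    return replies
```

Run from inside a meshed pod, for example `probe("memcached.example.internal", 11211, rounds=10, idle_seconds=30)` with a placeholder host name. A healthy persistent connection returns `STORED` for every round, while a dropped one surfaces as a `ConnectionError` or a socket error partway through.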
Logs, error output, etc
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 2252 31.968026877 10.21.40.53 → 10.42.1.35 TCP 68 11211 → 53312 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=222176710 TSecr=1190770320
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 2253 31.968202261 10.21.40.54 → 10.42.1.35 TCP 68 11211 → 37636 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=222176710 TSecr=1310269368
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 2809 91.370066554 10.21.40.55 → 10.42.1.35 TCP 56 11211 → 56552 [RST] Seq=2 Win=0 Len=0
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 2956 93.378264743 10.21.40.55 → 10.42.1.35 TCP 76 11211 → 58108 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=222238121 TSecr=2296207651 WS=128
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 2972 96.379448285 10.21.40.55 → 10.42.1.35 TCP 68 11211 → 58108 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=222241122 TSecr=2296207651
shard-cdr-7c6fb9d89d-jlhcd tomcat 2020-06-29 15:26:07.758 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to exception on {QA sa=shard1mem1.dev.youmail.com/10.21.40.55:11211, #Rops=1, #Wops=0, #iq=0, topRop=Cmd: set Key: ym.memc.inspect.560222747 Flags: 0 Exp: 172800 Data Length: 16, topWop=null, toWrite=0, interested=1}
shard-cdr-7c6fb9d89d-jlhcd tomcat 2020-06-29 15:26:07.758 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=shard1mem1.dev.youmail.com/10.21.40.55:11211, #Rops=1, #Wops=0, #iq=0, topRop=Cmd: set Key: ym.memc.inspect.560222747 Flags: 0 Exp: 172800 Data Length: 16, topWop=null, toWrite=0, interested=1}, attempt 0.
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 6414 392.175887068 10.21.40.55 → 10.42.1.35 TCP 56 11211 → 58108 [RST] Seq=2 Win=0 Len=0
shard-cdr-7c6fb9d89d-jlhcd tomcat 2020-06-29 15:26:09.762 INFO net.spy.memcached.MemcachedConnection: Reconnecting {QA sa=shard1mem1.dev.youmail.com/10.21.40.55:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 6462 394.180915041 10.21.40.55 → 10.42.1.35 TCP 76 11211 → 36332 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=222538927 TSecr=2296508457 WS=128
shard-cdr-7c6fb9d89d-jlhcd linkerd-debug 6464 397.184152372 10.21.40.55 → 10.42.1.35 TCP 68 11211 → 36332 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=222541931 TSecr=2296508457
linkerd check output
Environment
Possible solution
It looks like the problem started in edge-20.4.2, and the issue might be this: