envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.11k stars 4.82k forks

How does the TCP proxy behave when a listener updates? #30037

Closed id-id-id closed 1 year ago

id-id-id commented 1 year ago


Title: How does the TCP proxy behave during an LDS update?

Description: We use Istio 1.16.4 with sidecars enabled for our services. We changed the access log configuration, which triggered a listener update. Afterwards we found many entries in Envoy's access log with response code 0, most of them for Cassandra/MySQL/Redis/Kafka traffic. We also found that the listeners for the databases and Kafka were updated and the old listeners were draining. However, the Envoy docs say that the TCP proxy does not support graceful draining (the filter chain of the Kafka/database listeners in Envoy is tcp_proxy).

So we would like to know: what happens to TCP proxy connections when the listener is updated?


kyessenov commented 1 year ago

A listener update always causes draining, but draining is only graceful for protocols that support it (e.g. the HTTP/2 GOAWAY frame). A non-graceful drain eventually terminates the connection abruptly, which is why you see response code 0.
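The abrupt termination of a non-graceful drain can be reproduced at the plain-socket level: enabling `SO_LINGER` with a zero timeout makes `close()` send a TCP RST instead of a FIN, which the peer observes as a connection reset. This is a hypothetical Python sketch of the general mechanism, not Envoy code:

```python
import socket
import struct
import threading

def abrupt_close_demo():
    """Return True if the server side observed a connection reset (RST)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    result = {}

    def serve():
        conn, _ = server.accept()
        try:
            result["data"] = conn.recv(1024)  # raises on RST
        except ConnectionResetError:
            result["reset"] = True
        finally:
            conn.close()

    t = threading.Thread(target=serve)
    t.start()

    client = socket.create_connection(("127.0.0.1", port))
    # SO_LINGER with linger=on and timeout=0: close() sends RST, not FIN.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.pack("ii", 1, 0))
    client.close()

    t.join()
    server.close()
    return result.get("reset", False)

if __name__ == "__main__":
    print(abrupt_close_demo())  # typically True on Linux
```

A peer whose connection ends this way has no protocol-level chance to finish an in-flight request, which is consistent with the response-code-0 log entries.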

id-id-id commented 1 year ago

I used tcpdump to watch the traffic during the listener update and noticed a strange phenomenon. The following is the packet capture result:

```
No     Time            Source          Destination     Proto  Len  Info
69483  19:15:08.085752 service pod ip  kafka pod ip    TCP     66  49710 → 9092 [FIN, ACK] Seq=17753 Ack=14973 Win=2190 Len=0 TSval=3046944231 TSecr=3833115671
69999  19:15:08.129126 kafka pod ip    service pod ip  TCP     66  9092 → 49710 [ACK] Seq=14973 Ack=17754 Win=32768 Len=0 TSval=3833116135 TSecr=3046944231
70389  19:15:08.173839 kafka pod ip    service pod ip  TCP     74  9092 → 49710 [PSH, ACK] Seq=14973 Ack=17754 Win=32768 Len=8 TSval=3833116180 TSecr=3046944231 [TCP segment of a reassembled PDU]
70391  19:15:08.173840 kafka pod ip    service pod ip  TCP     80  9092 → 49710 [PSH, ACK] Seq=14981 Ack=17754 Win=32768 Len=14 TSval=3833116180 TSecr=3046944231 [TCP segment of a reassembled PDU]
70390  19:15:08.173856 service pod ip  kafka pod ip    TCP     54  49710 → 9092 [RST] Seq=17754 Win=0 Len=0
70392  19:15:08.173862 service pod ip  kafka pod ip    TCP     54  49710 → 9092 [RST] Seq=17754 Win=0 Len=0
70393  19:15:08.173874 kafka pod ip    service pod ip  Kafka  142  Kafka Fetch v11 Response
70394  19:15:08.173877 service pod ip  kafka pod ip    TCP     54  49710 → 9092 [RST] Seq=17754 Win=0 Len=0
70395  19:15:08.174307 kafka pod ip    service pod ip  TCP     66  9092 → 49710 [FIN, ACK] Seq=15071 Ack=17754 Win=32768 Len=0 TSval=3833116180 TSecr=3046944231
70396  19:15:08.174314 service pod ip  kafka pod ip    TCP     54  49710 → 9092 [RST] Seq=17754 Win=0 Len=0
```

My service sends a FIN to Kafka and receives an ACK back. When Kafka then sends data to my service, my service answers with RST. It looks like my service forcibly closed the connection, so the port is no longer listening. However, this does not follow TCP's normal connection-close handshake.
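The RST-after-FIN pattern is standard kernel behavior when a socket is *fully* closed rather than half-closed: `close()` releases the receive side too, so any data the peer sends afterwards is answered with RST, while `shutdown(SHUT_WR)` sends a FIN but keeps the socket able to receive. A hedged Python sketch on loopback (my own illustration, not Envoy's code) contrasts the two:

```python
import socket
import time

def send_after_peer_close(half_close):
    """Peer closes (fully or half); we then keep sending and report whether
    the send side was broken by a reset."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = server.accept()

    if half_close:
        client.shutdown(socket.SHUT_WR)  # FIN sent, socket can still receive
    else:
        client.close()                   # full close: later data triggers RST
    time.sleep(0.1)

    failed = False
    try:
        for _ in range(5):
            conn.sendall(b"late data")   # the first send may still succeed;
            time.sleep(0.1)              # the returning RST breaks later ones
    except (ConnectionResetError, BrokenPipeError):
        failed = True

    if half_close:
        client.close()
    conn.close()
    server.close()
    return failed

if __name__ == "__main__":
    print(send_after_peer_close(half_close=False))  # full close: send breaks
    print(send_after_peer_close(half_close=True))   # half close: send survives
```

Under this model, the capture above suggests the proxy fully closed its socket to Kafka rather than half-closing it and waiting for Kafka's FIN.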

Is this by design when Envoy updates a listener whose network filter is tcp_proxy?

id-id-id commented 1 year ago

Recently I have been reading the Envoy code related to draining, and I have some questions.

1. I see that when draining a listener, Envoy stops accepting new connections. For a TCP listener, what happens to the old connections? Can they still be used normally?

2. After stopListener, Envoy waits for the drain time before removing the listener. While waiting, both the client and the server keep exchanging data over the existing TCP connections. The connections are only broken once removeListener runs (the time the TCP connections started to close matched the time the log showed removeListener starting). What is the whole draining process for a TCP listener? (stopListener calls back into maybeCloseSocketsForListener, which appears to close the sockets of the listener being drained, but according to my tcpdump results the connections stay in use until the listener is removed.)
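My reading of the sequence in the two questions above can be sketched as a toy timeline. This is a simplified model under the assumption that stopListener/maybeCloseSocketsForListener close only the *listening* socket while established connections survive until the drain deadline; the names simply mirror the C++ functions mentioned above, and this is not Envoy code:

```python
import time

def drain_listener(listener, drain_time_s,
                   clock=time.monotonic, sleep=time.sleep):
    """Toy model of the drain sequence discussed above:
    1. stop accepting new connections (only the listening socket closes);
    2. established connections keep flowing during the drain window;
    3. at the deadline, removal closes the remaining connections, which is
       when peers would observe the abrupt termination."""
    events = []

    listener["accepting"] = False            # step 1: stopListener
    events.append("stop_listener")

    deadline = clock() + drain_time_s        # step 2: drain window
    while clock() < deadline:
        sleep(drain_time_s / 10)             # data still exchanged here

    for conn in listener["connections"]:     # step 3: removeListener
        conn["open"] = False
    events.append("remove_listener")
    return events

if __name__ == "__main__":
    listener = {"accepting": True,
                "connections": [{"open": True}, {"open": True}]}
    print(drain_listener(listener, drain_time_s=0.05))
    print(listener)
```

If this model is right, the tcpdump observation (connections alive until removal, then reset) is the expected outcome for a filter chain that cannot drain gracefully.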

@kyessenov

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.