infinyon / fluvio

Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
https://www.fluvio.io/
Apache License 2.0
3.88k stars 491 forks

[Bug]: Fluvio client does not detect broken connection while consuming after machine suspend and resume #1514

Closed ajhunyady closed 1 year ago

ajhunyady commented 3 years ago

I started a client instance to listen on a topic/partition:

fluvio consume fluvio-stars -k -B

[2021-08-26 19:12:15] 600
[2021-08-26 19:14:17] 600
[2021-08-26 19:16:18] 600
[2021-08-26 19:18:20] 600
[2021-08-26 19:20:22] 600
[2021-08-26 19:22:24] 600
[2021-08-26 19:24:25] 600
[2021-08-26 19:26:27] 600
[2021-08-26 19:28:29] 600

After a while I closed the laptop. When I returned, the connection looked alive, but I soon realized that it was no longer receiving traffic.

Expected behavior

(the minimum)

(better)

tjtelan commented 3 years ago

Caused by #770

tjtelan commented 2 years ago

This should be approached as an automated test.

Start a consumer, simulate a network failure (e.g. delete the SPU pod and wait for the pod to redeploy), then verify that the client reconnects and resumes consuming.
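
The shape of such a test can be sketched in Python against an in-memory stand-in rather than a real cluster. Everything here is hypothetical and only illustrates the start, fail, reconnect, resume sequence: `FlakySource` plays the role of an SPU that drops the connection once, and `consume_with_reconnect` is an assumed client loop, not Fluvio's API.

```python
class FlakySource:
    """Hypothetical stand-in for an SPU that fails once at a given offset."""

    def __init__(self, records, fail_at):
        self.records = records
        self.fail_at = fail_at
        self.failed = False

    def stream(self, offset):
        # Yield (offset, record) pairs starting from the requested offset.
        for i in range(offset, len(self.records)):
            if i == self.fail_at and not self.failed:
                self.failed = True
                # Simulates the pod being deleted mid-stream.
                raise ConnectionError("simulated SPU restart")
            yield i, self.records[i]


def consume_with_reconnect(source, max_reconnects=3):
    """Consume every record, reconnecting from the last seen offset on failure."""
    received = []
    next_offset = 0
    reconnects = 0
    while True:
        try:
            for offset, record in source.stream(next_offset):
                received.append(record)
                next_offset = offset + 1
            return received, reconnects
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise
```

A test built this way would assert two things: that every record arrives exactly once, and that at least one reconnect actually happened. A real version would replace `FlakySource` with a running cluster and the simulated failure with the pod deletion described above.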

nacardin commented 2 years ago

I'll check the behavior on a Mac laptop with 0.9.13.

nacardin commented 2 years ago

On macOS, this behavior is still present as of 0.9.13. Closing the laptop for 1 minute is fine; the consumer is still alive afterwards. If I close the laptop for about 10 minutes, the consume command keeps running but does not show new records.

nacardin commented 2 years ago

Let's check netstat to see the state of the socket while this is happening.

morenol commented 2 years ago

After suspend and resume on Linux:

$ sudo netstat -ntpe4 | grep 900
tcp        0      0 192.168.1.105:57768     3.211.184.49:9003       ESTABLISHED 1000       146968     15068/fluvio
tcp        0      0 192.168.1.105:38622     3.211.184.49:9005       ESTABLISHED 1000       144072     15068/fluvio

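
Both sockets are still reported as ESTABLISHED with empty receive and send queues, so the operating system gives the client no signal that anything is wrong. One generic way for a client to notice this kind of silence is an idle timeout on reads. The Python sketch below is only an illustration of that idea, not Fluvio's code: `read_or_detect_stall` and `demo_silent_peer` are hypothetical names, and the "peer" is a local socket that accepts a connection and then sends nothing.

```python
import socket
import threading


def read_or_detect_stall(sock, idle_timeout):
    """Return received bytes, or None if nothing arrives within idle_timeout seconds."""
    # Ask the kernel to probe idle connections as well; probe timing is OS-configured.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.settimeout(idle_timeout)
    try:
        return sock.recv(4096)  # b"" would mean the peer closed cleanly
    except socket.timeout:
        return None  # connection looks open, but no data is flowing


def demo_silent_peer(idle_timeout=0.2):
    """Connect to a local server that never sends data and report what the read sees."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    accepted = []
    acceptor = threading.Thread(target=lambda: accepted.append(server.accept()[0]))
    acceptor.start()

    client = socket.create_connection(("127.0.0.1", port))
    acceptor.join()
    try:
        return read_or_detect_stall(client, idle_timeout)
    finally:
        client.close()
        accepted[0].close()
        server.close()
```

An idle timeout alone cannot tell a dead connection from a topic that simply has no new records, so a design along these lines would normally pair it with some periodic application-level message or trigger a reconnect attempt when the timeout fires.
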
sehz commented 2 years ago

Deferring.

github-actions[bot] commented 1 year ago

Stale issue message