confluentinc / librdkafka

The Apache Kafka C/C++ library

Mock cluster's main thread does not retry when `poll()` returns EINTR, resulting in shutdown #4727

Open terryburton opened 1 month ago

Description

External events, such as signals or actions taken by other threads, may routinely interrupt the mock cluster thread's poll() system call.

The mock cluster currently treats this interruption as an error and shuts down. It would be better to retry when poll() fails with errno set to EINTR.
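
As a rough illustration of the retry being suggested (a sketch only, not librdkafka's actual code; `poll_retry_eintr` and `now_ms` are hypothetical names), the loop below re-issues poll() when it fails with EINTR. It assumes a POSIX system with a monotonic clock, and it shortens the timeout by the time already spent waiting so that repeated interruptions cannot extend the wait indefinitely.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <time.h>

/* Current time in milliseconds from a monotonic clock, so that
 * wall-clock adjustments do not affect the remaining timeout. */
static long long now_ms(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper: behaves like poll(), but retries when the call
 * is interrupted (EINTR) instead of reporting it as a failure.
 * A negative timeout_ms means "wait forever", as with poll(). */
int poll_retry_eintr(struct pollfd *fds, nfds_t nfds, int timeout_ms) {
        long long deadline = timeout_ms >= 0 ? now_ms() + timeout_ms : 0;
        int remaining = timeout_ms;

        for (;;) {
                int r = poll(fds, nfds, remaining);

                /* Success, timeout, or a genuine error: hand it back. */
                if (r >= 0 || errno != EINTR)
                        return r;

                /* Interrupted: recompute how long is left and retry. */
                if (timeout_ms >= 0) {
                        long long left = deadline - now_ms();
                        remaining = left > 0 ? (int)left : 0;
                }
        }
}
```

With a wrapper of this shape, only errors other than EINTR would reach the caller and trigger the existing teardown path.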

There may be equivalent considerations for win32.

Context: The mock cluster has been initialised within an application that subsequently calls setresuid (which interrupts system calls).

Producer logs

Thu May 23 16:56:57 2024 : Info: [thrd:app]: Mock cluster enabled...
...
[ Mock cluster thread's poll() system call interrupted, returning EINTR. Treated as error, rather than retried. ]
Thu May 23 16:56:58 2024 : Error: [thrd:mock]: Mock cluster failed to poll 5 fds: -1: Interrupted system call
[ Results in teardown of the mock cluster. ]
...
Thu May 23 16:56:58 2024 : Error: [thrd:127.0.0.1:34819/bootstrap]: 127.0.0.1:34819/1: Connect to ipv4#127.0.0.1:34819 failed: Connection refused (after 0ms in state CONNECT)
Thu May 23 16:56:59 2024 : Error: [thrd:127.0.0.1:43461/bootstrap]: 127.0.0.1:43461/3: Connect to ipv4#127.0.0.1:43461 failed: Connection refused (after 0ms in state CONNECT)
Thu May 23 16:56:59 2024 : Error: [thrd:127.0.0.1:43461/bootstrap]: 3/3 brokers are down
...