ClusterLabs / resource-agents

Combined repository of OCF agents from the RHCS and Linux-HA projects
GNU General Public License v2.0

IPv6addr: expect ping/pong delay #1858

Closed rfuchs closed 1 year ago

rfuchs commented 1 year ago

Under heavy network load, the echo response to an echo request that was just sent may not be immediately available for reading, so recvmsg(MSG_DONTWAIT) fails with EAGAIN. This leads to occasional false-positive "not running" events.

This change wraps recvmsg() in a poll() loop with a short timeout (10 ms) and retries reading the echo response up to 3 times, in case poll() is interrupted by some other event (e.g. fails with EINTR).
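
For readers following along, the approach described above could look roughly like the sketch below. This is not the code from this pull request: the helper name `recv_reply` and the two macros are hypothetical, and the sketch assumes that a poll() timeout is retried the same way as an EINTR interruption. Only the 10 ms timeout and the limit of 3 attempts come from the description.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define REPLY_TIMEOUT_MS 10 /* per-attempt poll() timeout (from the description) */
#define REPLY_MAX_TRIES  3  /* maximum number of attempts (from the description) */

/*
 * Hypothetical helper: wait briefly for data on fd, then read it without
 * blocking. Returns the number of bytes read, or -1 if nothing could be
 * read within REPLY_MAX_TRIES attempts or a hard error occurred.
 */
static ssize_t recv_reply(int fd, void *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (int attempt = 0; attempt < REPLY_MAX_TRIES; attempt++) {
        int ready = poll(&pfd, 1, REPLY_TIMEOUT_MS);
        if (ready < 0 && errno != EINTR)
            return -1;  /* hard poll() error */
        if (ready <= 0)
            continue;   /* timeout (assumption) or EINTR: try again */

        ssize_t n = recvmsg(fd, &msg, MSG_DONTWAIT);
        if (n >= 0)
            return n;
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;  /* hard recvmsg() error */
        /* Spurious wakeup or interruption: fall through and retry. */
    }
    return -1;          /* no reply within the retry budget */
}
```

With these values the worst-case added delay per monitor call is bounded at roughly 30 ms (3 attempts of 10 ms each), which is the trade-off for not reporting "not running" when the reply is merely late.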

Closes #1855

knet-ci-bot commented 1 year ago

Can one of the admins verify this patch?

oalbrigt commented 1 year ago

ok to test

rfuchs commented 1 year ago

Any opinions on this? FTR we're actively hitting this bug in some of our deployments.

oalbrigt commented 1 year ago

Thanks.