Closed rolandd closed 11 months ago
Hi,
Thanks for your contribution!. Can you place share a case where this race is likely to happen?
I don't have a specific case to reproduce this, but we have an automated smoke test that runs ib_send_lat
quite frequently and this bug is making the test flaky. I'm not sure if it's test infrastructure or what, but at least for us, one side of the test can run ahead enough that the socket is closed before the other side sends "done". Since there is no matching read()
or anything else that consumes the "done" message, I think it makes sense to delete this very old and buggy code.
Thanks for clarification, merged!
When rdmacm is not used, ctx_close_connection() does a handshake and then sends "done" over the socket before closing it. This is racy and can lead to a spurious error:
Fix this simply by deleting the write(). The ctx_hand_shake() already ensures the two sides are in sync and can proceed to the close().