This PR fixes two bugs:
socket.shutdown or socket.close can raise OSError if the underlying file descriptor is no longer valid. A typical scenario is when the remote party of a TCP connection closes the connection and the kernel cleans up the descriptor before the local party can call TCPTransport.close. This error is now caught, because the close has effectively done what it was meant to do: make sure the socket is closed.
TCPTransport.close may be called multiple times, typically when both sides of a TCP connection try to close at the same time. In this case, both _base_receive and some user code call close, in either order, and the second call raises an AttributeError because the first call has already set _sock to None. close now guards against this by taking a lock and checking that _sock is not None before using it.
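The two guards described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: GuardedTransport is a hypothetical stand-in for TCPTransport, and the _lock attribute name is an assumption. A socket pair stands in for a real TCP connection.

```python
import socket
import threading


class GuardedTransport:
    """Hypothetical stand-in for TCPTransport; not eRPC's actual code."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()  # assumed attribute name

    def close(self):
        # Take ownership of the socket under the lock, so that only one
        # caller ever proceeds to shut it down.
        with self._lock:
            sock, self._sock = self._sock, None
        if sock is None:
            return  # an earlier call already closed the socket
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # descriptor already invalid or disconnected: nothing left to do
        try:
            sock.close()
        except OSError:
            pass  # same reasoning: the socket is closed either way


local, remote = socket.socketpair()
transport = GuardedTransport(local)
remote.close()      # the remote party closes first
transport.close()   # no OSError reaches the caller
transport.close()   # second call is a no-op, no AttributeError
```

Swapping _sock for None while holding the lock keeps the critical section short and means the shutdown itself runs outside the lock.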
To Reproduce
To reproduce the OSError: from the remote node, close a TCP connection created by a TCPTransport several times, then call close on the TCPTransport. We are not sure how deterministic this is on other systems, but on our system it almost always triggered the problem.
To reproduce the AttributeError: simply call close multiple times. For a more realistic scenario, have the remote party and the local party attempt to close the TCPTransport at roughly the same time.
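The AttributeError case can be illustrated without eRPC at all. The sketch below uses a hypothetical UnguardedTransport class that mimics the close behaviour described above (it is an assumption about the pre-fix logic, not a copy of it) and calls close twice on a socket pair. The OSError case depends on timing and is not reproduced here.

```python
import socket


class UnguardedTransport:
    """Hypothetical class mimicking the pre-fix close behaviour."""

    def __init__(self, sock):
        self._sock = sock

    def close(self):
        # No lock and no None check: a second call dereferences None.
        self._sock.shutdown(socket.SHUT_RDWR)
        self._sock.close()
        self._sock = None


local, remote = socket.socketpair()
transport = UnguardedTransport(local)
transport.close()  # first call succeeds

second_close_error = None
try:
    transport.close()  # second call: _sock is already None
except AttributeError as exc:
    second_close_error = exc

remote.close()
print(type(second_close_error).__name__)
```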
Expected behavior
These errors are not propagated to the user.
Desktop (please complete the following information):
OS: Docker image debian:bookworm-slim with eRPC installed.
eRPC Version: v1.13.0
Steps you didn't forget to do
[x] I checked that no other PR already solves this issue.
[x] I read Contribution details and did appropriate actions.
[x] PR code is tested.
[x] PR code is formatted.
[x] Allow edits from maintainers pull request option is set (recommended).