Open facundominguez opened 5 years ago
The problem is that the receiver has no way to tell whether a message is complete. It simply reads the number of bytes announced in the first chunk of the message (see recvWithLength). So I think the right solution is on the sender side: always send the whole message, and prevent sendMany from being interrupted by an outer non-IOException (such as ProcessLinkException, which happens quite often in our setup). This can be done by calling sendMany in a separate green thread. OS buffers usually become available quickly, so even if the parent thread dies in the meantime because of an exception, the child should not take long to finish.
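A minimal sketch of the "send in a separate green thread" idea, for illustration only. It is not the code from the proposed patch: sendDetached is a hypothetical helper name, and the send argument stands in for whatever action actually calls sendMany. It also ignores the locking needed to keep concurrent sends on the same connection in order.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar)
import Control.Exception (SomeException, throwIO, try)

-- | Hypothetical helper: run a send action in its own green thread.
-- An asynchronous exception thrown to the caller interrupts only the
-- wait below; the forked thread keeps running until the send finishes.
sendDetached :: IO a -> IO a
sendDetached send = do
  done <- newEmptyMVar
  -- Nobody else holds the child's ThreadId, so nothing can throwTo it.
  _ <- forkIO (try send >>= putMVar done)
  -- readMVar is interruptible: the caller may die here, the child does not.
  result <- readMVar done
  case result of
    Left e  -> throwIO (e :: SomeException)  -- re-raise failures of the send itself
    Right x -> return x
```

With this shape, a caller that is killed by something like ProcessLinkException stops waiting, but the bytes already handed to the child are still written in full. Exceptions raised by the send itself (for example an IOException) are still reported to the caller if it is alive.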
See the proposed patch at https://github.com/haskell-distributed/network-transport-tcp/pull/86.
It seems that interrupting send calls of the TCP transport can sometimes cause the receiver side to append unrelated messages to an incomplete message and deliver the result to upper layers.
Either interrupted messages should become impossible, or the receiver should be stopped from delivering these messages upwards.
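To illustrate the failure mode, here is a simplified model of length-prefixed framing. It is an assumption for illustration, not the library's actual wire format: frame and unframe are hypothetical names, and the 4-byte big-endian length prefix is chosen only to keep the example small.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32)

-- | Encode a length as 4 big-endian bytes (assumed format for this sketch).
encodeLen :: Word32 -> BS.ByteString
encodeLen n = BS.pack [fromIntegral (n `shiftR` s) | s <- [24, 16, 8, 0]]

-- | Decode a big-endian length prefix.
decodeLen :: BS.ByteString -> Int
decodeLen = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Prefix a payload with its length.
frame :: BS.ByteString -> BS.ByteString
frame payload =
  BS.append (encodeLen (fromIntegral (BS.length payload))) payload

-- | Split a byte stream into payloads, trusting every length prefix.
-- Like the receiver described above, it cannot tell a complete message
-- from a truncated one.
unframe :: BS.ByteString -> [BS.ByteString]
unframe stream
  | BS.length stream < 4 = []
  | otherwise =
      let (hdr, rest)      = BS.splitAt 4 stream
          (payload, rest') = BS.splitAt (decodeLen hdr) rest
      in payload : unframe rest'
```

If the sender is interrupted after writing only the first 9 bytes of frame "hello world" and another thread then writes frame "bye", unframe returns a single 11-byte payload: "hello" followed by the length prefix and the first two bytes of the second message. The stray trailing byte is left over and further corrupts whatever follows.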
See the discussion in https://github.com/haskell-distributed/network-transport-tcp/commit/9ec9c1af4143bb4de952c6678ac46afe782cf740#r33404578
cc @andriytk