Open mboes opened 9 years ago
From @edsko on November 18, 2012 11:40
http://nitoprograms.blogspot.co.uk/2009/05/detection-of-half-open-dropped.html might provide some useful insights.
From @hyperthunk on December 13, 2012 20:49
See http://rabbitmq.1065348.n5.nabble.com/Re-The-rabbitmq-server-stop-command-hangs-td23180.html for an example of this happening in another network protocol. I have some thoughts about this. Firstly, it can be solved at the OS level by tuning kernel params and/or switching on TCP keep-alive, but that only works for TCP.
Personally I think this should be handled at the NT level and that it should be configurable.
From @edsko on December 13, 2012 21:38
> Personally I think this should be handled at the NT level and that it should be configurable.
Agreed.
From @hyperthunk on December 14, 2012 0:28
We do this in RabbitMQ so I've some familiarity with the problem space. I'll take a look at submitting a patch, but things are rather busy at the moment so it might not materialise for a week or so.
From @hyperthunk on December 17, 2012 2:3
> so it might not materialise for a week or so.
Nah, it's going to be quite a bit longer than that before I even start thinking about this one. Still interested in picking it up though => assigning to myself unless someone else wants to come and steal it first. :)
From @edsko on September 24, 2012 12:38
When one CH process A monitors another B, it expects to be notified if the connection between them breaks, even when A never sends anything to B (but only receives messages from B). This means that it is not enough to rely on `send` to detect network problems. This can be solved at the CH level or at the NT level.

Copied from original issue: haskell-distributed/distributed-process#32
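
To make the receive-only case concrete, here is a minimal sketch of the decision logic a configurable heartbeat could use, at whichever level it ends up living. Every name in it (`HeartbeatConfig`, `ConnState`, `Action`, `tick`) is hypothetical and invented for illustration; none of it is part of the network-transport or distributed-process API, and the millisecond values are arbitrary. The idea is that a side which only receives still sends a probe after a period of outbound silence, and reports the connection as lost if nothing at all (neither regular messages nor probe replies) has arrived within the timeout.

```haskell
module Main where

-- Hypothetical sketch: these types and names are assumptions made for
-- illustration, not an existing network-transport interface.

-- | Configurable liveness policy (all times in milliseconds).
data HeartbeatConfig = HeartbeatConfig
  { hbInterval :: Int  -- ^ send a probe after this much outbound silence
  , hbTimeout  :: Int  -- ^ declare the peer lost after this much inbound silence
  } deriving (Show, Eq)

-- | What we remember about a single connection.
data ConnState = ConnState
  { lastSent     :: Int  -- ^ time of the last outbound traffic, probes included
  , lastReceived :: Int  -- ^ time of the last inbound traffic, probe replies included
  } deriving (Show, Eq)

-- | What the caller should do next for this connection.
data Action = Idle | SendProbe | ReportLost
  deriving (Show, Eq)

-- | Decide what to do at time 'now'. Inbound silence is checked first, so a
-- half-open connection is reported even if we never call send ourselves.
tick :: HeartbeatConfig -> ConnState -> Int -> Action
tick cfg st now
  | now - lastReceived st >= hbTimeout cfg  = ReportLost
  | now - lastSent st     >= hbInterval cfg = SendProbe
  | otherwise                               = Idle

main :: IO ()
main = do
  let cfg = HeartbeatConfig { hbInterval = 1000, hbTimeout = 3000 }
      st  = ConnState { lastSent = 0, lastReceived = 0 }
  print (tick cfg st 500)   -- Idle: nothing is overdue yet
  print (tick cfg st 1500)  -- SendProbe: we have been silent for an interval
  print (tick cfg st 3500)  -- ReportLost: nothing heard within the timeout
```

In this sketch the caller would update `lastReceived` on any inbound traffic and `lastSent` on any outbound traffic, then call `tick` from a timer. Keeping the policy in a plain record is one way to get the "configurable" part; whether the probe itself belongs in the transport or in the layer above is exactly the open question here.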