haskell-distributed / distributed-process

Cloud Haskell core libraries
http://haskell-distributed.github.io

Fix testBreakConnection and undo workaround #250

Open facundominguez opened 9 years ago

facundominguez commented 9 years ago

The bug:

1. A process PA on node NA monitors a process PB on node NB.
2. NA and NB are disconnected.
3. PA tries to send a message to another process, PB1, on NB.
4. A monitor notification about the death of PB1 arrives at PA.
5. No monitor notification about PB arrives at PA.

This is observed only in tests so far (MonitorNode, MonitorLiveNode, MonitorChannel from CH). These tests break connections with testBreakConnection, which doesn't deliver EventConnectionLost. If the transport delivered EventConnectionLost, then d-p would notify the death of all processes, I hope.
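To make the difference concrete, here is a toy model of the two behaviours, written with plain Haskell data types only. It is a sketch, not d-p code: `NodeId`, `ProcessId`, `Monitor`, `Notification`, `onSendFailure` and `onConnectionLost` are all hypothetical names invented for illustration, and the real node controller is considerably more involved. It assumes that a failed send reports only the destination process, and that handling EventConnectionLost would report every monitored process on the lost node.

```haskell
module Main where

-- Hypothetical, simplified identifiers; these are NOT the distributed-process types.
newtype NodeId = NodeId String
  deriving (Eq, Show)

data ProcessId = ProcessId { pidNode :: NodeId, pidName :: String }
  deriving (Eq, Show)

-- A monitor registration: the watcher wants to hear about the death of the target.
data Monitor = Monitor { watcher :: ProcessId, target :: ProcessId }
  deriving (Eq, Show)

-- A notification delivered to a process about another process that died.
data Notification = Notification { notifyTo :: ProcessId, died :: ProcessId }
  deriving (Eq, Show)

-- Assumed model of the observed behaviour: a failed send reports only the
-- destination of that send, so other processes on the same node go unreported.
onSendFailure :: ProcessId -> ProcessId -> [Notification]
onSendFailure sender dest = [Notification sender dest]

-- Assumed model of the hoped-for behaviour: once the transport reports the
-- connection to a node as lost, every monitored process on that node is reported.
onConnectionLost :: NodeId -> [Monitor] -> [Notification]
onConnectionLost lost monitors =
  [ Notification (watcher m) (target m)
  | m <- monitors
  , pidNode (target m) == lost
  ]

main :: IO ()
main = do
  let na       = NodeId "NA"
      nb       = NodeId "NB"
      pa       = ProcessId na "PA"
      pb       = ProcessId nb "PB"
      pb1      = ProcessId nb "PB1"
      monitors = [Monitor pa pb]
  -- Observed in the tests: PA hears about PB1 only.
  print (onSendFailure pa pb1)
  -- Hoped for after EventConnectionLost: PA hears about PB.
  print (onConnectionLost nb monitors)
```

In this model the notification about PB only appears in the second list, which is the gap the tests run into when the transport never delivers the event.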

Instead, d-p has a patch that works around the problem: haskell-distributed#246

mboes commented 9 years ago

"These tests break connections with testBreakConnection, which doesn't deliver EventConnectionLost"

Isn't this a bug specific to the network-transport backend being used during tests?

facundominguez commented 9 years ago

Yes, probably so.

mboes commented 9 years ago

Then move issue to e.g. n-t-tcp? Does n-t-inmemory suffer from the same problem?

facundominguez commented 9 years ago

testBreakConnection is not implemented by n-t-tcp; rather, it is implemented in d-p-tests [1].

Addressing this probably requires implementing breakConnection in n-t-tcp (now haskell-distributed/distributed-process#434). n-t-inmemory already implements it and AFAICS it does deliver EventConnectionLost.

[1] https://github.com/haskell-distributed/distributed-process/blob/24cc188d83b74cd3094deaf0b94d99473c38c7b1/distributed-process-tests/tests/runTCP.hs#L25