metaborg / nabl

Spoofax' Name Binding Language
Apache License 2.0
7 stars 12 forks source link

Don't detect deadlocks in nodes depending on knots. #75

Closed AZWN closed 2 years ago

AZWN commented 3 years ago

Short description Detect deadlock on nodes that are part of the deadlocked SCC in the dependency graph.

Problem description. The concurrent solver uses Chandy-Misra-Haas (CMH) communication deadlock detection algorithm [1] to detect deadlocks. Communication deadlocks occur when there are knots in the wait-for graph.

However, the current implementation allows deadlocks to be detected on nodes that are not part of the knot, but rather only have dependencies on nodes in the knot. For example, consider the following wait-for graph:

1 -> [1, 2, 3]
2 -> [2, 3]
3 -> [2, 3]

(which should be read as: node 1 waits for node 1, 2 and 3, etc...) In this situation, the actual deadlock occurs in the knot consisting of nodes {2, 3}. However, when node 1 is the last node to become idle, this node will actually detect a deadlock consisting of {1, 2, 3} (as demonstrated in these tests).

From a pragmatic point of view, this is not too much of a problem, because either node 2 or 3 will detect a deadlock of {2, 3}, earlier than node 1. Hence, in case of release/restart, the correct choice will still be made. However, this relies on the fact that messages are processed in order of arrival, while we formally allow reordering of messages when they originate from different units.

Describe the solution you'd like In the above example, only the deadlock of {2, 3} should lead to deadlock resolution being executed. This can be done in three different ways:

  1. Using a deadlock detection algorithm that only detects SCCs by construction.

  2. By post-processing when deadlock is detected. This post-processing (on node n) only has to check if all other nodes can reach the detecting node via the wait-for graph (which is equivalent to checking whether n can reach all other nodes in the reverse WF-graph). The opposite is already guaranteed by the deadlock detection itself.

  3. By using a distributed SCC algorithm, to detect the smallest set of deadlocked units.

References [1] Chandy, K. Mani, Jayadev Misra, and Laura M. Haas. “Distributed Deadlock Detection.” ACM Transactions on Computer Systems 1, no. 2 (May 1, 1983): 144–156. https://doi.org/10.1145/357360.357365.