uwiger / locks

A scalable, deadlock-resolving resource locker
Mozilla Public License 2.0

add locks_leader:maybe_become_leader/2 #8

Closed by uwiger 9 years ago

uwiger commented 9 years ago

Attempt to resolve Issue #7

  1. I haven't been able to reproduce the problem with hanging leader candidates, but it's possible that a leader sets its leader attribute to undefined in response to a netsplit and then waits for a have_all_locks message. If the agent has already sent that message, it won't send it again unless asked to. A #locks_info{} message should arrive in any case, though. So if we are in safe_loop(), we check whether we hold the lock while our leader attribute is undefined. If so, we call become_leader().
  2. The crash in locks_agent:get_locks() was caused by locks being deleted on nodedown while they may still have been recorded in the interesting list. This list is an optimization that keeps track of locks with a queue of more than one agent (the only ones we need to worry about for dependency analysis). When removing all locks on a certain node, or all instances of a certain object, the fix is to also delete them from interesting.
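
The two changes above can be modelled roughly as follows. This is a minimal sketch in Python rather than the project's Erlang, and it is not the code in this pull request: `AgentState`, `Lock`, `remove_node`, `remove_object` and `should_become_leader` are hypothetical names chosen for illustration, and the data layout (a dict of locks plus a set of "interesting" keys) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Lock:
    obj: str                                   # name of the locked object
    node: str                                  # node hosting this lock instance
    queue: list = field(default_factory=list)  # agents holding or waiting


@dataclass
class AgentState:
    locks: dict = field(default_factory=dict)      # (obj, node) -> Lock
    interesting: set = field(default_factory=set)  # keys of locks with >1 queued agent

    def add_lock(self, lock: Lock) -> None:
        key = (lock.obj, lock.node)
        self.locks[key] = lock
        # Only locks contended by more than one agent matter for
        # dependency analysis, so only those are tracked as interesting.
        if len(lock.queue) > 1:
            self.interesting.add(key)
        else:
            self.interesting.discard(key)

    def _remove(self, keys) -> None:
        # Materialize the keys first so the dict is not mutated while iterating.
        for key in list(keys):
            self.locks.pop(key, None)
            # Keep the optimization index in sync with the lock table.
            self.interesting.discard(key)

    def remove_node(self, node: str) -> None:
        """Drop every lock hosted on `node` (e.g. after a nodedown)."""
        self._remove(k for k in self.locks if k[1] == node)

    def remove_object(self, obj: str) -> None:
        """Drop every instance of `obj`, on all nodes."""
        self._remove(k for k in self.locks if k[0] == obj)


def should_become_leader(holds_lock: bool, leader: Optional[str]) -> bool:
    """Item 1: claim leadership when we hold the lock but know of no leader."""
    return holds_lock and leader is None
```

The point of routing both removals through one helper is that a key can never remain in `interesting` after its lock is gone, which is the inconsistency item 2 describes.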

uwiger commented 9 years ago

Sorry about the delay. The unit test, including the 5-node netsplit scenario for locks_leader, now passes.

Apparently, the 'majority_alive' lock requirement, which locks_leader used before, is unreliable, especially after a netsplit, when nodes have different views of how many nodes are alive. This transitional period can last a lot longer than I expected, and it leads to sustained competition for locks.
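
To illustrate why differing views of liveness are a problem, here is a toy model in Python. It is not the locks implementation: it assumes that 'majority_alive' means "locks held on more than half of the nodes this agent currently believes are alive", and the function name and node names are hypothetical.

```python
def majority_alive_satisfied(held: set, alive_view: set) -> bool:
    """True if locks are held on more than half of the nodes believed alive."""
    return len(held & alive_view) > len(alive_view) / 2


# A 5-node cluster split into two islands.
all_nodes = {"n1", "n2", "n3", "n4", "n5"}
side_a = {"n1", "n2", "n3"}
side_b = {"n4", "n5"}

# During the split, each side sees only its own nodes as alive and can
# lock all of them, so both sides consider the requirement satisfied.
a_ok = majority_alive_satisfied(held=side_a, alive_view=side_a)
b_ok = majority_alive_satisfied(held=side_b, alive_view=side_b)

# Once side B sees all five nodes as alive again, its two locks no longer suffice.
b_ok_after_heal = majority_alive_satisfied(held=side_b, alive_view=all_nodes)
```

Under these assumptions `a_ok` and `b_ok` are both true while `b_ok_after_heal` is false, so the answer depends on each node's current view. While those views are still converging, candidates can keep competing for the same locks.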

uwiger commented 9 years ago

If there are no objections, I plan to merge this into master soon.

edgurgel commented 9 years ago

Could this be used to update the uw-locks-branch on gproc?

edgurgel commented 9 years ago

I just noticed this branch is using HEAD of locks. Forget it :D

uwiger commented 9 years ago

> I just noticed this branch is using HEAD of locks.

For now it does. I plan to change that. :)