Closed by Powersource 12 months ago
Gonna write some more notes to try to figure out exactly how I should go about this.
Goals:
Questions:
- Do we want to help others with their failed exclusions? Maybe yeah? Since the group/epoch is a common resource and one person crashing might break the group for all of us.
Tough question, but I'm also leaning towards anyone helping proceed with the exclusion, simply because that dangling exclude-member msg may be confusing.
- When do we fix a broken state? When calling the function again? If other people should be able to fix it too, then they'll want a listener. Do we want to use that listener for ourselves as well? A listener would only check again on restart, is that fine?
Being eager about it shouldn't be a problem, because of the "same membership" forked epoch resolution. So if admin A tried to exclude Oscar but stopped partway through, then admins B and C can both proceed to do it. They will create two forked epochs, but those epochs will have the same membership set, and then the tie-breaking rule applies.
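A rough sketch of why two admins finishing the same exclusion should be harmless. Everything here is made up for illustration: the `{ id, members }` epoch shape, the function names, and especially the tie-break itself (smallest id wins), which is only a stand-in for whatever rule the spec actually defines.

```javascript
// Hypothetical epoch shape: { id: string, members: string[] }

// True when both epochs contain exactly the same members, in any order.
function sameMembership(a, b) {
  const setA = new Set(a.members)
  const setB = new Set(b.members)
  if (setA.size !== setB.size) return false
  for (const m of setA) if (!setB.has(m)) return false
  return true
}

// ASSUMPTION: the tie-break picks the lexicographically smallest id.
// Returns null when the forks differ in membership, since that case
// needs a different kind of resolution than a simple tie-break.
function pickPreferredEpoch(forks) {
  if (forks.length === 0) return null
  const first = forks[0]
  if (!forks.every((e) => sameMembership(first, e))) return null
  return forks.reduce((best, e) => (e.id < best.id ? e : best))
}

// Admins B and C both finish A's exclusion of Oscar.
const byB = { id: 'epoch-b', members: ['alice', 'bob', 'carol'] }
const byC = { id: 'epoch-c', members: ['carol', 'alice', 'bob'] }
console.log(pickPreferredEpoch([byC, byB]).id) // → 'epoch-b'
```

Since the pick is deterministic and doesn't depend on the order the forks arrive in, every peer ends up on the same epoch no matter which admin's messages it sees first.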
In terms of code, I don't know how to organize it.
> Tough question, but I'm also leaning towards anyone helping proceed with the exclusion, simply because that dangling exclude-member msg may be confusing.
Yeah, I think I basically ended up being agnostic about who caused the broken state.
> Being eager about it shouldn't be a problem, because of the "same membership" forked epoch resolution.
I think I was about to try the eager solution as well but ended up deciding against it, since most/all the recovery logic uses long-ish timeouts in it, which would make regular function usage way too slow.
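One possible way to keep the slow part out of the regular call path: let a listener do the timeout-y lookups, and then decide what's left to do with a cheap pure function. This is just a sketch under assumptions. The `state` fields, the step names and `missingMembers` are all hypothetical; `missingMembers` is only a stand-in and not the real lib/epoch `getMissingMembers`, whose signature I'm not assuming here.

```javascript
// Stand-in helper (hypothetical): members expected in the new epoch
// that have not been added to it yet.
function missingMembers(expected, added) {
  const have = new Set(added)
  return expected.filter((id) => !have.has(id))
}

// Given facts a listener already collected about one exclusion attempt,
// return which step still needs to be done. No I/O and no timeouts here.
function nextRecoveryStep(state) {
  // No group/exclude-member msg: nothing to recover, the user retries.
  if (!state.excludeMsgPosted) return { step: 'none' }
  // exclude-member exists but the new epoch was never initialised.
  if (!state.newEpochInitPosted) return { step: 'post-init' }
  // New epoch exists; check whether every remaining member was re-added.
  const missing = missingMembers(state.expectedMembers, state.addedMembers)
  if (missing.length > 0) return { step: 'post-add-member', missing }
  return { step: 'done' }
}

// Example: the admin crashed after re-adding only some members.
console.log(
  nextRecoveryStep({
    excludeMsgPosted: true,
    newEpochInitPosted: true,
    expectedMembers: ['alice', 'bob', 'carol'],
    addedMembers: ['alice', 'bob'],
  })
)
```

The decision doesn't care who posted the exclude-member msg, which fits with being agnostic about who caused the broken state.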
For exclusion we post 3 different messages:
1. A group/exclude-member message. We don't need to recover from this, if we crash on this step, the user can tell and they just have to try again.
2. A group/init to init the new epoch. Hopefully it's enough to look for a 1. msg. But hmm when should we search for that? If we call excludeMembers again with the exact same args? Should excludeMembers maybe just post exclude-member, and msgs 2. and 3. should be left to listeners?
3. group/add-member messages. The lib/epoch function getMissingMembers is probably very helpful here.
Todos: