Closed bitwalker closed 6 years ago
On it
It works! But with a wrinkle. The node connection successfully syncs the previously registered name.
But I get an unrecognized cast:
iex(aa@plasma)1> Node.connect :bb@plasma
true
iex(aa@plasma)2>
18:25:29.449 [info] [swarm on aa@plasma] [tracker:ensure_swarm_started_on_remote_node] nodeup bb@plasma
18:25:29.449 [info] [swarm on aa@plasma] [tracker:cluster_wait] joining cluster..
18:25:29.449 [info] [swarm on aa@plasma] [tracker:cluster_wait] found connected nodes: [:bb@plasma]
18:25:29.449 [info] [swarm on aa@plasma] [tracker:cluster_wait] selected sync node: bb@plasma
18:25:29.470 [info] [swarm on aa@plasma] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (13)..
18:25:29.470 [info] [swarm on aa@plasma] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (9)..
18:25:29.470 [info] [swarm on aa@plasma] [tracker:syncing] we won the die roll (9 vs 2), sending registry..
18:25:29.471 [info] [swarm on aa@plasma] [tracker:awaiting_sync_ack] received sync acknowledgement from bb@plasma
18:25:29.471 [info] [swarm on aa@plasma] [tracker:resolve_pending_sync_requests] pending sync requests cleared
18:25:29.477 [warn] [swarm on aa@plasma] [tracker:handle_cast] unrecognized cast: {:sync_end_tiebreaker, #PID<18024.226.0>, 13, 7}
This message can safely be ignored for now: we act on the first die roll that breaks the tie, and the cast goes unhandled because the tracker has already transitioned out of the syncing state. However, we should probably choose one node or the other deterministically during tiebreaking, so that we only roll once and avoid that second roll.
I'll merge this for now and make a note to address that in a follow-up PR.
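To illustrate the deterministic option, here is a minimal sketch, not Swarm's actual code. It assumes both nodes know each other's names and picks the winner by Erlang term ordering, so each side reaches the same answer without rolling at all. The module and function names are hypothetical.

```elixir
defmodule TiebreakSketch do
  # Hypothetical helper for illustration only; not part of Swarm's API.
  # Atoms are ordered by their text under Erlang term ordering, so both
  # nodes compute the same winner independently, with no random roll.
  @spec sync_winner(node(), node()) :: node()
  def sync_winner(a, b) when is_atom(a) and is_atom(b), do: min(a, b)

  # True when the local node (passed explicitly here) should send its registry.
  @spec send_registry?(node(), node()) :: boolean()
  def send_registry?(local, remote), do: sync_winner(local, remote) == local
end

# Both sides agree regardless of argument order.
TiebreakSketch.sync_winner(:aa@plasma, :bb@plasma)
TiebreakSketch.sync_winner(:bb@plasma, :aa@plasma)
```

Because the result depends only on the two node names, a late or duplicated tiebreak message could not change the outcome.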
@pragdave When you get a chance, could you send me the log from bb@plasma from the above test you ran? I'd like to trace back where that extra die roll is being triggered.
[image: Inline image 1]
@pragdave For some reason the image isn't showing for me :(
When joining a cluster after initial startup, the tracker will be in the tracking state and needs an opportunity to sync with the cluster when it joins. Previously, this synchronization only happened during startup or during anti-entropy passes. This commit ensures that if the initial join occurs during tracking, it is caught and handled like the cluster_wait -> cluster_join transition, so a sync happens right away.
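As a rough illustration of the idea, here is a simplified sketch rather than the tracker's real implementation. It models the decision as a pure function over a hypothetical state atom and list of known peers. The module name, the function signature, and the choice to leave later joins in tracking are all assumptions made for the example.

```elixir
defmodule JoinSketch do
  # Hypothetical, simplified model for illustration; not Swarm's tracker code.
  @type state :: :cluster_wait | :tracking | :syncing

  @spec handle_nodeup(state(), node(), [node()]) :: {state(), [node()]}
  # Waiting for a cluster at startup: a peer appearing triggers a sync.
  def handle_nodeup(:cluster_wait, node, nodes), do: {:syncing, [node | nodes]}

  # Tracking with no known peers: this is the initial join, so it is
  # treated like the startup case and syncs right away.
  def handle_nodeup(:tracking, node, []), do: {:syncing, [node]}

  # Tracking with existing peers: just record the node (assumed here;
  # the real tracker may do more at this point).
  def handle_nodeup(:tracking, node, nodes), do: {:tracking, [node | nodes]}
end

# A lone tracking node sees its first peer and moves to syncing.
JoinSketch.handle_nodeup(:tracking, :bb@plasma, [])
```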
See #62
@slashdotdash Can you review this and give me your thoughts? If you could do some testing, that would help, as I'm pretty swamped right now.
@pragdave If you can and want to, could you pull this branch (cluster_forms_while_tracking) and try to replicate #62? If this fixes the problem, I will merge it and push a new release.