omgnetwork / ewallet

eWallet Backend for the OmiseGO SDKs.
https://omisego.network/
Apache License 2.0

Swarm may cause deployment to fail during cluster joining #893

Open sirn opened 5 years ago

sirn commented 5 years ago

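Swarm, used by Quantum, may cause a rolling deployment to fail. In the log below, the new node joins the cluster and syncs its registry, but the call to Swarm.Tracker that tracks EWallet.Scheduler.TaskRegistry times out, so EWallet.Scheduler fails to start and the application exits.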

15:06:39.503 [info] [swarm on ewallet@10.40.4.141] [tracker:cluster_wait] joining cluster..
15:06:39.503 [info] [swarm on ewallet@10.40.4.141] [tracker:cluster_wait] found connected nodes: [:"ewallet@10.40.2.171", :"ewallet@10.40.1.161", :"ewallet@10.40.1.160"]
15:06:39.503 [info] [swarm on ewallet@10.40.4.141] [tracker:cluster_wait] selected sync node: ewallet@10.40.1.160
15:06:39.504 [info] [swarm on ewallet@10.40.4.141] [tracker:syncing] received registry from ewallet@10.40.1.160, merging..
15:06:39.504 [info] [swarm on ewallet@10.40.4.141] [tracker:syncing] local synchronization with ewallet@10.40.1.160 complete!
15:06:39.504 [info] [swarm on ewallet@10.40.4.141] [tracker:resolve_pending_sync_requests] pending sync requests cleared
15:06:49.661 [info] Application ewallet exited: EWallet.Application.start(:normal, []) returned an error: shutdown: failed to start child: EWallet.Scheduler
    ** (EXIT) exited in: :gen_statem.call(Swarm.Tracker, {:track, EWallet.Scheduler.TaskRegistry, %{mfa: {Quantum.TaskRegistry, :start_link, [%Quantum.TaskRegistry.StartOpts{name: EWallet.Scheduler.TaskRegistry}]}}}, 15000)
        ** (EXIT) time out
15:06:49.681 [error] GenStateMachine #PID<0.2599.0> terminating
** (exit) {:shutdown, :sender_died, :killed}
    (stdlib) gen_statem.erl:1158: :gen_statem.loop_event_result/9
    (ssl) tls_connection.erl:134: :tls_connection.init/1
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Kernel pid terminated (application_controller) ({application_start_failure,ewallet,{{shutdown,{failed_to_start_child,'Elixir.EWallet.Scheduler',{timeout,{gen_statem,call,['Elixir.Swarm.Tracker',{track
{"Kernel pid terminated",application_controller,"{application_start_failure,ewallet,{{shutdown,{failed_to_start_child,'Elixir.EWallet.Scheduler',{timeout,{gen_statem,call,['Elixir.Swarm.Tracker',{track,'Elixir.EWallet.Scheduler.TaskRegistry',#{mfa => {'Elixir.Quantum.TaskRegistry',start_link,[#{'__struct__' => 'Elixir.Quantum.TaskRegistry.StartOpts',name => 'Elixir.EWallet.Scheduler.TaskRegistry'}]}}},15000]}}}},{'Elixir.EWallet.Application',start,[normal,[]]}}}"}

Crash dump is being written to: erl_crash.dump...done

sirn commented 5 years ago

Guess: an instance from the newer eWallet version (a new deployment, or similar) tried to connect to an older instance and crashed.
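
Whatever the root cause, the mechanism in the log above is a synchronous call that does not get a reply within its timeout, which makes the caller exit. Below is a minimal sketch of that mechanism only, not of eWallet's or Swarm's actual code: SlowTracker and TrackDemo are hypothetical names, a plain GenServer stands in for the gen_statem-based Swarm.Tracker, and the sleep and timeout values are made up for illustration.

```elixir
defmodule SlowTracker do
  # Hypothetical stand-in for Swarm.Tracker: it replies, but slowly.
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok), do: {:ok, %{}}

  @impl true
  def handle_call({:track, _name}, _from, state) do
    # Simulate a tracker that is busy (for example, still syncing).
    Process.sleep(200)
    {:reply, :ok, state}
  end
end

defmodule TrackDemo do
  # Calls the tracker with a timeout in milliseconds. When no reply
  # arrives in time, GenServer.call/3 exits the caller with
  # {:timeout, _}; here the exit is caught and turned into an error tuple.
  def track(tracker, name, timeout) do
    try do
      GenServer.call(tracker, {:track, name}, timeout)
    catch
      :exit, {:timeout, _call} -> {:error, :timeout}
    end
  end
end

{:ok, tracker} = SlowTracker.start_link()

# 50 ms is shorter than the simulated 200 ms of work, so this times out.
IO.inspect(TrackDemo.track(tracker, :task_registry, 50))

# A generous timeout lets the same call succeed.
IO.inspect(TrackDemo.track(tracker, :task_registry, 1_000))
```

In the log, nothing catches the exit, so it propagates to the supervisor as failed_to_start_child and takes the whole application down during startup.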

sirn commented 5 years ago

Actually, it may be this one: https://github.com/quantum-elixir/quantum-core/issues/374