Open mgwidmann opened 5 years ago
Hi Matt,
Thanks for the bug report. There is something about the shutdown process that doesn't make 100% sense. I believe this is during a regular shutdown.
It would be very helpful if you could reproduce this in a test in Horde. Can I ask you to make a PR with that? Eventually I want to have some good tests using local_cluster (and later schism) to test these scenarios.
So local_cluster starts new nodes for the test and then shuts them down, which would make sense if this occurred during the shutdown phase. After the process crashes it is restarted and everything works again.
I can try to reproduce the issue in the example application, though I'm not sure what I'm doing differently to make this happen. Running mix test is the most reliable way to reproduce this error for me.
Here is the entire output from mix test. As you can see, the crash occurs several times, both while the machines are connecting and after they have connected. The sequence of events as I follow it goes like this:
✗ mix test
2019-08-14T13:52:27.834000 [info] Starting Horde.RegistryImpl with name MyApplication.Registry
2019-08-14T13:52:27.898000 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry1
2019-08-14T13:52:27.898000 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry2
2019-08-14T13:52:27.899000 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry3
2019-08-14T13:52:27.899000 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry4
2019-08-14T13:52:27.910000 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
2019-08-14T13:52:27.910000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
2019-08-14T13:52:27.911000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
2019-08-14T13:52:27.911000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor3
2019-08-14T13:52:27.911000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor4
2019-08-14T13:52:27.915000 [info] Starting up Elixir.MyApplication.Worker with 4 schedulers online
2019-08-14T13:52:28.790000 [error] GenServer MyApplication.DistributedSupervisor terminating
** (stop) exited in: GenServer.stop(MyApplication.DistributedSupervisor.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:938: GenServer.stop/3
(horde) lib/horde/supervisor_impl.ex:571: Horde.SupervisorImpl.shut_down_all_processes/1
(horde) lib/horde/supervisor_impl.ex:347: Horde.SupervisorImpl.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:crdt_update, [{:add, {:member, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"}}, 1}]}
State: %Horde.SupervisorImpl{distribution_strategy: Horde.UniformQuorumDistribution, members: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => 1, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"} => 1}, members_info: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}, status: :alive}, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"}, status: :uninitialized}}, name: MyApplication.DistributedSupervisor, name_to_supervisor_ref: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => #Reference<0.3609365875.1531183106.30378>}, process_pid_to_id: %{#PID<0.370.0> => 259476575823192166570427768043200980535}, processes_by_id: %{259476575823192166570427768043200980535 => {{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}, %{id: 259476575823192166570427768043200980535, restart: :permanent, shutdown: 10000, start: {MyApplication.ScannerResultWriter, :start_link, []}}, #PID<0.370.0>}}, processes_updated_at: 0, processes_updated_counter: 0, shutting_down: false, supervisor_options: [members: [{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}], name: MyApplication.DistributedSupervisor, root_name: MyApplication.DistributedSupervisor, init_module: MyApplication.DistributedSupervisor, id: MyApplication.DistributedSupervisor, strategy: :one_for_one, distribution_strategy: Horde.UniformQuorumDistribution, max_restarts: 1000, max_seconds: 1], supervisor_ref_to_name: %{#Reference<0.3609365875.1531183106.30378> => {MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}}, waiting_for_quorum: []}
2019-08-14T13:52:28.790000 [error] GenServer #PID<0.339.0> terminating
** (stop) exited in: GenServer.call(MyApplication.DistributedSupervisor, :horde_shutting_down, 5000)
** (EXIT) exited in: GenServer.stop(MyApplication.DistributedSupervisor.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:989: GenServer.call/3
(horde) lib/horde/signal_shutdown.ex:21: anonymous fn/1 in Horde.SignalShutdown.terminate/2
(elixir) lib/enum.ex:769: Enum."-each/2-lists^foreach/1-0-"/2
(elixir) lib/enum.ex:769: Enum.each/2
(stdlib) gen_server.erl:673: :gen_server.try_terminate/3
(stdlib) gen_server.erl:858: :gen_server.terminate/10
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.334.0>, :shutdown}
State: [MyApplication.DistributedSupervisor.GracefulShutdownManager, MyApplication.DistributedSupervisor]
2019-08-14T13:52:28.836000 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:31.472 [info] Starting Horde.RegistryImpl with name MyApplication.Registry
09:52:31.472 [info] Starting Horde.RegistryImpl with name MyApplication.Registry
09:52:31.474 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry1
09:52:31.474 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry1
09:52:31.474 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry2
09:52:31.474 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry2
09:52:31.475 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry3
09:52:31.475 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry3
09:52:31.475 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry4
09:52:31.475 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry4
09:52:31.538 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:31.538 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
09:52:31.537 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:31.538 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
09:52:31.539 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
09:52:31.539 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
09:52:31.539 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor3
09:52:31.540 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor3
09:52:31.540 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor4
09:52:31.541 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor4
09:52:31.591 [info] Starting up Elixir.MyApplication.Worker with 4 schedulers online
09:52:31.591 [info] Starting up Elixir.MyApplication.Worker with 4 schedulers online
warning: redefining module MyApplication.DistributedTest (current version defined in memory)
test/my_application/distributed_test.exs:1
Test nodes that are now online: [:"test-my-application1@127.0.0.1", :"test-my-application2@127.0.0.1"]
.
2019-08-14T13:52:31.997000 [error] GenServer MyApplication.EntitySupervisor1 terminating
** (stop) exited in: GenServer.stop(MyApplication.EntitySupervisor1.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:938: GenServer.stop/3
(horde) lib/horde/supervisor_impl.ex:571: Horde.SupervisorImpl.shut_down_all_processes/1
(horde) lib/horde/supervisor_impl.ex:347: Horde.SupervisorImpl.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:crdt_update, [{:add, {:member_node_info, {MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}}, %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}, status: :shutting_down}}]}
State: %Horde.SupervisorImpl{distribution_strategy: Horde.UniformQuorumDistribution, members: %{{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"} => 1, {MyApplication.EntitySupervisor1, :"test-my-application2@127.0.0.1"} => 1}, members_info: %{{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}, status: :alive}, {MyApplication.EntitySupervisor1, :"test-my-application2@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor1, :"test-my-application2@127.0.0.1"}, status: :dead}}, name: MyApplication.EntitySupervisor1, name_to_supervisor_ref: %{{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"} => #Reference<0.3609365875.1531183106.30390>}, process_pid_to_id: %{#PID<0.628.0> => 314042693166365608091164111263343981781, #PID<20246.738.0> => 221419068907050945833165595071189253006, #PID<20100.828.0> => 314042693166365608091164111263343981781}, processes_by_id: %{221419068907050945833165595071189253006 => {{MyApplication.EntitySupervisor1, :"test-my-application2@127.0.0.1"}, %{id: 221419068907050945833165595071189253006, restart: :transient, shutdown: 10000, start: {MyApplication.IpScanner, :start_link, ["5.6.7.8"]}}, #PID<20246.738.0>}, 314042693166365608091164111263343981781 => {{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}, %{id: 314042693166365608091164111263343981781, restart: :transient, shutdown: 10000, start: {MyApplication.IpScanner, :start_link, ["1.2.3.4"]}}, #PID<0.628.0>}}, processes_updated_at: 0, processes_updated_counter: 0, shutting_down: true, supervisor_options: [members: [{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}], name: MyApplication.EntitySupervisor1, root_name: MyApplication.EntitySupervisor1, init_module: MyApplication.DistributedSupervisor, id: :ip_supervisor_1, strategy: :one_for_one, distribution_strategy: Horde.UniformQuorumDistribution, max_restarts: 1000, max_seconds: 1], supervisor_ref_to_name: 
%{#Reference<0.3609365875.1531183106.30390> => {MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}}, waiting_for_quorum: []}
2019-08-14T13:52:31.997000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
2019-08-14T13:52:31.998000 [error] GenServer MyApplication.Cluster.NodeListener terminating
** (stop) exited in: GenServer.call(MyApplication.EntitySupervisor1, {:set_members, [{MyApplication.EntitySupervisor1, :"manager@127.0.0.1"}]}, 5000)
** (EXIT) exited in: GenServer.stop(MyApplication.EntitySupervisor1.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:989: GenServer.call/3
(my_application) lib/my_application/cluster/node_listener.ex:30: anonymous fn/2 in MyApplication.Cluster.NodeListener.set_members/1
(elixir) lib/enum.ex:1940: Enum."-reduce/3-lists^foldl/2-0-"/3
(my_application) lib/my_application/cluster/node_listener.ex:24: MyApplication.Cluster.NodeListener.set_members/1
(my_application) lib/my_application/cluster/node_listener.ex:19: MyApplication.Cluster.NodeListener.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:nodedown, :"test-my-application2@127.0.0.1", [node_type: :visible]}
State: [MyApplication.DistributedSupervisor, MyApplication.Registry, MyApplication.EntitySupervisor1, MyApplication.EntitySupervisor2, MyApplication.EntitySupervisor3, MyApplication.EntitySupervisor4, MyApplication.EntityRegistry1, MyApplication.EntityRegistry2, MyApplication.EntityRegistry3, MyApplication.EntityRegistry4]
2019-08-14T13:52:32.659000 [error] GenServer MyApplication.DistributedSupervisor terminating
** (stop) exited in: GenServer.stop(MyApplication.DistributedSupervisor.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:938: GenServer.stop/3
(horde) lib/horde/supervisor_impl.ex:571: Horde.SupervisorImpl.shut_down_all_processes/1
(horde) lib/horde/supervisor_impl.ex:347: Horde.SupervisorImpl.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:crdt_update, [{:add, {:member, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"}}, 1}]}
State: %Horde.SupervisorImpl{distribution_strategy: Horde.UniformQuorumDistribution, members: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => 1, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"} => 1}, members_info: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}, status: :alive}, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"}, status: :uninitialized}}, name: MyApplication.DistributedSupervisor, name_to_supervisor_ref: %{{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"} => #Reference<0.3609365875.1531183105.27253>}, process_pid_to_id: %{#PID<0.638.0> => 224928341895112729021255116702455286098, #PID<20100.747.0> => 224928341895112729021255116702455286098}, processes_by_id: %{224928341895112729021255116702455286098 => {{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}, %{id: 224928341895112729021255116702455286098, restart: :permanent, shutdown: 10000, start: {MyApplication.ScannerResultWriter, :start_link, []}}, #PID<0.638.0>}}, processes_updated_at: 0, processes_updated_counter: 0, shutting_down: false, supervisor_options: [members: [{MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}, {MyApplication.DistributedSupervisor, :"test-my-application1@127.0.0.1"}], name: MyApplication.DistributedSupervisor, root_name: MyApplication.DistributedSupervisor, init_module: MyApplication.DistributedSupervisor, id: MyApplication.DistributedSupervisor, strategy: :one_for_one, distribution_strategy: Horde.UniformQuorumDistribution, max_restarts: 1000, max_seconds: 1], supervisor_ref_to_name: %{#Reference<0.3609365875.1531183105.27253> => {MyApplication.DistributedSupervisor, :"manager@127.0.0.1"}}, waiting_for_quorum: []}
2019-08-14T13:52:32.660000 [error] GenServer #PID<0.429.0> terminating
** (stop) exited in: GenServer.call(MyApplication.DistributedSupervisor, :horde_shutting_down, 5000)
** (EXIT) exited in: GenServer.stop(MyApplication.DistributedSupervisor.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:989: GenServer.call/3
(horde) lib/horde/signal_shutdown.ex:21: anonymous fn/1 in Horde.SignalShutdown.terminate/2
(elixir) lib/enum.ex:769: Enum."-each/2-lists^foreach/1-0-"/2
(elixir) lib/enum.ex:769: Enum.each/2
(stdlib) gen_server.erl:673: :gen_server.try_terminate/3
(stdlib) gen_server.erl:858: :gen_server.terminate/10
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.334.0>, :shutdown}
State: [MyApplication.DistributedSupervisor.GracefulShutdownManager, MyApplication.DistributedSupervisor]
2019-08-14T13:52:32.709000 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:35.323 [info] Starting Horde.RegistryImpl with name MyApplication.Registry
09:52:35.323 [info] Starting Horde.RegistryImpl with name MyApplication.Registry
09:52:35.325 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry1
09:52:35.325 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry1
09:52:35.326 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry2
09:52:35.326 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry2
09:52:35.327 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry3
09:52:35.327 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry4
09:52:35.327 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry3
09:52:35.327 [info] Starting Horde.RegistryImpl with name MyApplication.EntityRegistry4
09:52:35.386 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:35.386 [info] Starting Horde.SupervisorImpl with name MyApplication.DistributedSupervisor
09:52:35.386 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
09:52:35.387 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
09:52:35.387 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor3
09:52:35.386 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor1
09:52:35.388 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor4
09:52:35.387 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
09:52:35.388 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor3
09:52:35.389 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor4
09:52:35.439 [info] Starting up Elixir.MyApplication.Worker with 4 schedulers online
09:52:35.439 [info] Starting up Elixir.MyApplication.Worker with 4 schedulers online
warning: redefining module MyApplication.DistributedTest (current version defined in memory)
test/my_application/distributed_test.exs:1
Test nodes that are now online: [:"test-my-application1@127.0.0.1", :"test-my-application2@127.0.0.1"]
.
2019-08-14T13:52:35.918000 [error] GenServer MyApplication.EntitySupervisor2 terminating
** (stop) exited in: GenServer.stop(MyApplication.EntitySupervisor2.ProcessesSupervisor, :normal, :infinity)
** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
(elixir) lib/gen_server.ex:938: GenServer.stop/3
(horde) lib/horde/supervisor_impl.ex:571: Horde.SupervisorImpl.shut_down_all_processes/1
(horde) lib/horde/supervisor_impl.ex:347: Horde.SupervisorImpl.handle_info/2
(stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:711: :gen_server.handle_msg/6
(stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:crdt_update, [{:add, {:member_node_info, {MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}}, %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}, status: :shutting_down}}]}
State: %Horde.SupervisorImpl{distribution_strategy: Horde.UniformQuorumDistribution, members: %{{MyApplication.EntitySupervisor2, :"manager@127.0.0.1"} => 1, {MyApplication.EntitySupervisor2, :"test-my-application1@127.0.0.1"} => 1, {MyApplication.EntitySupervisor2, :"test-my-application2@127.0.0.1"} => 1}, members_info: %{{MyApplication.EntitySupervisor2, :"manager@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}, status: :alive}, {MyApplication.EntitySupervisor2, :"test-my-application1@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor2, :"test-my-application1@127.0.0.1"}, status: :dead}, {MyApplication.EntitySupervisor2, :"test-my-application2@127.0.0.1"} => %Horde.Supervisor.Member{name: {MyApplication.EntitySupervisor2, :"test-my-application2@127.0.0.1"}, status: :dead}}, name: MyApplication.EntitySupervisor2, name_to_supervisor_ref: %{{MyApplication.EntitySupervisor2, :"manager@127.0.0.1"} => #Reference<0.3609365875.1531183106.30401>}, process_pid_to_id: %{#PID<20100.825.0> => 248425850269184959969685862485213231097, #PID<20100.829.0> => 128476685755369599155497243660738762471, #PID<0.866.0> => 248425850269184959969685862485213231097}, processes_by_id: %{128476685755369599155497243660738762471 => {{MyApplication.EntitySupervisor2, :"test-my-application1@127.0.0.1"}, %{id: 128476685755369599155497243660738762471, restart: :transient, shutdown: 10000, start: {MyApplication.IpScanner, :start_link, ["2.3.4.5"]}}, #PID<20100.829.0>}, 248425850269184959969685862485213231097 => {{MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}, %{id: 248425850269184959969685862485213231097, restart: :transient, shutdown: 10000, start: {MyApplication.IpScanner, :start_link, ["2.3.4.5"]}}, #PID<0.866.0>}}, processes_updated_at: 0, processes_updated_counter: 0, shutting_down: true, supervisor_options: [members: [{MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}], name: 
MyApplication.EntitySupervisor2, root_name: MyApplication.EntitySupervisor2, init_module: MyApplication.DistributedSupervisor, id: :ip_supervisor_2, strategy: :one_for_one, distribution_strategy: Horde.UniformQuorumDistribution, max_restarts: 1000, max_seconds: 1], supervisor_ref_to_name: %{#Reference<0.3609365875.1531183106.30401> => {MyApplication.EntitySupervisor2, :"manager@127.0.0.1"}}, waiting_for_quorum: []}
2019-08-14T13:52:35.918000 [info] Starting Horde.SupervisorImpl with name MyApplication.EntitySupervisor2
Finished in 7.9 seconds
2 tests, 0 failures
Randomized with seed 976897
Perhaps I'm doing something wrong, so please correct me if I am.
I have 5 different registries/supervisors to split the load of my application. One for "everything else" and 4 for one particular entity which I've split up by hashing the name of the worker and starting the workers under that supervisor (and registering the name in that registry).
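To make the split concrete, here is a minimal sketch of the routing described above. `ShardRouter` is a hypothetical module name, and `:erlang.phash2/2` is an assumption standing in for whichever hash is actually used; only the `MyApplication.EntitySupervisorN` and `MyApplication.EntityRegistryN` names come from the logs.

```elixir
defmodule ShardRouter do
  # Hypothetical helper: maps a worker name onto one of the four
  # entity supervisor/registry pairs. Not part of Horde.
  @shard_count 4

  # :erlang.phash2/2 returns an integer in 0..(@shard_count - 1),
  # so adding 1 gives a shard number in 1..4.
  def shard_for(worker_name) do
    :erlang.phash2(worker_name, @shard_count) + 1
  end

  # Builds e.g. MyApplication.EntitySupervisor3 for shard 3.
  def supervisor_for(worker_name) do
    Module.concat(MyApplication, "EntitySupervisor#{shard_for(worker_name)}")
  end

  # Builds e.g. MyApplication.EntityRegistry3 for shard 3.
  def registry_for(worker_name) do
    Module.concat(MyApplication, "EntityRegistry#{shard_for(worker_name)}")
  end
end
```

The same worker name always maps to the same supervisor and registry pair, so a lookup by name only needs to consult one registry.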
I'm using local-cluster with dynamic supervision (and the example node-watching GenServer) for testing, which is where I see this the most, but it also happens when two machines connect together.
It doesn't say why the ProcessesSupervisor shut down before this error occurred, so I assume it was a normal shutdown. Any ideas?
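For reference, the exit in the traces can be reproduced without Horde: calling `GenServer.stop/3` on a registered name that no longer has a process behind it exits the caller with a `:noproc` reason. The sketch below shows that behaviour and one way a caller could tolerate it. `SafeStop` is a hypothetical helper written for illustration; it is not Horde's code and not necessarily the right fix.

```elixir
defmodule SafeStop do
  # Hypothetical wrapper: stops a process by name, but turns the
  # "no process" exit into an error tuple instead of crashing the caller.
  def stop(name, reason \\ :normal, timeout \\ :infinity) do
    GenServer.stop(name, reason, timeout)
  catch
    # GenServer.stop/3 exits with {:noproc, {GenServer, :stop, args}}
    # when the name is not registered to a live process.
    :exit, {:noproc, _} -> {:error, :noproc}
  end
end

# A name with no process behind it no longer crashes the caller.
{:error, :noproc} = SafeStop.stop(:name_that_is_not_registered)

# A live process is stopped as usual.
{:ok, _pid} = Agent.start(fn -> 0 end, name: :safe_stop_demo)
:ok = SafeStop.stop(:safe_stop_demo)
```

Swallowing the exit hides the question of why the process was already gone, so this is more useful for confirming the failure mode than as a fix.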