Open matthewsecrist opened 2 years ago
I may be seeing this same issue with my application! Could it be related to a new elixir version? We recently upgraded to 1.13.4 otp 25.
I've been seeing the issue with 1.12 and 1.13 both, so I don't believe its related to the version.
I'd like to post this for anyone who kept seeing the :get_telemetry
warning in their logs and Googled it and came across this issue.
I ran into this when using a module based dynamic supervisor from the example in the docs. I had to add :name
to the options
in the start_link
function.
My start_link
ended up like this:
def start_link(init_arg, options \\ []),
do:
Horde.DynamicSupervisor.start_link(
__MODULE__,
init_arg,
options |> Keyword.put_new(:name, __MODULE__)
)
I've got a service that I'm building up that will accept requests to an endpoint and will push the payload to RabbitMQ, which will then be read by a Broadway pipeline. The pipeline will send the messages to GenServers based on the id in the payload. The GenServers will start one per id, handle processing the state per message and timeout if the GenServer doesn't receive any additional messages within a specific timespan.
The problem is, I'm encountering multiple issues once I incorporate Horde into this around timeouts and crashes, and eventually messages just stop processing entirely. I've read the issues https://github.com/derekkraan/horde/issues/233 and https://github.com/derekkraan/horde/issues/227 and they both seem related, but the issue seems to stem from adding or removing nodes from the cluster.
Some example code here (some of the logic is omitted)
alias MyApp.{HordeRegistry, HordeSupervisor}
def run(event) do name = Map.get(event, :id)
end
def start_link(opts) do {name, opts} = Keyword.pop(opts, :name)
end
@impl GenServer def init(_) do Process.flag(:trap_exit, true) {:ok, 0} end
@impl GenServer def handle_call({:event, event}, _from, state) do {:reply, :ok, state + event.count} end
@impl GenServer def handle_info(_message, state) do
logic omitted
end
@impl GenServer def terminate(_message, _state) do
logic omitted
end
def via_tuple(name), do: {:via, Horde.Registry, {HordeRegistry, name}} end
Exit while fetching metrics from MyApp.HordeSupervisor. Skip poll action. Reason: {:timeout, {GenServer, :call, [MyApp.HordeSupervisor, :get_telemetry, 5000]}}.
GenServer #PID<0.2061.0> terminating (stop) exited in: GenServer.call(MyApp.HordeSupervisor, {:set_members, [{MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}, {MyApp.HordeSupervisor, :"my_app@0.0.0.0"}]}, 30000) (EXIT) time out (elixir 1.12.3) lib/gen_server.ex:1024: GenServer.call/3 (my_app 0.1.0) lib/my_app/node_observer.ex:32: MyApp.NodeObserver.set_members/1 (my_app 0.1.0) lib/my_app/node_observer.ex:23: MyApp.NodeObserver.handle_info/2 (stdlib 3.17.2.1) gen_server.erl:695: :gen_server.try_dispatch/4 (stdlib 3.17.2.1) gen_server.erl:771: :gen_server.handle_msg/6 (stdlib 3.17.2.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3 Last message: {:nodedown, :"my_app@0.0.0.0", [node_type: :visible]}
GenServer MyApp.HordeSupervisor terminating ** (MatchError) no match of right hand side value: {nil, #Reference<0.369051839.3555065857.132324>} (horde 0.8.7) lib/horde/dynamic_supervisor_impl.ex:233: Horde.DynamicSupervisorImpl.handle_cast/2 (stdlib 3.17.2.1) gen_server.erl:695: :gen_server.try_dispatch/4 (stdlib 3.17.2.1) gen_server.erl:771: :gen_server.handle_msg/6 (stdlib 3.17.2.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3 Last message: {:"$gen_cast", {:disown_child_process, 114388217379722495904587093633141899824}}
GenServer #PID<0.576.0> terminating (stop) exited in: GenServer.call(MyApp.HordeSupervisor, :horde_shutting_down, 5000) (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started (elixir 1.12.3) lib/gen_server.ex:1014: GenServer.call/3 (horde 0.8.7) lib/horde/signal_shutdown.ex:21: anonymous fn/1 in Horde.SignalShutdown.terminate/2 (elixir 1.12.3) lib/enum.ex:930: Enum."-each/2-lists^foreach/1-0-"/2 (stdlib 3.17.2.1) gen_server.erl:733: :gen_server.try_terminate/3 (stdlib 3.17.2.1) gen_server.erl:918: :gen_server.terminate/10 (stdlib 3.17.2.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3 Last message: {:EXIT, #PID<0.572.0>, :shutdown}