cronokirby / alchemy

A discord library for Elixir
MIT License

Bot becomes unresponsive if cache ever goes offline #113

Open Cohedrin opened 3 years ago

Cohedrin commented 3 years ago

Issue

If an Alchemy.Cache.Guilds guild child process is ever killed for any reason (e.g. timeouts), the bot becomes unresponsive and no commands or messages can be processed for that server.

Analysis:

Reproduction steps

To check the state of a guild's cache process:

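# guild_id is the ID of the guild to check; bind it before running this.
has_been_restarted =
  Alchemy.Cache.Guilds.GuildSupervisor
  |> Supervisor.which_children()
  # Each child is {id, pid, type, modules}; skip children that are not running.
  |> Enum.filter(fn {_id, pid, _type, _modules} -> is_pid(pid) end)
  |> Enum.any?(fn {_id, pid, _type, _modules} ->
    state = :sys.get_state(pid)
    state["unavailable"] == true && state["id"] == guild_id
  end)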

If has_been_restarted is true, things are broken. :sys.get_state returns more useful information (the full state of the process), but this check is all that's needed to confirm the problem.

Notes

I was attempting to submit a PR to fix this issue, but had trouble determining the proper way to fix it.

It seems like we just need to refresh the "seed" state of the cache when this happens, but it wasn't clear to me where that should happen (or where it currently happens). I was also unsure whether there is a hidden reason we cannot do this when the guild process dies.
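To make the idea concrete, here is a minimal, self-contained sketch of re-seeding on restart. It is not Alchemy's code: GuildCacheSketch and fetch_fun are hypothetical names, and where real guild data would come from inside Alchemy is exactly the open question above. The sketch assumes the process can fetch its own data again each time it starts.

```elixir
defmodule GuildCacheSketch do
  # Hypothetical stand-in for a per-guild cache process; not Alchemy's module.
  use GenServer

  # fetch_fun is a hypothetical zero-arity function returning fresh guild data.
  def start_link({guild_id, fetch_fun}) do
    GenServer.start_link(__MODULE__, {guild_id, fetch_fun})
  end

  @impl true
  def init({guild_id, fetch_fun}) do
    # Start from an "unavailable" placeholder so init/1 returns quickly,
    # then re-seed. This runs on first start and on every supervisor restart.
    placeholder = %{"id" => guild_id, "unavailable" => true}
    {:ok, %{guild: placeholder, fetch_fun: fetch_fun}, {:continue, :reseed}}
  end

  @impl true
  def handle_continue(:reseed, state) do
    # Replace the placeholder with freshly fetched data.
    {:noreply, %{state | guild: state.fetch_fun.()}}
  end
end

# Usage: supervise one sketch process with made-up guild data.
fetch = fn -> %{"id" => "42", "unavailable" => false} end

{:ok, sup} =
  Supervisor.start_link([{GuildCacheSketch, {"42", fetch}}], strategy: :one_for_one)

[{_id, pid, _type, _modules}] = Supervisor.which_children(sup)
IO.inspect(:sys.get_state(pid).guild)
```

Because the re-seed is triggered from init/1, a process restarted by its supervisor repeats the same step and should not stay in the unavailable state, provided the fetch itself succeeds.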

Issues aside, I wanted to say thanks for the awesome library! I was only able to debug this in a couple of hours because of the great work you've put into making it work so well.