PSPDFKit-labs / bypass

Bypass provides a quick way to create a custom plug that can be put in place instead of an actual HTTP server to return prebaked responses to client requests.
https://hex.pm/packages/bypass
MIT License

Crash on shutdown in `dispatch_awaiting_callers` #120

Open digitalcora opened 2 years ago

digitalcora commented 2 years ago

I noticed a flaky test failure with the reason `** (exit) shutdown`, and the following output logged:

** (stop) exited in: GenServer.stop(:normal, :normal, :infinity)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
    (elixir 1.12.1) lib/gen_server.ex:972: GenServer.stop/3
    (bypass 2.1.0) lib/bypass/instance.ex:385: Bypass.Instance.dispatch_awaiting_callers/1
    (bypass 2.1.0) lib/bypass/instance.ex:62: Bypass.Instance.handle_info/2
    (stdlib 3.14) gen_server.erl:689: :gen_server.try_dispatch/4
    (stdlib 3.14) gen_server.erl:765: :gen_server.handle_msg/6
    (stdlib 3.14) proc_lib.erl:226: :proc_lib.init_p_do_apply/3

Looking at the referenced code, it seems `GenServer.stop/3` is not being called correctly: the first argument is supposed to be the server (a pid or registered name), but the Bypass code calls `GenServer.stop(:normal)`, so `:normal` is treated as the server name. If there are any `callers_awaiting_exit`, this code will always crash.
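To illustrate the point above outside of Bypass, here is a minimal sketch. `StopDemo` is a hypothetical module invented for this example, not part of Bypass. It assumes no process is registered under the name `:normal`, which is what the logged `no process` exit suggests.

```elixir
# Hypothetical GenServer used only for this illustration.
defmodule StopDemo do
  use GenServer

  @impl true
  def init(state), do: {:ok, state}
end

{:ok, pid} = GenServer.start(StopDemo, nil)

# With a single argument, :normal is taken as the *server name*;
# reason and timeout fall back to their defaults (:normal, :infinity).
# No process is registered as :normal, so the call exits.
result =
  try do
    GenServer.stop(:normal)
  catch
    :exit, reason -> {:exit, reason}
  end

# The exit reason carries :noproc plus the call that failed.
{:exit, {:noproc, _call}} = result

# Passing the server as the first argument stops it cleanly.
:ok = GenServer.stop(pid, :normal)
false = Process.alive?(pid)
```

The failed call in this sketch is the same `GenServer.stop(:normal, :normal, :infinity)` shown in the log: the intended reason ended up in the server position.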

zraul123 commented 1 year ago

Any updates on this? We're seeing the same behavior causing flaky tests. Lately, for some reason, it started failing fairly often.

firestack commented 10 months ago

We've been having this happen more frequently (almost every CI run, rarely when run locally) since we updated to Elixir 1.15.

I tried fixing this by changing the `GenServer.stop/3` call to `GenServer.stop(__MODULE__, :normal)` in a local checkout, but the test still fails with `** (exit) shutdown`. So while the call needs to be fixed, I'm not sure it's the root cause of the test failure.
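For context on the attempted change: `GenServer.stop(__MODULE__, :normal)` only resolves if a process is registered under the module's name, and this thread does not say whether that holds for `Bypass.Instance`. A general GenServer pattern for a server that wants to stop itself from inside a callback is to return a `:stop` tuple. The sketch below shows that pattern only. `SelfStopDemo` and the `:shutdown_now` message are hypothetical names, and nothing here is confirmed to be the fix for Bypass.

```elixir
# Hypothetical GenServer used only for this illustration.
defmodule SelfStopDemo do
  use GenServer

  @impl true
  def init(state), do: {:ok, state}

  # :shutdown_now is a made-up message; returning a :stop tuple
  # asks the GenServer loop to terminate with the given reason.
  @impl true
  def handle_info(:shutdown_now, state), do: {:stop, :normal, state}
end

{:ok, pid} = GenServer.start(SelfStopDemo, nil)
ref = Process.monitor(pid)
send(pid, :shutdown_now)

# Wait for the monitor notification and capture the exit reason.
exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

:normal = exit_reason
```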

grzuy commented 3 weeks ago

Cross-linking with open pull requests that attempt to fix this issue (I think):