benoitc / hackney

simple HTTP client in Erlang
Other
1.34k stars 427 forks source link

Match error in `hackney_pool:queue_out` #594

Closed binaryseed closed 4 years ago

binaryseed commented 5 years ago

I'm seeing a match error for somewhat sporadically... (Elixir stacktrace):

22:06:40.883 [error] Process #PID<0.24064.0> terminating
** (exit) exited in: :gen_server.call(#PID<0.4955.0>, {:checkin, #Reference<0.3745001004.2022440963.223114>, {'staging-user-service.something.net', 443, :hackney_ssl}, {:sslsocket, {:gen_tcp, #Port<0.389>, :tls_connection, :undefined}, [#PID<0.24304.0>, #PID<0.24274.0>]}, :hackney_ssl}, :infinity)
    ** (EXIT) an exception was raised:
        ** (MatchError) no match of right hand side value: {:empty, {[], []}}
            (hackney) /app/deps/hackney/src/hackney_pool.erl:509: :hackney_pool.queue_out/2
            (hackney) /app/deps/hackney/src/hackney_pool.erl:376: :hackney_pool.dequeue/3
            (hackney) /app/deps/hackney/src/hackney_pool.erl:349: :hackney_pool.handle_info/2
            (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
            (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
            (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
    (stdlib) gen_server.erl:223: :gen_server.call/3
    (hackney) /app/deps/hackney/src/hackney_response.erl:298: :hackney_response.end_stream_body/2
    (hackney) /app/deps/hackney/src/hackney_response.erl:155: :hackney_response.stream_body1/2
    (hackney) /app/deps/hackney/src/hackney_response.erl:309: :hackney_response.read_body/3
    (hackney) /app/deps/hackney/src/hackney.erl:518: anonymous fn/2 in :hackney.body/2
    (httpoison) lib/httpoison/base.ex:809: HTTPoison.Base.request/6
    (http_services) lib/services/user_service_api.ex:84: UserService.API.request/1
    (cache) lib/cache.ex:43: Cache.fetch/4
Initial Call: anonymous fn/0 in Nerd.Resolvers.User.find_current_user/2
Ancestors: [Endpoint.TaskSupervisor, Endpoint.Supervisor, #PID<0.4610.0>]

https://github.com/benoitc/hackney/blob/master/src/hackney_pool.erl#L504-L517

It looks like the call to queue:out on L509 is returning {:empty, _queue} but it's always expecting there to be a value

Hackney 1.15.2 Erlang/OTP 22.1.3

Any ideas what could cause this situation?

binaryseed commented 5 years ago

@benoitc any debugging pointers?

benoitc commented 5 years ago

probably a race condition. i'm investigating

binaryseed commented 4 years ago

FYI, I was able to create a situation where the errors stopped happening by increasing max_connections from the default value to 200. Looking at the metrics emitted by hackney that makes sense because in_use_count now never hits 200 so we never have to queue in the first place

It also makes me think it might be easier to reproduce by lowering max_connections...

alinpopa commented 4 years ago

@benoitc any update on this? I've started to see this issue myself with the default pool in 1.15.1. Thanks.

benoitc commented 4 years ago

@alinpopa how/when does it happen? More generally there is a WIP on fixing connection reusing, i will provide a more detailled status tomorrow, should have been earlier but I've been side-tracked.

alinpopa commented 4 years ago

@benoitc it's a bit difficult to know exactly how it happened, but the only thing I can tell you (from my logs), is that at that time we had a high requests load (again, don't really have numbers to back me up on this), and then it crashed with this:

no case clause matching: {:EXIT, {{{:badmatch, {:empty, {[], []}}},
[{:hackney_pool, :queue_out, 2, [file: '/app/deps/hackney/src/hackney_pool.erl', line: 509]},
{:hackney_pool, :deliver_socket, 3, [file: '/app/deps/hackney/src/hackney_pool.erl', line: 524]},
{:hackney_pool, :handle_call, 3, [file: '/app/deps/hackney/src/hackney_pool.erl', line: 294]}, {:gen_server,
:try_handle_call, 4, [file: 'gen_server.erl', line: 661]}, {:gen_server, :handle_msg, 6, [file: 'gen_server.erl', line: 
690]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}, {:gen_server, :call, [#PID<0.3296.0>,
{:checkout, {'.....', 443, :hackney_ssl}, #PID<0.30153.1917>,
#Reference<0.692853367.1751121921.124719>}, 8000]}}}

Hope this helps. Thanks.

benoitc commented 4 years ago

fixed in cd96ce79cee73ae8294b8b42249f6923f866a105