ninenines / gun

HTTP/1.1, HTTP/2, Websocket client (and more) for Erlang/OTP.
ISC License
891 stars 232 forks source link

Unexpected event in state :connected of type :info: #193

Closed gaynetdinov closed 4 years ago

gaynetdinov commented 5 years ago

Hello,

sometimes I see the following error from my app in the logs

{#Reference<0.884867089.3395289089.120418>, :badarg} {:state, #PID<0.32267.3>,

#Reference<0.884867089.3348627458.55187>, 'myhost.com', 80, 'myhost.com', 80, [], %{connect_timeout: 600000, http2_opts: %{keepalive: :infinity}, http_opts: %{keepalive: :infinity}, retry: 10, retry_timeout: 100}, :undefined, #Port<0.1623615>, :gun_tcp, {:tcp, :tcp_closed, :tcp_error}, :gun_http, {:http_state, #PID<0.32267.3>, #Port<0.1623615>, :gun_tcp, :"HTTP/1.1", [:gun_data_h], :keepalive, "", [], :head, {0, 0}, :head, #Function<0.25310258/1 in :gun_http.init/4>}}

I see where it happens in the codebase, but I have hard times to track it down in order to figure out why it happens and how to prevent it. Could you please give me some hints?

Thank you!

essen commented 5 years ago

This is what I wrote when I investigated it for a customer last month:

About the {#Ref<0.1979334955.3890479110.82550>,badarg} message, I investigated as best as I could and my recommendation is to upgrade to a more recent OTP version as I can reproduce it with 20.1 but I can't reproduce it with 21.2.

So please try upgrading and if it still happens in 21.2 then I have no clue at this time.

gaynetdinov commented 5 years ago

It still happens on 21.3.

essen commented 5 years ago

I heard the same elsewhere earlier today. Something that could help would be a small snippet that reliably reproduces the problem for you.

zinid commented 5 years ago

Just for the record, I receive them a lot in my software too, especially during high load. I typically add a workaround like this one. This is clearly an Erlang bug and seems like version upgrades don't help. Ah, and I don't know how to reproduce this reliably: the message only appears during high load.

essen commented 5 years ago

Ports can send {Ref, badarg} as a response when something is wrong in various cases. Just gotta figure out which one I guess.

essen commented 5 years ago

Could either of you add a log in Gun to log when {Closed, Socket} or {Error, Socket, Reason} is received and log at least the owner_ref from the state, plus the error, I have a suspicion the issue could be that we are trying to close the port while it is closing and this may create this weird message. Just run it until the next {Ref, badarg} and see if you get a match.

essen commented 5 years ago

From 21.3.8:

  OTP-15295    Application(s): stdlib

               A bug in gen_statem has been fixed where the internal
               timeout message could arrive as an info to the callback
               module during high load due to incorrect use of
               asynchronous timer cancel.

I wonder if this could help.

pankajsoni19 commented 4 years ago

I am getting the same logs on my system. I am using shotgun master repo which is at gun 1.3.0 erlang version is > 22.0.7

2019-08-06 09:03:49 =ERROR REPORT====
Unexpected message: {#Ref<0.4027614940.3190030337.21497>,badarg}
essen commented 4 years ago

A workaround is to set the transport option {exit_on_close, false}. I am not sure whether it has any ill effects though. I will open an OTP ticket with my findings because the error seems to be either in the port_control BIF or the inet driver.

essen commented 4 years ago

The OTP ticket is: https://bugs.erlang.org/browse/ERL-1049

essen commented 4 years ago

There's already a PR at https://github.com/erlang/otp/pull/2397

Please try and provide feedback!

zinid commented 4 years ago

Thanks a lot for paying attention so closely to this problem! Sadly, I don't have enough time to check the PR because I'm at another project ATM.

essen commented 4 years ago

Since this is covered both in the OTP bug tracker and in an OTP PR, I'll be closing this. Thanks!

pankajsoni19 commented 4 years ago

Update: After updating to OTP 22.1.4 and using latest gun lib this issue does not seem to occur.

essen commented 4 years ago

Great!