Closed gaynetdinov closed 4 years ago
This is what I wrote when I investigated it for a customer last month:
About the {#Ref<0.1979334955.3890479110.82550>,badarg} message, I investigated as best as I could and my recommendation is to upgrade to a more recent OTP version as I can reproduce it with 20.1 but I can't reproduce it with 21.2.
So please try upgrading and if it still happens in 21.2 then I have no clue at this time.
It still happens on 21.3.
I heard the same elsewhere earlier today. Something that could help would be a small snippet that reliably reproduces the problem for you.
Just for the record, I receive them a lot in my software too, especially during high load. I typically add a workaround like this one. This is clearly an Erlang bug and seems like version upgrades don't help. Ah, and I don't know how to reproduce this reliably: the message only appears during high load.
Ports can send {Ref, badarg}
as a response when something is wrong in various cases. Just gotta figure out which one I guess.
Could either of you add a log in Gun to log when {Closed, Socket}
or {Error, Socket, Reason}
is received and log at least the owner_ref
from the state, plus the error, I have a suspicion the issue could be that we are trying to close the port while it is closing and this may create this weird message. Just run it until the next {Ref, badarg}
and see if you get a match.
From 21.3.8:
OTP-15295 Application(s): stdlib
A bug in gen_statem has been fixed where the internal
timeout message could arrive as an info to the callback
module during high load due to incorrect use of
asynchronous timer cancel.
I wonder if this could help.
I am getting the same logs on my system. I am using shotgun master repo which is at gun 1.3.0
erlang version is > 22.0.7
2019-08-06 09:03:49 =ERROR REPORT====
Unexpected message: {#Ref<0.4027614940.3190030337.21497>,badarg}
A workaround is to set the transport option {exit_on_close, false}
. I am not sure whether it has any ill effects though. I will open an OTP ticket with my findings because the error seems to be either in the port_control
BIF or the inet
driver.
The OTP ticket is: https://bugs.erlang.org/browse/ERL-1049
There's already a PR at https://github.com/erlang/otp/pull/2397
Please try and provide feedback!
Thanks a lot for paying attention so closely to this problem! Sadly, I don't have enough time to check the PR because I'm at another project ATM.
Since this is covered both in the OTP bug tracker and in an OTP PR, I'll be closing this. Thanks!
Update: After updating to OTP 22.1.4 and using latest gun lib this issue does not seem to occur.
Great!
Hello,
sometimes I see the following error from my app in the logs
I see where it happens in the codebase, but I have hard times to track it down in order to figure out why it happens and how to prevent it. Could you please give me some hints?
Thank you!