ninenines / gun

HTTP/1.1, HTTP/2, Websocket client (and more) for Erlang/OTP.
ISC License
891 stars 232 forks source link

Allow terminating normally when the owner process is gone #187

Closed essen closed 5 years ago

essen commented 5 years ago

Currently this produces an error report due to the Gun process' error reason. Sometimes this is not desirable. An option should be added to disable that behavior.

edescourtis commented 5 years ago

I see this issue in production sometimes (FYI) eventually results in errors like:

2019-02-03 23:56:13.523 [error] pid=<0.3036.0>  Child :gun of Supervisor :gun_sup shutdown abnormally
** (exit) shutdown
Pid: #PID<0.26491.1812>
Start Call: :gun.start_link()

=SUPERVISOR REPORT==== 3-Feb-2019::23:56:13 ===
     Supervisor: {local,gun_sup}
     Context:    shutdown_error
     Reason:     shutdown
     Offender:   [{pid,<0.31857.1812>},
                  {id,gun},
                  {mfargs,{gun,start_link,[]}},
                  {restart_type,temporary},
                  {shutdown,5000},
                  {child_type,worker}]
zinid commented 5 years ago

I also see the following meaningless supervisor reports:

=SUPERVISOR REPORT==== 25-Apr-2019::17:36:45.011836 ===
    supervisor: {local,gun_sup}
    errorContext: shutdown_error 
    reason: {shutdown,closed}
    offender: [{pid,<0.9731.0>},
               {id,gun},
               {mfargs,{gun,start_link,[]}},
               {restart_type,temporary},
               {shutdown,5000},
               {child_type,worker}]

and

=SUPERVISOR REPORT==== 25-Apr-2019::17:48:27.654224 ===
    supervisor: {local,gun_sup}
    errorContext: shutdown_error 
    reason: noproc
    offender: [{pid,<0.784.0>},
               {id,gun},
               {mfargs,{gun,start_link,[]}},
               {restart_type,temporary},
               {shutdown,5000},
               {child_type,worker}]

Those are just polluting the logs and confuse sysadmins.

essen commented 5 years ago

None of all those logs quoted in the ticket are relevant to the issue. The issue is about errors that will contain the owner_gone atom in the reason.

I assume that in all the logs posted here you were calling supervisor:terminate_child, otherwise it does not make much sense to me (this is what gun:close/1 does).

The {shutdown,closed} log seems to be caused by a missing clause in the supervisor module. It special cases shutdown but somehow not {shutdown, _} like other places in OTP. I'll send a PR for that.

The other log seems related, the "problem" is that Gun stops after the supervisor begins to stop the Gun process but before it actually sends the shutdown exit signal, and it considers all exit reasons in that case as error. I'm not sure much can be done in the general case. I'm also not sure why the Gun process exited to begin with? Need more info there.

zinid commented 5 years ago

None of all those logs quoted in the ticket are relevant to the issue.

Well, maybe, I just didn't delve into them and wasn't confident to open a new issue. Just saying.

essen commented 5 years ago

Please do delve into the last one and provide more info into a separate ticket because I can't really know where the issue comes from at the moment. If the Gun process stopped while you were trying to stop it, there's probably a log indicating why.

essen commented 5 years ago

Based on this comment I think both are expected: https://github.com/erlang/otp/blob/master/lib/stdlib/src/supervisor.erl#L844

You might want to just silence the shutdown_error supervisor reports.

essen commented 5 years ago

I got the original issue fixed. As for a better gun:close, I think gun:shutdown is it, it just needs to be implemented. #60