Open kostko opened 7 years ago
actually, remote server should not ever break, this must be some bug somewhere. Do you mean Amazon servers or internal IPC communications?
I am getting these errors consistently when uploading large files on a new system (Ubuntu 16.10). I did not encounter this before on Ubuntu 14.04, so I am not sure where the problem is (could be a change in some dependent module?).
I am also not sure whether this SIGPIPE comes from parent-child IPC or from the remote HTTP socket. I'll do some more debugging later, but with this change the file upload works (previously it failed consistently after uploading at most 10 parts).
Perhaps also the parent should retry when a child exits due to SIGPIPE?
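To make the retry idea concrete, here is a minimal sketch in Python, used purely for illustration since the thread does not say how the parent launches its workers. The function name `run_with_sigpipe_retry` and the `max_attempts` parameter are hypothetical. It relies on the fact that Python's `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_with_sigpipe_retry(cmd, max_attempts=3):
    """Run cmd, re-running it only when the child was killed by SIGPIPE.

    Hypothetical helper: returns (return_code, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        # A child terminated by signal N is reported as return code -N.
        if result.returncode != -signal.SIGPIPE:
            return result.returncode, attempt
    # Every attempt died from SIGPIPE; give up and report the last code.
    return result.returncode, max_attempts
```

Any other failure is returned immediately, so only the SIGPIPE case triggers a retry. Whether a retried worker can safely resume a partially uploaded part is a separate question that this sketch does not address.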
> actually, remote server should not ever break
Why not? A TCP connection may reset at any time, although this may indicate some network/overload issues somewhere between me and AWS.
Without this PR, the child worker process will simply exit on SIGPIPE (e.g. when the TCP connection with the remote server breaks) and the parent will report something like this:
And then just terminate.
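For readers unfamiliar with the mechanism, the following is a rough Python sketch of the general technique of ignoring SIGPIPE so that a broken connection surfaces as an ordinary write error instead of killing the worker. It is not the code from this PR, and `write_ignoring_sigpipe` is a hypothetical name. Note that CPython already ignores SIGPIPE at startup; the explicit call is there to make the step visible, since in runtimes that keep the default disposition the process is terminated.

```python
import os
import signal


def write_ignoring_sigpipe(fd, data):
    """Write data to fd; return False instead of dying if the peer is gone.

    Hypothetical helper for illustration only.
    """
    # With SIGPIPE ignored, writing to a pipe or socket whose other end is
    # closed fails with EPIPE, which Python raises as BrokenPipeError.
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    try:
        os.write(fd, data)
        return True
    except BrokenPipeError:
        # The caller can now reconnect or retry the part instead of exiting.
        return False
```

With the error reported as a return value (or exception), the worker stays alive and can decide how to recover, which matches the behavior described above where uploads continue past the point at which they used to fail.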