vsespb / mt-aws-glacier

Perl Multithreaded Multipart sync to Amazon Glacier
http://mt-aws.com/
GNU General Public License v3.0

Ignore SIGPIPE in child worker #127

Open kostko opened 7 years ago

kostko commented 7 years ago

Without this PR, the child worker process will simply exit on SIGPIPE (e.g. when the TCP connection with the remote server breaks) and the parent will report something like this:

EXIT on SIGCHLD (signal 13, exit_code 0)

And then just terminate.
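To illustrate where a report like "signal 13, exit_code 0" comes from, here is a minimal Python sketch of the general POSIX mechanism. It is not the project's Perl code, and the function names `child_killed_by_sigpipe` and `decode_status` are hypothetical. A child with the default SIGPIPE disposition writes to a pipe that has no reader, and the parent decodes the wait status.

```python
import os
import signal


def child_killed_by_sigpipe() -> int:
    """Fork a child that writes to a reader-less pipe; return its raw wait status."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader exists anywhere, so any write raises SIGPIPE
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            # Python ignores SIGPIPE at startup; restore the default action
            # to mimic a worker that never touched the handler.
            signal.signal(signal.SIGPIPE, signal.SIG_DFL)
            os.write(write_fd, b"part data")  # the kernel kills the child here
            code = 0
        finally:
            os._exit(code)  # never fall through into the parent's code
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    return status


def decode_status(status: int) -> tuple:
    """Return (signal_number, exit_code) for a wait status."""
    if os.WIFSIGNALED(status):
        return (os.WTERMSIG(status), 0)
    return (0, os.WEXITSTATUS(status))


if __name__ == "__main__":
    print(decode_status(child_killed_by_sigpipe()))
```

On Linux, SIGPIPE is signal number 13, so the parent sees a child that was terminated by signal 13 and never produced an exit code of its own.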

vsespb commented 7 years ago

Actually, remote server should not ever break, so this must be a bug somewhere. Do you mean the Amazon servers or internal IPC communications?

kostko commented 7 years ago

I am getting these errors consistently when uploading large files on a new system (Ubuntu 16.10). I did not encounter this before on Ubuntu 14.04, so I am not sure where the problem is (could be a change in some dependent module?).

I'm also not sure whether this SIGPIPE comes from parent-child IPC or from the remote HTTP socket. I'll do some more debugging later, but file uploads seem to work with this change, whereas previously they failed consistently after uploading at most 10 parts.

kostko commented 7 years ago

Perhaps also the parent should retry when a child exits due to SIGPIPE?
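A rough sketch of what such a retry could look like, written in Python purely for illustration. The helper `run_with_retry`, the `task(attempt)` callback and the `max_attempts` parameter are all hypothetical and do not correspond to anything in mt-aws-glacier. The parent reruns the work in a fresh child only when the previous child was terminated by SIGPIPE, and treats any other failure as fatal.

```python
import os
import signal


def run_with_retry(task, max_attempts=3):
    """Run task(attempt) in a forked child, retrying if SIGPIPE kills the child.

    Returns the number of the attempt that succeeded; raises RuntimeError
    on any other failure or when all attempts are used up.
    """
    for attempt in range(1, max_attempts + 1):
        pid = os.fork()
        if pid == 0:
            code = 1
            try:
                # Default disposition, so a broken pipe terminates the child.
                signal.signal(signal.SIGPIPE, signal.SIG_DFL)
                task(attempt)
                code = 0
            finally:
                os._exit(code)  # the child must never return to the caller
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            return attempt
        if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGPIPE:
            continue  # assumed transient: try again with a new child
        raise RuntimeError(f"worker failed with wait status {status}")
    raise RuntimeError(f"worker killed by SIGPIPE in all {max_attempts} attempts")
```

Whether a retry is safe depends on the work being idempotent; for a multipart upload that would presumably mean re-sending only the part that was in flight.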

kostko commented 7 years ago

> actually, remote server should not ever break

Why not? A TCP connection may be reset at any time, although this may indicate some network or overload issue somewhere between me and AWS.
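For illustration, once SIGPIPE is ignored a write to a dead connection surfaces as an ordinary error that the caller can handle. The Python sketch below assumes a hypothetical helper `send_part` and uses a local `socket.socketpair()` as a stand-in for the real HTTP connection; it is not the project's code.

```python
import socket


def send_part(sock, data):
    """Send data; return True on success, False if the peer has gone away."""
    try:
        sock.sendall(data)
        return True
    except (BrokenPipeError, ConnectionResetError):
        # With SIGPIPE ignored (CPython does this at startup), the failure
        # arrives as EPIPE or ECONNRESET instead of killing the process.
        return False


def demo():
    """Close one end of a socket pair, then try to send on the other end."""
    local, remote = socket.socketpair()
    remote.close()  # stand-in for the remote side dropping the connection
    ok = send_part(local, b"part data")
    local.close()
    return ok


if __name__ == "__main__":
    print(demo())
```

A caller that gets `False` back could reconnect and resend instead of losing the whole worker.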