vsespb / mt-aws-glacier

Perl Multithreaded Multipart sync to Amazon Glacier
http://mt-aws.com/
GNU General Public License v3.0

400 Bad Request #125

Open mbevilacqua opened 7 years ago

mbevilacqua commented 7 years ago

Hi, the program seems to 'randomly' produce a bad request on sync. I say 'randomly' because this has happened several times, but at different points in time while syncing a folder. Sometimes I got the error within the first 5 minutes, and sometimes it ran for hours before failing. Running on OS X, with perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level.

mtglacier sync --new --replace-modified --delete-removed --exclude '.DS_Store' --config glacier.cfg --dir Glacier-Vault1 --journal journal-Vault1.csv --vault Vault1 --concurrency 5

===REQUEST:
PUT http://glacier.eu-west-1.amazonaws.com/-/vaults/Vault1/multipart-uploads/05K96wor6WaP5nFGwzinXwdV_tdM2DsMsLUp5yevSl8lLpnxvBLKHMJTdlYODBhOQccIwE2idt7D0r7-Sdrk6lSEHJp5
Authorization: AWS4-HMAC-SHA256 Credential=REMOVED/20161215/eu-west-1/glacier/aws4_request, SignedHeaders=content-length;content-range;content-type;host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash, Signature=REMOVED
Host: glacier.eu-west-1.amazonaws.com
User-Agent: mt-aws-glacier/1.120 (http://mt-aws.com/) libwww-perl/6.05
Content-Length: 16777216
Content-Range: bytes 2248146944-2264924159/*
Content-Type: application/octet-stream
X-Amz-Content-Sha256: 0e8bfc8eb7fe943bdd8aeba43186d50714de2e158c6c4cef01268f5a02466fa1
X-Amz-Date: 20161215T025603Z
X-Amz-Glacier-Version: 2012-06-01
X-Amz-Sha256-Tree-Hash: b519658af0684315101b1b65364f0525d8e1f6f0b548c7dd456340a1239bf55b

===RESPONSE:
HTTP/1.1 400 Bad Request
Connection: close
Date: Thu, 15 Dec 16 02:57:18 GMT
Content-Length: 0
Client-Date: Thu, 15 Dec 2016 02:57:20 GMT
Client-Peer: 54.239.33.110:80
Client-Response-Num: 1
X-Amz-Id-2: njsm5ceRvGIwZI0mMu0e27OStXgFu5Hlyz8mJwcb0uhhTjGh4BBq4DgW3zxfMfyK0NQfgS/kPhSl4dCzsaowIn3fHNawshbq
X-Amz-Request-Id: 34894E02AB28C14F

ERROR (child 14910): Unexpected reply from remote server

EXIT on SIGCHLD (exit_code 1)
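
When reading a dump like the one above, one quick sanity check is whether the Content-Range of the failed part agrees with its Content-Length and sits on a part-size boundary. The sketch below does that arithmetic for the values in the log. The function name `check_part_range` is hypothetical, and the 16 MiB part size is an assumption inferred from the Content-Length header, not something the log states directly.

```python
import re

def check_part_range(content_range: str, content_length: int, part_size: int) -> dict:
    """Compare a 'bytes START-END/*' Content-Range with the declared length.

    Hypothetical helper for reading request dumps; not part of mt-aws-glacier.
    """
    m = re.fullmatch(r"bytes (\d+)-(\d+)/\*", content_range)
    if m is None:
        raise ValueError(f"unrecognized Content-Range: {content_range!r}")
    start, end = int(m.group(1)), int(m.group(2))
    span = end - start + 1  # byte ranges are inclusive on both ends
    return {
        "span": span,
        "length_matches": span == content_length,
        "aligned": start % part_size == 0,
        "part_index": start // part_size,  # zero-based index of this part
    }

# Values copied from the request dump; part size assumed to be 16 MiB.
info = check_part_range("bytes 2248146944-2264924159/*", 16777216, 16 * 1024 * 1024)
print(info)
```

For the request in this report the range spans exactly 16777216 bytes and starts on a 16 MiB boundary (zero-based part 134), so at least these two headers are mutually consistent.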

vsespb commented 7 years ago

Hi. Yes, I experience the same problem. I believe it's a server-side problem and I reported it to Amazon, but they seem to be ignoring it: https://forums.aws.amazon.com/thread.jspa?threadID=239983&tstart=0

Could you please post it to the same thread on their forum? Maybe this will help.

Otherwise I need to add a special option to ignore HTTP 400 (it has to be a special option, because ignoring a 400 error violates the HTTP standard).

mbevilacqua commented 7 years ago

I would be very surprised if this was a server side problem, Glacier has been running for quite some time now. Given the random nature of it I suspect corruption at some level during transit maybe? Would suggest we retry the exact same request maybe up to 3 times before aborting and see if the problem persists?
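
A minimal sketch of the retry idea, in Python rather than the tool's Perl, and not taken from mt-aws-glacier itself. `upload_with_retry` and `send_part` are hypothetical names; `send_part` stands for whatever callable re-sends the identical request and returns `(status, body)`. Retrying only a 400 that arrives with an empty body is an assumption of this sketch, chosen so that a 400 carrying an actual error message is still treated as fatal.

```python
import time
from typing import Callable, Tuple

def upload_with_retry(
    send_part: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes, int]:
    """Re-send the same request while it fails with a body-less HTTP 400.

    Returns (status, body, attempts_used). Any other response, including
    a 400 that carries a message body, is returned immediately.
    """
    attempt = 0
    while True:
        attempt += 1
        status, body = send_part()
        unexplained_400 = status == 400 and not body
        if not unexplained_400 or attempt >= max_attempts:
            return status, body, attempt
        # Exponential backoff between attempts: 1s, 2s, 4s, ...
        sleep(base_delay * 2 ** (attempt - 1))

# Simulated sender: two empty 400 replies, then a success status.
responses = iter([(400, b""), (400, b""), (204, b"")])
result = upload_with_retry(lambda: next(responses), sleep=lambda s: None)
print(result)  # (204, b'', 3)
```

If the third attempt also fails, the last response is returned unchanged, so the caller can still abort with the original error.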

mbevilacqua commented 7 years ago

Checking the Amazon forum for other HTTP 400 errors, I notice that pretty much all of them have a very helpful message associated with the error code, which our error seems to lack. Are we dumping everything from the HTTP response we get the 400 error on?

vsespb commented 7 years ago

> I would be very surprised if this was a server side problem, Glacier has been running for quite some time now.

I disagree. Amazon and Glacier have had bugs in the past, including stupid bugs and network bugs, no matter how many users they have.

Besides, this error was first seen several months ago; for several years before that there was no error, and I did not modify the code.

> Given the random nature of it I suspect corruption at some level during transit maybe?

Data transfer is made over SSL and there is no SSL error, so the "corruption" must be inside the SSL packet, which is improbable.
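
Independently of the transport, the request in the log carries two checksums of the part body: X-Amz-Content-Sha256 and X-Amz-Sha256-Tree-Hash. As an illustration of what those values cover, here is a sketch of recomputing both on the client side. It follows the tree-hash scheme as described in the Glacier documentation (SHA-256 over 1 MiB chunks, with digests combined pairwise until one remains); the function names are hypothetical, and this is not the code mt-aws-glacier uses.

```python
import hashlib

MIB = 1024 * 1024

def tree_hash(data: bytes) -> str:
    """SHA-256 tree hash: hash each 1 MiB chunk, then combine digests pairwise."""
    level = [hashlib.sha256(data[i:i + MIB]).digest() for i in range(0, len(data), MIB)]
    if not level:
        level = [hashlib.sha256(b"").digest()]  # empty payload: hash of nothing
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # odd digest is carried up unchanged
        level = nxt
    return level[0].hex()

def checksums_match(payload: bytes, linear_hex: str, tree_hex: str) -> bool:
    """True if the payload matches both header values (hypothetical helper)."""
    return hashlib.sha256(payload).hexdigest() == linear_hex and tree_hash(payload) == tree_hex

# Example with a synthetic 3 MiB payload (not data from this report).
payload = bytes(3 * MIB)
print(checksums_match(payload, hashlib.sha256(payload).hexdigest(), tree_hash(payload)))
```

Re-reading the 16 MiB part from disk and comparing it against the two header values from the failed request would show whether the bytes that were hashed still match the file, which is one way to narrow down where a corruption could have happened.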

> Are we dumping everything from the HTTP response we get the 400 error on?

Yes