uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Signature errors #111

Open wvmarle opened 11 years ago

wvmarle commented 11 years ago

This issue stems from #91; I think it's better to start a new thread.

In short: when attempting to do parallel uploads, the threads die one by one with a SignatureError. I'm trying to figure out where this problem originates, but I haven't got far yet. It's getting late, so I'm calling it a night; here is the status so far. Maybe someone can pick it up from here.

I have filtered bits and pieces of a test upload with eight sessions (which reliably triggers the issue); here are what I think are the important snippets from my log.

Note: this uses a slightly changed version of boto that does not log the payload (the 1 MB data chunks) to the log file, and in glacier-cmd I have enabled boto's debug logging. That's normally switched off because of the payload logging (for details see the discussion at https://github.com/uskudnik/amazon-glacier-cmd-interface/issues/90#issuecomment-9775460).

SHA256 hash of a part of a test archive: d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1

At 23:43:12 the request is created correctly:

Nov 12 23:43:12 DEBUG    glacier-cmd CanonicalRequest:
PUT
/-/vaults/Test/multipart-uploads/a3I3N9Et0D0-nX_ZQJjeKqUCT_8ovXDytQkET3osBO86RCBb9VJpTIALi5pCNta_dZa2boO9jdfOYmg1MoD1xRpjVlB2

host:glacier.us-east-1.amazonaws.com
x-amz-content-sha256:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1
x-amz-date:20121112T154312Z
x-amz-glacier-version:2012-06-01
x-amz-sha256-tree-hash:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1

host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash
d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1

The SHA256 hash and the tree hash are identical because it's a 1 MB chunk. For larger chunks they will differ, and the hash at the end of the canonical request should be the complete payload hash (not the tree hash).
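
For reference, the tree hash is built from the SHA256 hashes of the 1 MB blocks, combined pairwise; a minimal sketch of the idea (illustrative only, not the project's actual code):

import hashlib

MB = 1024 * 1024

def tree_hash(data):
    # Hash each 1 MB block, then combine adjacent digests pairwise until one remains.
    hashes = [hashlib.sha256(data[i:i + MB]).digest() for i in range(0, len(data), MB)]
    if not hashes:
        hashes = [hashlib.sha256(b'').digest()]
    while len(hashes) > 1:
        next_level = []
        for i in range(0, len(hashes), 2):
            if i + 1 < len(hashes):
                next_level.append(hashlib.sha256(hashes[i] + hashes[i + 1]).digest())
            else:
                next_level.append(hashes[i])  # an odd digest is carried up unchanged
        hashes = next_level
    return hashes[0]

part = b'\x00' * MB
# For a single 1 MB part the tree hash equals the plain SHA256 hash of the part.
assert tree_hash(part) == hashlib.sha256(part).digest()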

At 23:43:22 an unspecified error is encountered; boto correctly retries this one and creates a new request. This request, however, uses a wrong hash for the payload (or maybe the whole payload is wrong this time?).

Nov 12 23:43:22 DEBUG    glacier-cmd encountered error exception, reconnecting
Nov 12 23:43:22 DEBUG    glacier-cmd establishing HTTPS connection: host=glacier.us-east-1.amazonaws.com, kwargs={}
Nov 12 23:43:23 DEBUG    glacier-cmd Token: None
Nov 12 23:43:23 DEBUG    glacier-cmd CanonicalRequest:
PUT
/-/vaults/Test/multipart-uploads/a3I3N9Et0D0-nX_ZQJjeKqUCT_8ovXDytQkET3osBO86RCBb9VJpTIALi5pCNta_dZa2boO9jdfOYmg1MoD1xRpjVlB2

host:glacier.us-east-1.amazonaws.com
x-amz-content-sha256:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1
x-amz-date:20121112T154323Z
x-amz-glacier-version:2012-06-01
x-amz-sha256-tree-hash:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1

host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash
3a40d65ddc1dd592b3144a9a529723d38aefb87171a25eb8e4ff0acc34de39a5

Not surprisingly, this results in an error a little later:

Nov 12 23:44:02 ERROR    glacier-cmd ERROR: Expected 204, got (403, code=InvalidSignatureException, message=The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'PUT
/-/vaults/Test/multipart-uploads/a3I3N9Et0D0-nX_ZQJjeKqUCT_8ovXDytQkET3osBO86RCBb9VJpTIALi5pCNta_dZa2boO9jdfOYmg1MoD1xRpjVlB2

host:glacier.us-east-1.amazonaws.com
x-amz-content-sha256:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1
x-amz-date:20121112T154323Z
x-amz-glacier-version:2012-06-01
x-amz-sha256-tree-hash:d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1

host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash
d9eee35c8b9a29e1590802e04948db300a2c601ce46f7a07797f1a5d6a0166b1'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20121112T154323Z
20121112/us-east-1/glacier/aws4_request
081bc1d5aa3f8587ecd7f986548defadeafe58fc96f9f3d0c30a13ee1327758c

The key issue is of course: why is the payload hash different from the tree hash and the SHA256 hash when regenerating the request? Note: it is not simply the hash of another block of the file, I checked that. It's a totally new one.

I expect this payload hash is recalculated by boto; it has to be calculated for all other requests (e.g. create vault) as well, so most likely boto simply recalculates it on retry.

If so, the next question: why has the payload changed? This is odd, of course.

This is either a bug in boto (losing part or all of the payload when retrying? the threads should be fully separate, though) or an issue related to mmap where the mapping becomes inconsistent for some reason (very unlikely).

Yet Amazon's reply suggests that the data itself is actually sent correctly. They calculate the hashes themselves too, and the hash values returned by Amazon match the local values. I assume at least that these are hashes Amazon calculated from the received data.

wvmarle commented 11 years ago

Doing some more work tonight, I finally found out it's directly related to mmap.

When creating the upload signature the first time, everything is calculated correctly. For whatever reason there is often some http error; boto catches these errors and automatically retries them (function _mexe in connection.py). However, upon retry the newly calculated hash value (I logged this as well, calculated using hashlib.sha256(payload).hexdigest()) is incorrect - somehow the payload data is changing somewhere along the way? It shouldn't be, as the payload is simply an mmap.mmap() object, and it is passed on as such.

However if I calculate the hash in _mexe the moment the retry is done (around line 861) I actually get the correct hash. Then the new connection is created, and when this connection is authenticated, boto.auth.add_auth() (around line 435 onwards) returns a CanonicalRequest with a different, incorrect hash value for the payload. As if the mmap suddenly starts to point to a different part of the file?!

Having no idea what could possibly be the cause, I decided to remove mmap and read the part into a string, in memory. Using tiny 1 MB chunks that's no problem of course, even for 8 sessions. Result: it worked, flawlessly. I found two reconnection messages in the log, and there the new payload hash was correct. So that proves to me that the issue lies with mmap, somehow. But how?
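
For illustration, roughly the kind of change described above (the archive name and offset are placeholders): instead of handing boto an mmap of the part, the part is read into a plain string up front.

import mmap
import os

part_size = 1024 * 1024  # the 1 MB parts used in this test
offset = 0               # placeholder: start of the part within the archive

# Create a dummy 2 MB archive so the snippet is self-contained.
with open('archive.bin', 'wb') as f:
    f.write(os.urandom(2 * part_size))

with open('archive.bin', 'rb') as f:
    # mmap-based part: a file-like object whose internal read position can move.
    mmapped_part = mmap.mmap(f.fileno(), part_size, offset=offset,
                             access=mmap.ACCESS_READ)

    # string-based part: the chunk is copied into memory once, so nothing about
    # it can change between the first upload attempt and a retry.
    f.seek(offset)
    string_part = f.read(part_size)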

offlinehacker commented 11 years ago

Nice findings, I don't know why either :)

Mmap has to be considered experimental for people not having enough memory. For example, if you remove your drive or a read error occurs, your app crashes with a segmentation fault ;)

So you have to support both. I will have a new implementation based on your code finished this week.

wvmarle commented 11 years ago

Cool.

I have discovered another thing, not mentioned yet: when calculating the hash, boto has a separate implementation for file-like objects, including our mmapped file. An mmap object can be used both as a string and as a file. So when you give the part as a string object it uses the standard hashlib.sha256() implementation; if it's given as an mmap object it uses their own implementation.

The comments mention that this is to prevent loading the complete file into memory. Which is neat of course, but this is also what I suspect is messing things up on retry; at least the result coming from that function is wrong on retry. When I calculated the hash with hashlib.sha256() as well, I did see the correct hash appear in the logs, indicating that the actual payload is still the same but there is an issue with the recalculation. I haven't looked at this file-hash function yet.
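
Roughly, hashing a file-like object in chunks works like the sketch below (an illustration of the general idea, not boto's actual compute_hash); the point to note is that it starts reading from wherever the object's current read position happens to be.

import hashlib

def hash_filelike(fp, chunk_size=8192):
    # Hash the object in small chunks so the whole part never sits in memory.
    # The result depends on fp's read position at the moment we start reading.
    digest = hashlib.sha256()
    chunk = fp.read(chunk_size)
    while chunk:
        digest.update(chunk)
        chunk = fp.read(chunk_size)
    return digest.hexdigest()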

Yesterday I ended up trying to disable that alternative implementation and have the mmapped part handled directly by hashlib.sha256(), but I gave up after I started getting heaps of time-out errors from Amazon, and the testing branch I use for these experiments doesn't catch those. Maybe I'll try again tonight and see what happens.

wvmarle commented 11 years ago

Got it. That was a tricky one!

The issue: we're giving boto an mmap object, while boto is not written with that in mind. They fully expect a string, and while mmap of course can be addressed as a regular string, it also has file-like capabilities.

When the part is uploaded the first time around, the file seek pointer is at the beginning of the part, position 0.

When the part is retried, the file seek pointer is at anything but 0 (I haven't figured out how that happens, just that it happens; it seems part of the part is read, then the http connection fails, and as a result the pointer ends up somewhere between 0 and part_size). Of course this messes up the calculation of the hash in boto.utils.compute_hash (lines 853-877), where the file is read starting from the current position of the seek pointer, so the complete part is not read, and as a result the SHA256 hash is wrong.

This will have to be fixed in boto.connection._mexe (boto/connection.py, line 776 onwards) by resetting the seek pointer to 0. I am thinking something like:

if hasattr(request.body, 'seek'):
    request.body.seek(0)

(inserted in connection.py line 861)
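
A self-contained way to see the effect and the fix (illustrative only, using a dummy file rather than a real archive part): after a partial read the mmap's position is no longer 0, a file-style re-hash only covers the unread tail, and rewinding restores the correct hash.

import hashlib
import mmap
import os

MB = 1024 * 1024

# Dummy 1 MB part backed by a file (placeholder for a real archive part).
with open('part.tmp', 'wb') as f:
    f.write(os.urandom(MB))

f = open('part.tmp', 'rb')
m = mmap.mmap(f.fileno(), MB, access=mmap.ACCESS_READ)

full_hash = hashlib.sha256(m[:]).hexdigest()       # correct hash of the whole part
m.read(123456)                                     # simulate httplib reading part of the body before the connection fails
retry_hash = hashlib.sha256(m.read()).hexdigest()  # a file-style re-hash now only sees the unread tail
assert retry_hash != full_hash

m.seek(0)                                          # the proposed fix: rewind the file-like body before retrying
assert hashlib.sha256(m.read()).hexdigest() == full_hash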

We should certainly continue to use mmap objects, as otherwise each part has to be completely read into memory. Also, for calculating the hashes we should really consider calling the boto function for that, as now, using hashlib.sha256(), the complete part is still read into memory just to calculate the hash. The same goes of course for the tree hash function, though that's limited to 1 MB at a time.

But why waste memory when we can do without? If the memory is there, the OS will cache the file in memory for speedy access. If it's not there, it all takes longer as the file has to be read from disk time and again, but at least it still works.

So next steps: fork boto, create fix, wait for it to be accepted.

uskudnik commented 11 years ago

One word: Awesome! :D

Great work tracking this one down!

wvmarle commented 11 years ago

Well...

I was too fast there...

While I no longer get these response errors all the time (at least the hash gets calculated correctly now), I still get errors, and not a single part manages to get uploaded completely!

Mmap is supposed to be thread-safe, so reading different parts of the same file should be a no-brainer. Apparently it isn't that simple.

So back to the drawing board... Damn!

A possible solution could be to not mmap several parts of the same file at once, but instead to create a temporary file containing the part to upload, mmap that temporary file, and upload from that.
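
Roughly what that could look like (a sketch only; the function name and parameters are mine, not the project's):

import mmap
import tempfile

def mmap_part_via_tempfile(archive_path, offset, part_size, chunk=1024 * 1024):
    # Copy just this part into its own temporary file (in 1 MB chunks, so the
    # whole part never sits in memory) and mmap that file; each upload session
    # then works on a private file descriptor and a private mapping.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    with open(archive_path, 'rb') as src:
        src.seek(offset)
        remaining = part_size
        while remaining > 0:
            data = src.read(min(chunk, remaining))
            if not data:
                break
            tmp.write(data)
            remaining -= len(data)
    tmp.flush()
    part_map = mmap.mmap(tmp.fileno(), 0, access=mmap.ACCESS_READ)
    return part_map, tmp.name  # caller closes the map and removes tmp.name when done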

Also, boto just says "encountered error exception, reconnecting" on http errors, which is not very helpful...

wvmarle commented 11 years ago

The errors are [Errno 32] Broken pipe exceptions, with the occasional [Errno 104] Connection reset by peer. This is after fixing boto's error logging.

Somehow the reading of the mmapped file doesn't work as expected - hence the suggestion of creating tmp files.

Created some minor fixes for boto and sent them a pull request; I hope they will be accepted and incorporated soon. It also means we'll have to start using boto's development branch again, at least until the next release.

offlinehacker commented 11 years ago

Wow, I never expected there would be such problems with mmapping in Python, but I think mmapping should still be considered experimental. At the same time it would be awesome if you found out what causes the problems.

offlinehacker commented 11 years ago

What you need to do is open a new file for every process you create with multiprocessing; then it must work! This is the only correct way to implement this and also the way I did it; I will report my findings when the code is complete. Don't use the same file descriptor for multiple mmaps; Glacier does not need threads/processes that depend on each other (this only causes problems). Take a look at my code to see how it's supposed to be done.
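
For illustration, the kind of layout suggested here (the archive name and the hash print are placeholders for the real upload call): each worker process opens the archive itself and creates its own private mmap.

import hashlib
import mmap
import multiprocessing
import os

def upload_part(archive_path, offset, length):
    # Each worker opens its own file descriptor and creates its own mmap,
    # so no read position or mapping is shared between processes.
    with open(archive_path, 'rb') as f:
        part = mmap.mmap(f.fileno(), length, offset=offset, access=mmap.ACCESS_READ)
        print('%d %s' % (offset, hashlib.sha256(part[:]).hexdigest()))  # stand-in for the real upload
        part.close()

if __name__ == '__main__':
    archive = 'archive.bin'   # placeholder archive
    part_size = 1024 * 1024   # 1 MB parts keep offsets page-aligned for mmap
    if not os.path.exists(archive):
        with open(archive, 'wb') as f:
            f.write(os.urandom(4 * part_size))  # dummy data so the sketch runs
    size = os.path.getsize(archive)
    workers = [multiprocessing.Process(target=upload_part,
                                       args=(archive, off, min(part_size, size - off)))
               for off in range(0, size, part_size)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()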

wvmarle commented 11 years ago

This is my first ever foray into multiprocessing, so I don't know very well how to handle files and such; it's a bit of trial and error, and then, if it breaks, knowing where to look! I will try to change this and see what happens.

The issues that I found appear not to be mmap errors as such, just wrong handling on our side. Boto is clearly designed with just strings in mind, so when you start feeding it file-like objects things may very well break, like the seek pointer not being at the start of the file after a failed upload attempt.

Then I also looked at your code. Interesting work, very different from my approach.

You do a lot of error checking, I noticed. The messages will need to be improved though: too much developer language, which makes it hard for a user to understand what's actually going on and what they did wrong.

There is no exception loglevel. That should be self.logger.error or self.logger.critical.

GlacierWrapper: line 1036: the error needs to be raised. Lines 1032, 1033, 1042, 1043, and some more similar statements later: what are those supposed to do? I haven't seen such statements before.

And about retrying: Boto retries a lot already. The two errors that I ran into (broken pipes and connection resets) were caught by Boto and retried. The only retryable error that I have seen coming back to our script is a ResponseException due to a connection time-out. Everything else that can be retried is retried already. You just don't see much of that in the logs, as I was forced to switch off debug logging for Boto due to them logging the complete payload...

offlinehacker commented 11 years ago

Hi,

Well, you can handle errors, check error codes and re-raise them if they need to be handled by the application rather than the library.

GlacierWrapper is only the library; it must not do any output to the console except logging. Also, if you are uploading something you must operate asynchronously and make callbacks from the library to the application for status reports. That's how it works, but you can't see the callbacks in my code yet. At the same time, thanks to you, with your code and discoveries we will finally be able to make something that can be called stable.

As far as I have seen of boto, they haven't thought too much when implementing Glacier support. These errors with string-based buffers just should not exist. At the same time their design for Glacier is bad, or I just don't understand how they planned it.

There's also the option that we fork boto and reimplement Glacier the correct way, but I doubt I have that much time.

Logging.exception does exist: http://docs.python.org/2/library/logging.html#logging.Logger.exception

wvmarle commented 11 years ago

As far as I have seen of boto, they haven't thought too much when implementing Glacier support. These errors with string-based buffers just should not exist.

Should not exist - of course - but in their defence it's a rather obscure one, and only triggered by the combination of httplib reading part of the file and then crashing, followed by their rather smart implementation of taking a hash from a file without reading the whole thing in memory.

On the other hand it's surprising no one else ran into it, as it seems this routine _mexe is used by all their upload functions, so also for uploading to S3, for example. As soon as you give it an mmap object and some http error occurs, the retry will end in failure.

At the same time their design for Glacier is bad, or I just don't understand how they planned it.

They probably didn't plan much, as it was written within weeks of the launch of Glacier.

There's also the option that we fork boto and reimplement Glacier the correct way, but I doubt I have that much time.

Well actually you don't need that much time. Forking isn't that bad.

First of all: take the original code from glaciercorecalls; most direct calls to Amazon Glacier are there already. Boto does the exact same thing for those calls.

The only thing that we really don't have is the authentication bit, and the main files for that are boto/auth.py and boto/connection.py.

And the part where they create a pool of 1,024 http connections to use while talking to Amazon (I have no idea why they're doing it like that, it seems overly complex to me, but it's not just for Glacier, those routines are used for all their connections).

Reading their code is hard, really hard. It sometimes even passes self as an argument to functions in the same object, it calls functions left, right and centre, it returns functions from other objects/files, and, well, I'd almost call it "spaghetti code". Very hard to understand what's going on.

Logging.exception does exist http://docs.python.org/2/library/logging.html#logging.Logger.exception

Ah, right. I was looking at the level names only; this is an extra one, I didn't see that at first.

SitronNO commented 11 years ago

Just a simple question, and not a technical one: I am getting these errors regularly, but nothing stops/breaks and eventually the file is uploaded. But is the file complete at Amazon?

Retrieving the inventory gives me the same checksum as I have locally, but is that the checksum I supplied when uploading, or did Amazon calculate it itself once the file was completely uploaded?

The reason for my worry is this, from the log:

Feb 18 15:45:09 DEBUG    glacier-cmd Got to work on range 2415919104-2550136832
Feb 18 15:45:09 DEBUG    glacier-cmd Wrote 2.2 GB of 16.4 GB (13%). Average rate 438.29 KB/s, eta 01:10:34.
Feb 18 15:45:10 DEBUG    glacier-cmd Starting upload of part 2415919104-2550136831.
Feb 18 15:50:03 DEBUG    glacier-cmd Finished uploading part 2415919104-2550136831.
Feb 18 15:50:03 DEBUG    glacier-cmd Got to work on range 2550136832-2684354560
Feb 18 15:50:04 DEBUG    glacier-cmd Wrote 2.4 GB of 16.4 GB (14%). Average rate 438.66 KB/s, eta 01:10:02.
Feb 18 15:50:04 DEBUG    glacier-cmd Starting upload of part 2550136832-2684354559.
Feb 18 15:54:52 ERROR    glacier-cmd ERROR: Expected 204, got (403, code=InvalidSignatureException, message=The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

The Canonical String for this request should have been
'PUT
/-/vaults/Pictures/multipart-uploads/w_pVABjMJoRN-HCHW_nu9uzqiI6n6vCclFuwmUgLlNNy_x4Gw1-mQRW10CHfGlCHEx1AWSFY2rA74oYQrzvavAv4jU1z

host:glacier.eu-west-1.amazonaws.com
x-amz-content-sha256:3b09de7837d087912691d55a00cfe060c6549ab44aa84597dce9eff78ee2a696
x-amz-date:20130218T145120Z
x-amz-glacier-version:2012-06-01
x-amz-sha256-tree-hash:fb4821dc82d33cac293c3d65c928ad4f764a5a42efd3591e4221a4805b3724e3

host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash
3b09de7837d087912691d55a00cfe060c6549ab44aa84597dce9eff78ee2a696'

The String-to-Sign should have been
'AWS4-HMAC-SHA256
20130218T145120Z
20130218/eu-west-1/glacier/aws4_request
63e1c3501ea8359f573aeeedc4b553fe506511f7e6174dad93485356258caec5'
)
Feb 18 15:54:52 INFO     glacier-cmd str: Error while uploading data to Amazon Glacier.
Caused by: 1 exception
  UnexpectedHTTPResponseError: Expected 204, got (403, code=InvalidSignatureException, message=The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

  The Canonical String for this request should have been
  'PUT
  /-/vaults/Pictures/multipart-uploads/w_pVABjMJoRN-HCHW_nu9uzqiI6n6vCclFuwmUgLlNNy_x4Gw1-mQRW10CHfGlCHEx1AWSFY2rA74oYQrzvavAv4jU1z

  host:glacier.eu-west-1.amazonaws.com
  x-amz-content-sha256:3b09de7837d087912691d55a00cfe060c6549ab44aa84597dce9eff78ee2a696
  x-amz-date:20130218T145120Z
  x-amz-glacier-version:2012-06-01
  x-amz-sha256-tree-hash:fb4821dc82d33cac293c3d65c928ad4f764a5a42efd3591e4221a4805b3724e3

  host;x-amz-content-sha256;x-amz-date;x-amz-glacier-version;x-amz-sha256-tree-hash
  3b09de7837d087912691d55a00cfe060c6549ab44aa84597dce9eff78ee2a696'

  The String-to-Sign should have been
  'AWS4-HMAC-SHA256
  20130218T145120Z
  20130218/eu-west-1/glacier/aws4_request
  63e1c3501ea8359f573aeeedc4b553fe506511f7e6174dad93485356258caec5'
  )

Feb 18 15:54:52 DEBUG    glacier-cmd Traceback (most recent call last):
  File "/usr/local/bin/glacier-cmd", line 9, in <module>
    load_entry_point('glacier==0.2dev', 'console_scripts', 'glacier-cmd')()
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 811, in main
    args.func(args)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 147, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glacier.py", line 302, in upload
    args.resume, args.sessions)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 68, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 210, in glacier_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 68, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 263, in sdb_connect_wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 68, in wrapper
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/GlacierWrapper.py", line 1247, in upload
    p.start()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 104, in start
    self._popen = Popen(self)
  File "/usr/lib/python2.6/multiprocessing/forking.py", line 99, in __init__
    code = process_obj._bootstrap()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glaciercorecalls.py", line 112, in upload_part_process
    writer.write(part, start=start)
  File "/usr/local/lib/python2.6/dist-packages/glacier-0.2dev-py2.6.egg/glacier/glaciercorecalls.py", line 194, in write
    code=e.code)

Feb 18 15:54:52 DEBUG    glacier-cmd Unknown error code: InvalidSignatureException.
Feb 18 15:54:53 DEBUG    glacier-cmd Connecting to Amazon Glacier for worker process with
    aws_access_key A***********A
    aws_secret_key 7**********************u
    region eu-west-1
Feb 18 15:54:53 DEBUG    glacier-cmd Got to work on range 2684354560-2818572288
Feb 18 15:54:55 DEBUG    glacier-cmd Starting upload of part 2684354560-2818572287.
Feb 18 15:59:35 DEBUG    glacier-cmd Finished uploading part 2684354560-2818572287.
Feb 18 15:59:35 DEBUG    glacier-cmd Got to work on range 2818572288-2952790016
Feb 18 15:59:35 DEBUG    glacier-cmd Wrote 2.5 GB of 16.4 GB (15%). Average rate 419.50 KB/s, eta 01:39:55.

Or to shorten it down:

Feb 18 15:50:04 DEBUG    glacier-cmd Starting upload of part 2550136832-2684354559.
Feb 18 15:54:52 ERROR    glacier-cmd ERROR: Expected 204, got (403, code=InvalidSignatureException, message=The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.
Feb 18 15:54:53 DEBUG    glacier-cmd Got to work on range 2684354560-2818572288
Feb 18 15:54:55 DEBUG    glacier-cmd Starting upload of part 2684354560-2818572287.
Feb 18 15:59:35 DEBUG    glacier-cmd Finished uploading part 2684354560-2818572287.

As you can see, the log never mentions finishing the upload of part 2550136832-2684354559. Was it uploaded or not?