uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Inventory inconsistency #44

Closed wvmarle closed 11 years ago

wvmarle commented 11 years ago

This is probably me not understanding Glacier well enough; if so, it is definitely something for the documentation here.

Yesterday (some 16 hours ago now) I uploaded an 8.6 GB file to Glacier and, using glacier-cmd, deleted a few other archives that had been dumped there as test uploads. My Glacier console currently shows:

Vault details as of the last inventory update:
Size: 8.50 GiB
Number of Archives: 5

This sounds about right, other than that the total file size is less than I expect: it should be about 8.7-8.8 GB.

But when I run glacier-cmd inventory I get rather different results: a total of 8 files, some 170 MB, and it still shows files that I'm sure I deleted yesterday.

A second interesting thing to note: yesterday I also made some inventory calls, and hours later I started to get SNS e-mails telling me the inventory request was successful.

glacier-cmd listjobs gives me a list of several InventoryRetrieval jobs that are all Succeeded.

gburca commented 11 years ago

Spend some time with the Amazon Glacier docs. The way the service works is somewhat confusing at first. glacier-cmd inventory either requests a new inventory list or retrieves one that's already available for download (the way this command works could be improved). glacier-cmd describevault will tell you (in the LastInventory column) when the inventory was last updated on the AWS side. If your inventory request succeeded before the LastInventory time, you're retrieving an old inventory. Look at the --force option to force a new inventory job in that case. Any vault changes you made after LastInventory will not be shown until AWS updates the inventory. You have no control over that; you just have to wait.

To put it a little differently, your vault contents are inventoried roughly once a day by AWS. Creating an inventory job does not cause AWS to update its inventory. It just asks AWS to provide you with the results of its last inventory.
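
The staleness rule described above (a job that succeeded before the LastInventory time returns an old inventory) can be sketched in a few lines of Python. This is only an illustration, not how glacier-cmd does it: the helper names `parse_ts`, `fresh_inventory_jobs` and `needs_new_job` are hypothetical, and the job dictionaries are assumed to carry the `Action`, `StatusCode` and `CompletionDate` fields that the Glacier job listing returns, with timestamps in ISO 8601 UTC form.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an ISO 8601 UTC timestamp such as '2012-10-05T03:00:00.000Z'."""
    # Assumption: timestamps end in 'Z', with or without fractional seconds.
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: %r" % (value,))


def fresh_inventory_jobs(jobs, last_inventory):
    """Return succeeded inventory jobs that completed at or after last_inventory."""
    cutoff = parse_ts(last_inventory)
    return [
        job for job in jobs
        if job["Action"] == "InventoryRetrieval"
        and job["StatusCode"] == "Succeeded"
        and parse_ts(job["CompletionDate"]) >= cutoff
    ]


def needs_new_job(jobs, last_inventory):
    """True when every finished inventory job predates the last AWS inventory."""
    return not fresh_inventory_jobs(jobs, last_inventory)


# Hypothetical data: one job finished before AWS refreshed the vault inventory.
jobs = [
    {"Action": "InventoryRetrieval", "StatusCode": "Succeeded",
     "CompletionDate": "2012-10-04T20:00:00.000Z"},
]
print(needs_new_job(jobs, "2012-10-05T03:00:00.000Z"))
```

If `needs_new_job` returns True, the available job output describes the vault as it was before the latest AWS inventory, which is the case where forcing a new inventory job makes sense. Even a fresh job only reflects the vault as of LastInventory, so changes made after that time still will not appear.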

wvmarle commented 11 years ago

I know the inventory is updated once a day, but I did not know about being able to request an inventory update. This update takes hours (I wonder why: I would expect this to be a database on Amazon's side, not something that requires them to start reading tapes). Indeed, the way glacier-cmd works needs serious improvement. When requesting a new job, it returns to the command line without a single message, as if it had failed.