uskudnik / amazon-glacier-cmd-interface

Command line interface for Amazon Glacier
MIT License

Tree hash, Amazon style #53

Closed wvmarle closed 11 years ago

wvmarle commented 11 years ago

If I understand the code correctly, the hash shown to the user at the end of an upload is the one Amazon returns for the complete file; there is no way to get the tree hash of the local file. So I hacked together a little script to calculate it. It is standalone for now: I'm not sure where to incorporate it, and I plan to keep working on glacier_lib, since the intention is to move everything there ASAP. The script takes a file name on the command line, reads the file one 1 MB chunk at a time, hashes each chunk, and finally combines the chunk hashes into the tree hash.

#!/usr/bin/env python

from glaciercorecalls import tree_hash
from glaciercorecalls import chunk_hashes
from glaciercorecalls import bytes_to_hex
import sys
import math
import os.path

def main():

    # Calculate hash, Amazon style.
    filename = sys.argv[1]
    chunk = 1024*1024
    chunk_count = int(math.ceil(os.path.getsize(filename)/float(chunk)))

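    # Open in binary mode so the hashed bytes match what is uploaded,
    # and close the file as soon as all chunks have been read.
    with open(filename, 'rb') as f:
        tree_hashes = [chunk_hashes(f.read(chunk))[0] for i in range(chunk_count)]
    print('hash of %s: %s' % (filename, bytes_to_hex(tree_hash(tree_hashes))))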

if __name__ == "__main__":
    sys.exit(main())
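
For anyone without the glaciercorecalls helpers at hand, the same calculation can be sketched with only hashlib. This is a rough Python 3 illustration, not the code that went into the repository: the names `chunk_digests`, `tree_hash_digests` and `file_tree_hash` are made up for this example, and treating an empty file as a single empty chunk is an assumption. It hashes each 1 MiB chunk with SHA-256, then hashes adjacent digests pairwise, carrying an unpaired digest up unchanged, until one digest remains.

```python
import hashlib

MIB = 1024 * 1024  # tree hash leaves are 1 MiB chunks


def chunk_digests(fileobj, chunk_size=MIB):
    """Return the SHA-256 digest of each chunk_size block read from fileobj."""
    digests = []
    while True:
        block = fileobj.read(chunk_size)
        if not block:
            break
        digests.append(hashlib.sha256(block).digest())
    # Assumption: an empty input counts as one empty chunk.
    return digests or [hashlib.sha256(b"").digest()]


def tree_hash_digests(digests):
    """Combine leaf digests pairwise until a single root digest remains."""
    level = list(digests)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                next_level.append(hashlib.sha256(pair[0] + pair[1]).digest())
            else:
                # An unpaired digest moves up to the next level unchanged.
                next_level.append(pair[0])
        level = next_level
    return level[0]


def file_tree_hash(path):
    """Return the hex tree hash of the file at path."""
    with open(path, "rb") as f:
        return tree_hash_digests(chunk_digests(f)).hex()
```

Calling `file_tree_hash("some-archive.tar")` (a placeholder file name) returns the hex string; for a file of 1 MiB or less it is simply the SHA-256 of the file contents.
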
wvmarle commented 11 years ago

This one is now in the main branch; closing.