seung-lab / cloud-files

Threaded Python and CLI client library for AWS S3, Google Cloud Storage (GCS), in-memory, and the local filesystem.
BSD 3-Clause "New" or "Revised" License
38 stars 8 forks source link

feat(cli): adds verify command #65

Closed william-silversmith closed 3 years ago

william-silversmith commented 3 years ago

Tool to help check if transfers completed successfully.

Compare two directories list of files to see if the content length matches, then ETag, then Content-Md5. Performs comparison based on matching filenames. By default, returns an error if the directories don't match, but you can also force it to do a comparison of the intersection of the two directories.

Example:

cloudfiles verify gs://bucket/files s3://bucket/files --verbose

The filesystem doesn't store ETags or md5 hashes, so we add an option to compute it on the fly:

cloudfiles verify ./mydata gs://bucket/copy-of-my-data/ --md5

CloudFiles checks md5 during downloads and uploads, but this feature gives users more confidence when they can see everything matches.