red-hat-storage / rhcephcompose

A tool to compose releases of Red Hat Ceph Storage for Ubuntu
MIT License
1 stars 1 forks source link

verify cached files' checksums #34

Closed ktdreyer closed 8 years ago

ktdreyer commented 8 years ago

Prior to this change, we would blindly trust the files in the cache were good to use without validating the cached files' contents. This could cause issues when rhcephcompose crashed and left the cache in an invalid state, or some network interruption or other disk issue caused our cache files to exist with invalid contents.

When parsing the data for each build artifact, store the artifacts' sha512 checksum values from the chacra server, and validate that the checksums match our cached files. If they do not match, exit with an error.

The purpose of this change is to make it safer to trust the download operations and cache integrity.

ahills commented 8 years ago

Nice! :ship:

Caveat: existing caches should be deleted after this change so that all cached downloads are verified.

ktdreyer commented 8 years ago

Thanks!

Caveat: existing caches should be deleted after this change so that all cached downloads are verified.

PackageArtifact performs cache lookup and population operations in the download() method, and at the end of this method, it'll now verify the checksums (the verify_checksum() method), so existing caches should be ok.

ahills commented 8 years ago

Oops, I misread download() and thought it was short-circuiting when it found the file at cache_dest. Do you think it's worth caching the checksums too?

ktdreyer commented 8 years ago

Yeah, that docstring could be clearer there!

It's a little slower to dynamically query chacra's API for the checksums on each run, but I think the safety benefit of using fresh checksum data outweighs the performance hit of those extra HTTP GET requests for the JSON. There's a lot we could parallelize to improve performance elsewhere - it seems like most of the time is spent after we're entirely done with the network, and we're just chugging through the reprepro operations (serially :cry: ).

I've also filed a ticket to ensure we verify checksums after upload, too: https://github.com/ceph/chacractl/issues/23

ktdreyer commented 8 years ago

@ahills do you have rights to merge this now?

ahills commented 8 years ago

Yes; do you want a plain or squash merge?

ktdreyer commented 8 years ago

A plain merge is fine, thanks.

I ran an anecdotal test with this change for RHCS 2, and I couldn't measure the time this adds, because all the reprepro operations can vary by a minute or more. This branch came out ahead by two seconds, welp