Closed ktdreyer closed 8 years ago
Nice! :ship:
Caveat: existing caches should be deleted after this change so that all cached downloads are verified.
Thanks!
Caveat: existing caches should be deleted after this change so that all cached downloads are verified.
PackageArtifact
performs cache lookup and population operations in the download()
method, and at the end of this method, it'll now verify the checksums (the verify_checksum()
method), so existing caches should be ok.
Oops, I misread download()
and thought it was short-circuiting when it found the file at cache_dest
. Do you think it's worth caching the checksums too?
Yeah, that docstring could be clearer there!
It's a little slower to dynamically query chacra's API for the checksums on each run, but I think the safety benefit of using fresh checksum data outweighs the performance hit of those extra HTTP GET requests for the JSON. There's a lot we could parallelize to improve performance elsewhere - it seems like most of the time is spent after we're entirely done with the network, and we're just chugging through the reprepro operations (serially :cry: ).
I've also filed a ticket to ensure we verify checksums after upload, too: https://github.com/ceph/chacractl/issues/23
@ahills do you have rights to merge this now?
Yes; do you want a plain or squash merge?
A plain merge is fine, thanks.
I ran an anecdotal test with this change for RHCS 2, and I couldn't measure the time this adds, because all the reprepro operations can vary by a minute or more. This branch came out ahead by two seconds, welp
Prior to this change, we would blindly trust the files in the cache were good to use without validating the cached files' contents. This could cause issues when rhcephcompose crashed and left the cache in an invalid state, or some network interruption or other disk issue caused our cache files to exist with invalid contents.
When parsing the data for each build artifact, store the artifacts' sha512 checksum values from the chacra server, and validate that the checksums match our cached files. If they do not match, exit with an error.
The purpose of this change is to make it safer to trust the download operations and cache integrity.