Thanks for the report. I'd imagine this has resolved itself by now?
It's probably not the same issue we saw over the weekend. We have constant problems with our CDN invalidating our binary artifacts at different times, causing the .sha256 files to disagree with the files they correspond to. I believe what you are seeing are the effects of this bug.
It's super frustrating, but I've just taken another look at our CDN invalidation script and it looks right.
Even though I think this is an upstream problem, I'm going to leave this open for others to find.
After thinking about this further, there is only one scenario where we should be seeing these hash mismatches: comparing the manifest with its hash. The other hashes are all contained in the manifest itself and refer to files with unique names that don't require CDN invalidation.
It may be possible to tweak the manifest format further to avoid this problem. For example, the manifest itself could be a toml file prepended with its sha256, so the hash and the content always ship in the same file.
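A minimal sketch of that self-checking format, assuming the first line is the hex sha256 of everything after it (the `sha2` crate and the TOML snippet are only for illustration, not rustup's actual format):

```rust
// Self-checking manifest: line 1 is the hex sha256 of the rest of the file,
// so the hash and the manifest can never drift apart on the CDN.
use sha2::{Digest, Sha256};

fn split_and_verify(raw: &str) -> Result<&str, String> {
    // First line: claimed hex digest; remainder: the TOML manifest body.
    let (claimed, body) = raw
        .split_once('\n')
        .ok_or_else(|| "manifest too short".to_string())?;
    let digest = Sha256::digest(body.as_bytes());
    let actual: String = digest.iter().map(|b| format!("{:02x}", b)).collect();
    if actual == claimed.trim() {
        Ok(body)
    } else {
        Err(format!("hash mismatch: expected {claimed}, got {actual}"))
    }
}

fn main() {
    // Build an example manifest with its hash prepended, then verify it.
    let body = "[pkg.rust]\nversion = \"1.22.0\"\n";
    let digest = Sha256::digest(body.as_bytes());
    let hex: String = digest.iter().map(|b| format!("{:02x}", b)).collect();
    let manifest = format!("{hex}\n{body}");
    assert!(split_and_verify(&manifest).is_ok());
    println!("self-checking manifest verifies");
}
```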
We could also just not validate the manifest at all and let errors surface during parsing and downloading of the binaries.
Yes, it resolved something like 20 minutes after I filed it.
I'm considering just not checking either the hash of the manifest or the hash of the self update, the two problematic hashes.
I don't know what validating these two hashes accomplishes. As a checksum they're redundant, since the HTTPS transfer already guarantees the bits reach us intact.
Heh. The hashes here let you check whether updates exist before downloading, an important function. I'm working on other workarounds.
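For reference, a rough sketch of what that pre-download check looks like, reusing the channel URL and update-hash path that appear in the verbose log further down; `reqwest` (blocking feature) stands in for the HTTP client, and this is not rustup's actual code:

```rust
// Fetch the tiny remote .sha256 file and compare it with the hash recorded for
// the currently installed toolchain; only on mismatch is the large artifact
// downloaded at all.
use std::fs;
use std::path::Path;

fn update_available(
    remote_sha_url: &str,
    local_hash_file: &Path,
) -> Result<bool, Box<dyn std::error::Error>> {
    // The .sha256 file is a few dozen bytes: "<hex digest>  <filename>".
    let remote = reqwest::blocking::get(remote_sha_url)?.text()?;
    let remote = remote.split_whitespace().next().unwrap_or("").to_string();

    // Compare with the hash stored at the last successful install/update.
    match fs::read_to_string(local_hash_file) {
        Ok(local) if local.trim() == remote => Ok(false), // up to date: skip the big download
        _ => Ok(true), // no recorded hash, or it differs: fetch the manifest
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let needed = update_available(
        "https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256",
        Path::new("/home/will/.rustup/update-hashes/stable-x86_64-unknown-linux-gnu"),
    )?;
    println!("update needed: {needed}");
    Ok(())
}
```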
@brson what about using HTTP ETags for this? They were designed for precisely this purpose.
@est31 From a quick googling I don't know that I can use etags to fix the syncing from S3 to CloudFront. Do you know how to do it?
Yes, it's possible we could use etags instead of the sha for checking for updates.
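A sketch of the client side of that ETag approach, again with `reqwest` as a stand-in HTTP client; whether S3 and CloudFront serve stable, matching ETags across the sync is the open question above:

```rust
// Conditional GET: send the ETag we saw last time as If-None-Match.
// A 304 response means nothing changed; otherwise we get the new body plus a
// new ETag to remember for next time.
use reqwest::blocking::Client;
use reqwest::StatusCode;

fn check_with_etag(
    url: &str,
    last_etag: Option<&str>,
) -> Result<Option<(String, String)>, Box<dyn std::error::Error>> {
    let client = Client::new();
    let mut req = client.get(url);
    if let Some(etag) = last_etag {
        req = req.header("If-None-Match", etag);
    }
    let resp = req.send()?;

    if resp.status() == StatusCode::NOT_MODIFIED {
        return Ok(None); // nothing changed since last_etag
    }
    let etag = resp
        .headers()
        .get("ETag")
        .and_then(|v| v.to_str().ok())
        .unwrap_or_default()
        .to_string();
    let body = resp.text()?;
    Ok(Some((etag, body))) // new content plus the ETag to store
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://static.rust-lang.org/dist/channel-rust-stable.toml";
    match check_with_etag(url, None)? {
        Some((etag, body)) => println!("fetched {} bytes, etag {etag}", body.len()),
        None => println!("not modified"),
    }
    Ok(())
}
```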
Happening to me with the nightly toolchain
With some changes to the data layout we should be able to get rid of the drift. The idea is that each release's binaries are uploaded to a unique directory (the Rust and rustup archives already partially accomplish this), and there is a single latest directory which is just a symlink to the newest archive directory. That way there is one atomic update (the symlink update) that changes both the artifact and the hash.
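On a plain Unix filesystem that atomic flip is just a symlink swap; a minimal sketch (the paths and archive date are made up):

```rust
// Create the new symlink under a temporary name, then rename() it over the old
// one. rename(2) is atomic, so readers always see either the old or the new
// archive directory, never a half-updated state.
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;

fn point_latest_at(archive_dir: &str, latest_link: &Path) -> std::io::Result<()> {
    let tmp = latest_link.with_extension("tmp");
    let _ = fs::remove_file(&tmp); // clean up any leftover from a previous crash
    symlink(archive_dir, &tmp)?;   // latest.tmp -> archive/2017-11-22
    fs::rename(&tmp, latest_link)  // atomically replace the "latest" symlink
}

fn main() -> std::io::Result<()> {
    point_latest_at("archive/2017-11-22", Path::new("latest"))
}
```

S3 and CloudFront don't have real symlinks, which is what the origin-path trick discussed below is for.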
Concretely, we have to change the scheme used in two places: the rustup bins and the rust manifests. For rustup we'll need to add an s.rlo/rustup/latest directory, which symlinks to the existing s.rlo/rustup/archive/$latest. rustup will need to change to download self-updates from the new location.
For rust, we'll need to add an s.rlo/dist/latest-nightly directory (and one for each channel) that again symlinks to the existing archives, and rustup will need to be modified to use it.
To make this work on CloudFront we'll need to set up multiple origins. Some IRC conversation:
15:12 < eternaleye> Aha!
15:12 < eternaleye> https://www.linkedin.com/pulse/how-use-amazon-cloudfront-application-router-chris-iona
15:13 < eternaleye> Looks like you can do two origins (one for the whole thing, one pointing to the actual value of the current latest dir), and use a
behavior that maps /latest to the latter
15:13 < eternaleye> At which point all that remains is doing that programmatically
15:13 < eternaleye> (changing the path part of the second origin)
15:14 < eternaleye> Which looks like it can be done via http://docs.aws.amazon.com/AmazonCloudFront/latest/APIReference/DistributionConfigDatatype.html
15:16 < eternaleye> brson: Could that work? Buildboxes upload to S3, then prod a CloudFront API to switch the /latest pointer?
15:18 < eternaleye> Basically, once the S3 uploads are done, they'd GET the DistributionConfig, find the <Origin> with the correct ID, update the
<OriginPath>, and PUT it back.
cc @eternaleye
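Roughly, the programmatic piece eternaleye describes (GET the DistributionConfig, rewrite the OriginPath, PUT it back with If-Match) could look like the sketch below. Everything in it is illustrative: the distribution id, API version and archive path are placeholders, the XML is edited with naive string surgery rather than matched by origin Id, and the SigV4 request signing CloudFront actually requires is omitted; in practice you'd use an AWS SDK or the aws cloudfront CLI.

```rust
// GET the DistributionConfig, rewrite <OriginPath> for the /latest origin,
// PUT the config back. Unsigned requests will be rejected by AWS; this only
// shows the shape of the flow.
use reqwest::blocking::Client;

fn repoint_latest(new_archive_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let dist_id = "EDFDVBD6EXAMPLE"; // placeholder distribution id
    let api_version = "2020-05-31";  // check against the current CloudFront API version
    let url = format!(
        "https://cloudfront.amazonaws.com/{api_version}/distribution/{dist_id}/config"
    );
    let client = Client::new();

    // 1. GET the current DistributionConfig; remember the ETag for the later PUT.
    let resp = client.get(&url).send()?.error_for_status()?;
    let etag = resp
        .headers()
        .get("ETag")
        .and_then(|v| v.to_str().ok())
        .unwrap_or_default()
        .to_string();
    let body = resp.text()?;

    // 2. Rewrite the first <OriginPath> (a real version would parse the XML and
    //    match the origin by its Id).
    let start = body.find("<OriginPath>").ok_or("no <OriginPath> found")?;
    let end = body.find("</OriginPath>").ok_or("no </OriginPath> found")? + "</OriginPath>".len();
    let updated = format!(
        "{}<OriginPath>{}</OriginPath>{}",
        &body[..start],
        new_archive_path,
        &body[end..]
    );

    // 3. PUT the modified config back; If-Match must carry the ETag from step 1.
    client
        .put(&url)
        .header("If-Match", etag)
        .header("Content-Type", "text/xml")
        .body(updated)
        .send()?
        .error_for_status()?;
    Ok(())
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Example archive path only; substitute the directory the buildboxes just uploaded.
    repoint_latest("/rustup/archive/1.9.0")
}
```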
Happening to me now with 1.22. Worse: when it fails rustup insists that it installed something, but none of the right directories exist:
$ rustc
error: toolchain 'stable' is not installed
info: caused by: not a directory: '/home/will/.rustup/toolchains/stable-x86_64-unknown-linux-gnu'
$ rustup -v update stable
verbose: read metadata version: '12'
verbose: installing toolchain 'stable-x86_64-unknown-linux-gnu'
verbose: toolchain directory: '/home/will/.rustup/toolchains/stable-x86_64-unknown-linux-gnu'
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
verbose: creating temp file: /home/will/.rustup/tmp/44bw8qm28iloqbla_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with curl
verbose: deleted temp file: /home/will/.rustup/tmp/44bw8qm28iloqbla_file
verbose: no update hash at: '/home/will/.rustup/update-hashes/stable-x86_64-unknown-linux-gnu'
verbose: creating temp file: /home/will/.rustup/tmp/n9r4ytoz254b0pqt_file.toml
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml'
verbose: downloading with curl
236.7 KiB / 236.7 KiB (100 %) 134.3 KiB/s ETA: 0 s
verbose: deleted temp file: /home/will/.rustup/tmp/n9r4ytoz254b0pqt_file.toml
info: update not yet available, sorry! try again later
verbose: toolchain is already up to date
stable-x86_64-unknown-linux-gnu unchanged - (toolchain not installed)
I'm getting this today too
@slondr if this persists, please let us know, but this might be drift on the CDNs
Closing this now as this is not really a Rustup issue, but an issue with the release server.
If you are encountering another wave of failures with the official release server, please report under https://rust-lang.zulipchat.com/#narrow/stream/t-infra as suggested in https://github.com/rust-lang/rustup/issues/3390#issuecomment-1635611300.
I don't know if this is related to previous bugs with the same name, so I'm filing a new one.