Thanks for the report. I'd imagine this has resolved itself by now?
It's probably not the same issue we saw over the weekend. We have constant problems with our CDN invalidating our binary artifacts at different times, causing the .sha256 files to disagree with the files they correspond to. I believe what you are seeing are the effects of this bug.
It's super frustrating, but I've just taken another look at our CDN invalidation script and it looks right.
Even though I think this is an upstream problem, I'm going to leave this open for others to find.
After thinking about this further, there is only one scenario where we should be seeing these hash mismatches: comparing the manifest with its hash. The other hashes are all contained in the manifest itself and refer to files with unique names that don't require CDN invalidation.
It may be possible to tweak the manifest format further to avoid this problem. For example, the manifest itself could be a toml file prepended with its sha256, so the hash and the content always ship in the same file.
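A minimal sketch of that self-checking format, assuming the first line is the hex sha256 of everything after it (the `sha2` crate and the TOML snippet are only for illustration, not rustup's actual format):

```rust
// Self-checking manifest: line 1 is the hex sha256 of the rest of the file,
// so the hash and the manifest can never drift apart on the CDN.
use sha2::{Digest, Sha256};

fn split_and_verify(raw: &str) -> Result<&str, String> {
    // First line: claimed hex digest; remainder: the TOML manifest body.
    let (claimed, body) = raw
        .split_once('\n')
        .ok_or_else(|| "manifest too short".to_string())?;
    let digest = Sha256::digest(body.as_bytes());
    let actual: String = digest.iter().map(|b| format!("{:02x}", b)).collect();
    if actual == claimed.trim() {
        Ok(body)
    } else {
        Err(format!("hash mismatch: expected {claimed}, got {actual}"))
    }
}

fn main() {
    // Build an example manifest with its hash prepended, then verify it.
    let body = "[pkg.rust]\nversion = \"1.22.0\"\n";
    let digest = Sha256::digest(body.as_bytes());
    let hex: String = digest.iter().map(|b| format!("{:02x}", b)).collect();
    let manifest = format!("{hex}\n{body}");
    assert!(split_and_verify(&manifest).is_ok());
    println!("self-checking manifest verifies");
}
```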
We could also just not validate the manifest at all and let errors surface during parsing and downloading of the binaries.
Yes, it resolved something like 20 minutes after I filed it.
I'm considering just not checking either the hash of the manifest or the hash of the self update, the two problematic hashes.
I don't know what validating these two hashes accomplishes. As a checksum they're redundant, since the HTTPS transfer already guarantees the bits reach us intact.
Heh. The hashes here let you check whether updates exist before downloading, an important function. I'm working on other workarounds.
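For reference, a rough sketch of what that pre-download check looks like, reusing the channel URL and update-hash path that appear in the verbose log further down; `reqwest` (blocking feature) stands in for the HTTP client, and this is not rustup's actual code:

```rust
// Fetch the tiny remote .sha256 file and compare it with the hash recorded for
// the currently installed toolchain; only on mismatch is the large artifact
// downloaded at all.
use std::fs;
use std::path::Path;

fn update_available(
    remote_sha_url: &str,
    local_hash_file: &Path,
) -> Result<bool, Box<dyn std::error::Error>> {
    // The .sha256 file is a few dozen bytes: "<hex digest>  <filename>".
    let remote = reqwest::blocking::get(remote_sha_url)?.text()?;
    let remote = remote.split_whitespace().next().unwrap_or("").to_string();

    // Compare with the hash stored at the last successful install/update.
    match fs::read_to_string(local_hash_file) {
        Ok(local) if local.trim() == remote => Ok(false), // up to date: skip the big download
        _ => Ok(true), // no recorded hash, or it differs: fetch the manifest
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let needed = update_available(
        "https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256",
        Path::new("/home/will/.rustup/update-hashes/stable-x86_64-unknown-linux-gnu"),
    )?;
    println!("update needed: {needed}");
    Ok(())
}
```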
@brson what about using HTTP ETags for this? They were designed for precisely this purpose.
@est31 From a quick googling I don't know that I can use etags to fix the syncing from S3 to CloudFront. Do you know how to do it?
Yes, it's possible we could use etags instead of the sha for checking for updates.
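A sketch of the client side of that ETag approach, again with `reqwest` as a stand-in HTTP client; whether S3 and CloudFront serve stable, matching ETags across the sync is the open question above:

```rust
// Conditional GET: send the ETag we saw last time as If-None-Match.
// A 304 response means nothing changed; otherwise we get the new body plus a
// new ETag to remember for next time.
use reqwest::blocking::Client;
use reqwest::StatusCode;

fn check_with_etag(
    url: &str,
    last_etag: Option<&str>,
) -> Result<Option<(String, String)>, Box<dyn std::error::Error>> {
    let client = Client::new();
    let mut req = client.get(url);
    if let Some(etag) = last_etag {
        req = req.header("If-None-Match", etag);
    }
    let resp = req.send()?;

    if resp.status() == StatusCode::NOT_MODIFIED {
        return Ok(None); // nothing changed since last_etag
    }
    let etag = resp
        .headers()
        .get("ETag")
        .and_then(|v| v.to_str().ok())
        .unwrap_or_default()
        .to_string();
    let body = resp.text()?;
    Ok(Some((etag, body))) // new content plus the ETag to store
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://static.rust-lang.org/dist/channel-rust-stable.toml";
    match check_with_etag(url, None)? {
        Some((etag, body)) => println!("fetched {} bytes, etag {etag}", body.len()),
        None => println!("not modified"),
    }
    Ok(())
}
```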
Happening to me with the nightly toolchain
With some changes to the data layout we should be able to get rid of the drift. The idea is that each release's binaries are uploaded to a unique directory (the Rust and rustup archives already partially accomplish this), and there is a single latest directory which is just a symlink to the newest archive directory. That way there is one atomic update (the symlink update) that changes both the artifact and the hash.
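On a plain Unix filesystem that atomic flip is just a symlink swap; a minimal sketch (the paths and archive date are made up):

```rust
// Create the new symlink under a temporary name, then rename() it over the old
// one. rename(2) is atomic, so readers always see either the old or the new
// archive directory, never a half-updated state.
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;

fn point_latest_at(archive_dir: &str, latest_link: &Path) -> std::io::Result<()> {
    let tmp = latest_link.with_extension("tmp");
    let _ = fs::remove_file(&tmp); // clean up any leftover from a previous crash
    symlink(archive_dir, &tmp)?;   // latest.tmp -> archive/2017-11-22
    fs::rename(&tmp, latest_link)  // atomically replace the "latest" symlink
}

fn main() -> std::io::Result<()> {
    point_latest_at("archive/2017-11-22", Path::new("latest"))
}
```

S3 and CloudFront don't have real symlinks, which is what the origin-path trick discussed below is for.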
Concretely, we have to change the scheme used in two places: the rustup bins and the rust manifests. For rustup we'll need to add an s.rlo/rustup/latest directory, which symlinks to the existing s.rlo/rustup/archive/$latest. rustup will need to change to download self-updates from the new location.
For rust, we'll need to add an s.rlo/dist/latest-nightly directory (and one for each channel) that again symlinks to the existing archives, and rustup will need to be modified to use it.
To make this work on CloudFront we'll need to set up multiple origins. Some IRC conversation:
15:12 < eternaleye> Aha!
15:12 < eternaleye> https://www.linkedin.com/pulse/how-use-amazon-cloudfront-application-router-chris-iona
15:13 < eternaleye> Looks like you can do two origins (one for the whole thing, one pointing to the actual value of the current latest dir), and use a
behavior that maps /latest to the latter
15:13 < eternaleye> At which point all that remains is doing that programmatically
15:13 < eternaleye> (changing the path part of the second origin)
15:14 < eternaleye> Which looks like it can be done via http://docs.aws.amazon.com/AmazonCloudFront/latest/APIReference/DistributionConfigDatatype.html
15:16 < eternaleye> brson: Could that work? Buildboxes upload to S3, then prod a CloudFront API to switch the /latest pointer?
15:18 < eternaleye> Basically, once the S3 uploads are done, they'd GET the DistributionConfig, find the <Origin> with the correct ID, update the
<OriginPath>, and PUT it back.
cc @eternaleye
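Roughly, the programmatic piece eternaleye describes (GET the DistributionConfig, rewrite the OriginPath, PUT it back with If-Match) could look like the sketch below. Everything in it is illustrative: the distribution id, API version and archive path are placeholders, the XML is edited with naive string surgery rather than matched by origin Id, and the SigV4 request signing CloudFront actually requires is omitted; in practice you'd use an AWS SDK or the aws cloudfront CLI.

```rust
// GET the DistributionConfig, rewrite <OriginPath> for the /latest origin,
// PUT the config back. Unsigned requests will be rejected by AWS; this only
// shows the shape of the flow.
use reqwest::blocking::Client;

fn repoint_latest(new_archive_path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let dist_id = "EDFDVBD6EXAMPLE"; // placeholder distribution id
    let api_version = "2020-05-31";  // check against the current CloudFront API version
    let url = format!(
        "https://cloudfront.amazonaws.com/{api_version}/distribution/{dist_id}/config"
    );
    let client = Client::new();

    // 1. GET the current DistributionConfig; remember the ETag for the later PUT.
    let resp = client.get(&url).send()?.error_for_status()?;
    let etag = resp
        .headers()
        .get("ETag")
        .and_then(|v| v.to_str().ok())
        .unwrap_or_default()
        .to_string();
    let body = resp.text()?;

    // 2. Rewrite the first <OriginPath> (a real version would parse the XML and
    //    match the origin by its Id).
    let start = body.find("<OriginPath>").ok_or("no <OriginPath> found")?;
    let end = body.find("</OriginPath>").ok_or("no </OriginPath> found")? + "</OriginPath>".len();
    let updated = format!(
        "{}<OriginPath>{}</OriginPath>{}",
        &body[..start],
        new_archive_path,
        &body[end..]
    );

    // 3. PUT the modified config back; If-Match must carry the ETag from step 1.
    client
        .put(&url)
        .header("If-Match", etag)
        .header("Content-Type", "text/xml")
        .body(updated)
        .send()?
        .error_for_status()?;
    Ok(())
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Example archive path only; substitute the directory the buildboxes just uploaded.
    repoint_latest("/rustup/archive/1.9.0")
}
```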
Happening to me now with 1.22. Worse: when it fails rustup insists that it installed something, but none of the right directories exist:
$ rustc
error: toolchain 'stable' is not installed
info: caused by: not a directory: '/home/will/.rustup/toolchains/stable-x86_64-unknown-linux-gnu'
$ rustup -v update stable
verbose: read metadata version: '12'
verbose: installing toolchain 'stable-x86_64-unknown-linux-gnu'
verbose: toolchain directory: '/home/will/.rustup/toolchains/stable-x86_64-unknown-linux-gnu'
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
verbose: creating temp file: /home/will/.rustup/tmp/44bw8qm28iloqbla_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with curl
verbose: deleted temp file: /home/will/.rustup/tmp/44bw8qm28iloqbla_file
verbose: no update hash at: '/home/will/.rustup/update-hashes/stable-x86_64-unknown-linux-gnu'
verbose: creating temp file: /home/will/.rustup/tmp/n9r4ytoz254b0pqt_file.toml
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml'
verbose: downloading with curl
236.7 KiB / 236.7 KiB (100 %) 134.3 KiB/s ETA: 0 s
verbose: deleted temp file: /home/will/.rustup/tmp/n9r4ytoz254b0pqt_file.toml
info: update not yet available, sorry! try again later
verbose: toolchain is already up to date
stable-x86_64-unknown-linux-gnu unchanged - (toolchain not installed)
I'm getting this today too
@slondr if this persists, please let us know, but this might be drift on the CDNs
Closing this now as this is not really a Rustup issue, but an issue with the release server.
If you are encountering another wave of failures with the official release server, please report under https://rust-lang.zulipchat.com/#narrow/stream/t-infra as suggested in https://github.com/rust-lang/rustup/issues/3390#issuecomment-1635611300.
I don't know if this is related to previous bugs with the same name, so I'm filing a new one.