ishitatsuyuki opened this issue 5 years ago
For the most part, the compression scheme selected is up to the tool which generates the compiler releases. As far as I know, however, the manifest format only allows for a single download file per component, so we wouldn't be able to choose compression methods for the user. I imagine that we use xz for compression ratio reasons.
@ishitatsuyuki do you have some profiling data showing that decompression is a bottleneck here? One thing we could very easily do if it is a bottleneck is move decompression to a dedicated thread.
Reference file: 183M rust-std-1.34.2-x86_64-pc-windows-gnu.tar
CPU: Intel(R) Core(TM) i7-6500U CPU
Compression | Compressed size | Download time | Decompression time |
---|---|---|---|
gz | 71M | 4.4s | 1.25s |
xz | 56M | 3.4s | 3.50s |
zstd (-19) | 59M | N/A | 0.34s |
You can clearly see that when the connection is fast, decompression time is a significant part of installation time (on Linux, where I/O isn't slowed down by antimalware scanning). Moving it to a dedicated thread doesn't make anything faster, since the CPU is the bottleneck.
I run an NVMe drive, so files can be written almost instantly, but rotating disks may take much longer.
The Wi-Fi connection used for the download test had an average throughput of 16.2 MB/s. A wired connection is even faster, around 80 MB/s, which resembles typical datacenter networking speed.
As a bonus, zstd decompression is faster than any other method listed here, while retaining a compression ratio similar to xz. Time to adopt.
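For what it's worth, timings like these can be reproduced with a few lines of Rust. This is only a sketch of the methodology, not the benchmark actually used above; it assumes the `flate2`, `xz2` and `zstd` crates as dependencies and uses placeholder file names:

```rust
// Sketch: time streaming decompression of each archive into io::sink().
// File names are placeholders; flate2, xz2 and zstd are assumed dependencies.
use std::fs::File;
use std::io::{self, Read};
use std::time::Instant;

fn time_decompress(label: &str, mut reader: impl Read) -> io::Result<()> {
    let start = Instant::now();
    let bytes = io::copy(&mut reader, &mut io::sink())?;
    println!("{}: {} bytes in {:?}", label, bytes, start.elapsed());
    Ok(())
}

fn main() -> io::Result<()> {
    time_decompress("gz", flate2::read::GzDecoder::new(File::open("rust-std.tar.gz")?))?;
    time_decompress("xz", xz2::read::XzDecoder::new(File::open("rust-std.tar.xz")?))?;
    time_decompress("zstd", zstd::stream::read::Decoder::new(File::open("rust-std.tar.zst")?)?)?;
    Ok(())
}
```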
Thank you for the profiling data.
So, let's break this into several aspects.
I have absolutely no attachment to the current compressor, but rustup is merely a consumer of the archives produced elsewhere in the ecosystem. So the adoption process is going to be:
Do you know if xz or zstd(-19) format files are already available? If not, I'm not sure where the relevant place to file a bug is. (@kinnison )
Re: would moving decompression to a thread help: it may, because it allows concurrency. Obviously the decompressor itself, if single-threaded, will not become faster, so that remains a lower bound on the install time; but rather than T = (decompression + other handling), the install time might be as low as T = (decompression). Even on a single-core machine with two hardware threads, there is usually room for improvement in this case.
We should obviously do the format-switching work, but I'll see about a cheap message-passing test with 4 MB buffers in a thread when I get a chance.
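To make the pipelining idea concrete, here is a rough sketch (not rustup's actual code) of decompressing on a dedicated thread and handing 4 MiB buffers to the installing thread over a bounded channel, so that decompression and the "other handling" overlap:

```rust
// Sketch of the "decompress on a dedicated thread" idea with 4 MiB buffers
// passed over a bounded channel; names and the channel depth are illustrative.
use std::io::{self, Read, Write};
use std::sync::mpsc;
use std::thread;

const CHUNK: usize = 4 * 1024 * 1024;

fn pipelined_copy(
    mut decompressor: impl Read + Send + 'static,
    mut sink: impl Write,
) -> io::Result<u64> {
    // Bounded channel: the decompressor can run at most a few chunks ahead.
    let (tx, rx) = mpsc::sync_channel::<io::Result<Vec<u8>>>(4);

    let producer = thread::spawn(move || loop {
        let mut buf = vec![0u8; CHUNK];
        match decompressor.read(&mut buf) {
            Ok(0) => break, // EOF
            Ok(n) => {
                buf.truncate(n);
                if tx.send(Ok(buf)).is_err() {
                    break; // consumer hung up
                }
            }
            Err(e) => {
                let _ = tx.send(Err(e));
                break;
            }
        }
    });

    let mut total = 0u64;
    for chunk in rx {
        let chunk = chunk?;
        sink.write_all(&chunk)?; // unpacking / "other handling" overlaps with decompression
        total += chunk.len() as u64;
    }
    producer.join().expect("decompression thread panicked");
    Ok(total)
}
```

The bounded channel caps memory use at a handful of chunks while still letting the two stages run concurrently.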
In terms of assessing formats, we have binaries, libraries, and docs; I don't know that we'll get consistent results across all types, so it may be worth testing across them all.
> Do you know if xz or zstd(-19) format files are already available?
The distribution manifest has two urls: no suffix (gz) and xz.
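For context, a per-target entry in the v2 manifest looks roughly like the following on the rustup side (field names recalled from the channel-rust-*.toml format and simplified, not verified against the current schema; the zstd fields are purely hypothetical additions):

```rust
// Simplified shape of a per-target entry in the v2 distribution manifest;
// not the exact rustup types, and the zst_* fields do not exist today.
#[allow(dead_code)]
struct TargetedPackage {
    available: bool,
    url: Option<String>,     // .tar.gz
    hash: Option<String>,
    xz_url: Option<String>,  // .tar.xz
    xz_hash: Option<String>,
    // Hypothetical, if zstd archives were ever published:
    // zst_url: Option<String>,
    // zst_hash: Option<String>,
}
```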
In order to proceed on this, the infra team want some comparisons.
Could you please gather:
The feeling from the infra team is that adding tarballs in another compression format is non-trivial, because it'll add around half a gigabyte of artifacts per day to our S3 bucket. So if we were to add zstd, we'd have to drop at least one of gz or xz, and we're not entirely sure how much relies on referring directly to those artifacts (i.e. not installing via rustup).
If you can gather that data, I can take the results to the infra team for further discussion.
Tested release: nightly-x86_64-unknown-linux-gnu 2019-05-24
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz * 4 (virtualized)
Note: only single-threaded implementations were tested. pixz did not seem to yield a speed increase, pzstd performed well but it has its own logic, and zstdmt did not support parallel decompression.
Size and decompression times:
Component | xz size | xz time (s) | zstd (-19) size | zstd time (s) |
---|---|---|---|---|
cargo | 4.6M | 0.48 | 5.1M | 0.073 |
llvm-tools | 556K | 0.068 | 612K | 0.018 |
miri | 896K | 0.103 | 1000K | 0.023 |
rust-analysis | 552K | 0.073 | 580K | 0.025 |
rust-docs | 12M | 1.54 | 12M | 0.294 |
rust-std | 62M | 4.25 | 66M | 0.495 |
rustc | 91M | 8.30 | 99M | 1.13 |
rustfmt | 2.7M | 0.301 | 3.0M | 0.058 |
sum | 172M | 15.12 | 185M | 2.13 |
rust (all-in-one) | 154M | 13.7 | 166M | 2.07 |
The difference between xz and zstd is roughly 13 MB and 13 s, which means that on networks faster than about 10 Mb/s, zstd will perform better overall.
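To spell out the break-even arithmetic (numbers taken straight from the table above, nothing rustup-specific):

```rust
// Back-of-the-envelope break-even bandwidth between xz and zstd.
fn main() {
    let extra_bytes = 13.0 * 1024.0 * 1024.0; // zstd archives are ~13 MB larger in total
    let time_saved = 15.12 - 2.13;            // seconds of decompression time saved
    let break_even_bps = extra_bytes * 8.0 / time_saved;
    // Any connection faster than this favours zstd end to end.
    println!("break-even ≈ {:.1} Mbit/s", break_even_bps / 1e6);
}
```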
xz, as mentioned above, doesn't seem worth parallelizing. We're already at the "best compression without ridiculously slow speed" setting (-6), which is the default, and I don't think there's anything to change here.
Thank you for these numbers. This implies we can expect roughly a 7 to 8 percent increase in the size of a release when moving from xz to zstd. Based on numbers I was given last night, a release (nightly, beta, stable) is about 25 gigabytes, so purely adding zstd would likely add around 10G to that.
I shall now take this back to the infra team for further consideration. Thank you for your efforts so far.
I believe further discussion was had in #2488
Describe the problem you are trying to solve
Decompression can be a bottleneck if the network is fast. XZ (LZMA) is surprisingly slow and can quickly dominate the time spent on installation.
For network connections faster than broadband, we should consider not using LZMA compression. This includes home fiber and datacenters, notably CI environments.
Describe the solution you'd like
An `auto` option that tries to detect network speed.
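Purely as an illustration, a rough sketch of what such an `auto` mode might look like; none of these names exist in rustup, and the 10 Mbit/s threshold is just the break-even figure from the benchmarks in this thread:

```rust
// Hypothetical sketch of an `auto` mode: pick the archive format based on a
// measured (or estimated) download throughput. Not an existing rustup API.
#[derive(Debug, PartialEq)]
enum ArchiveFormat {
    Xz,   // smallest download, slow to decompress
    Zstd, // slightly larger, much faster to decompress
}

fn choose_format(throughput_mbit_s: Option<f64>) -> ArchiveFormat {
    match throughput_mbit_s {
        // Fast link: the extra megabytes cost less than the xz CPU time.
        Some(t) if t >= 10.0 => ArchiveFormat::Zstd,
        // Slow link: minimising bytes on the wire wins.
        Some(_) => ArchiveFormat::Xz,
        // Unknown (e.g. first download of the session): stay conservative.
        None => ArchiveFormat::Xz,
    }
}

fn main() {
    assert_eq!(choose_format(Some(80.0)), ArchiveFormat::Zstd); // wired / datacenter
    assert_eq!(choose_format(Some(2.0)), ArchiveFormat::Xz);    // slow DSL
}
```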
Notes
Ubuntu proposal case study.