rust-lang / rustup

The Rust toolchain installer
https://rust-lang.github.io/rustup/
Apache License 2.0

More efficient incremental updates #1798

Open rbtcollins opened 5 years ago

rbtcollins commented 5 years ago

Describe the problem you are trying to solve

Toolchain updates are currently very inefficient. With documentation in particular, most of the new content is identical to the previous content. This wastes bandwidth, which matters in low-bandwidth regions, and it churns the disk: we remove the unaltered content and then write it back, which has knock-on effects on search indices, for a net waste of CPU and battery.

Describe the solution you'd like

Installing an update that is 99% the same as the previous version would only take 1% of the resources: a small download and a small number of writes to disk.

Notes

This was discussed in the rustup-wg meeting 23/4/19; it isn't currently resourced, but it was considered an interesting idea.

I think we have broadly two basic performance-sensitive use cases:

1) First install, whether interactive or automated for a CI job. This needs to be decently fast, so we cannot ignore its performance. No atomicity is needed: Rust isn't working before the install begins, and it's understood that it won't work until the install completes.
2) Daily/ad hoc maintenance: almost always human-initiated, I suspect, but perhaps not? This also needs to be decently fast, but ideally it would be faster than the initial install (currently it is slower because we have to remove the old content). As Rust was working before the update ran, minimising the period during which a given toolchain doesn't work is desirable.

I'm not sure if the rollback/transactional aspect really is important to preserve as-is; I rather suspect that a recoverable model would be better - more flexible for this sort of optimisation, and a better fit for reality (because reality doesn't guarantee processes can complete).

A rustup modified to work like this would look something like the following:

There are some possible complications - e.g. do we have components nested within others, which many distribution systems are unlikely to like? - but these should be able to be worked through with some care.

If someone wants to test what the potential benefits might be, just measure updating .rustup/toolchains from another machine using some delta system (rsync/dropbox/bittorrent/etc); key examples would be nightly and stable.
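One rough way to run that measurement without any delta tooling, as a sketch only: index two toolchain directories by relative path and content hash, then count how many bytes of the newer tree already exist unchanged in the older one. The names (`index_tree`, `compare`, `DeltaStats`) are made up for illustration and are not part of rustup. The hash is std's non-cryptographic `DefaultHasher`, and only whole-file reuse is detected, so the result will understate what rsync-style intra-file deltas could save.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};

/// Relative path -> (size in bytes, content hash). Hypothetical helper type.
type Index = HashMap<PathBuf, (u64, u64)>;

/// Byte totals for the files of the newer tree.
#[derive(Debug, Default, PartialEq)]
struct DeltaStats {
    unchanged_bytes: u64, // same path, size and hash in both trees
    changed_bytes: u64,   // new or modified files
}

/// Walk `root` recursively and index every regular file.
/// Symlinks are skipped so that link cycles cannot cause endless recursion.
fn index_tree(root: &Path) -> io::Result<Index> {
    let mut index = Index::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let file_type = entry.file_type()?;
            let path = entry.path();
            if file_type.is_dir() {
                pending.push(path);
            } else if file_type.is_file() {
                // Reads the whole file into memory: fine for a measurement sketch.
                let bytes = fs::read(&path)?;
                let mut hasher = DefaultHasher::new();
                hasher.write(&bytes);
                let rel = path
                    .strip_prefix(root)
                    .expect("entries are always below root")
                    .to_path_buf();
                index.insert(rel, (bytes.len() as u64, hasher.finish()));
            }
        }
    }
    Ok(index)
}

/// Classify every file of `new` as unchanged or changed relative to `old`.
fn compare(old: &Index, new: &Index) -> DeltaStats {
    let mut stats = DeltaStats::default();
    for (path, &(size, hash)) in new {
        if old.get(path) == Some(&(size, hash)) {
            stats.unchanged_bytes += size;
        } else {
            stats.changed_bytes += size;
        }
    }
    stats
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: <old-toolchain-dir> <new-toolchain-dir>");
        return Ok(());
    }
    let old = index_tree(Path::new(&args[1]))?;
    let new = index_tree(Path::new(&args[2]))?;
    let stats = compare(&old, &new);
    let total = stats.unchanged_bytes + stats.changed_bytes;
    if total > 0 {
        let percent = 100.0 * stats.unchanged_bytes as f64 / total as f64;
        println!("{:.1}% of {} bytes are unchanged", percent, total);
    }
    Ok(())
}
```

Pointing it at two consecutive nightly directories under .rustup/toolchains, and then at two stable releases, would give a first estimate for both of the key examples.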

nrc commented 5 years ago

This does sound great to have. I believe our servers at the moment are very dumb, and I think we probably need to keep them that way, but if we can pre-calculate the information we need, that would be fine.
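To make the "pre-calculate" idea concrete, here is a minimal sketch under stated assumptions: the server publishes a static per-toolchain manifest with one `<hash> <path>` line per file, and the client diffs it against the manifest of what is already installed, so all the work happens client-side. The manifest format and the names `parse_manifest`, `plan_update` and `UpdatePlan` are hypothetical; this is not rustup's actual manifest format.

```rust
use std::collections::BTreeMap;

/// What the client has to do to move from the installed tree to the target.
#[derive(Debug, PartialEq)]
struct UpdatePlan {
    fetch: Vec<String>,  // new or modified files to download
    remove: Vec<String>, // files no longer present in the target
    keep: Vec<String>,   // identical files that are left untouched on disk
}

/// Parse a hypothetical manifest: one "<hash> <path>" entry per line.
/// Blank or malformed lines are ignored.
fn parse_manifest(text: &str) -> BTreeMap<String, String> {
    text.lines()
        .filter_map(|line| {
            let (hash, path) = line.trim().split_once(char::is_whitespace)?;
            Some((path.trim_start().to_string(), hash.to_string()))
        })
        .collect()
}

/// Compare two path -> hash maps and produce the update plan.
fn plan_update(
    installed: &BTreeMap<String, String>,
    target: &BTreeMap<String, String>,
) -> UpdatePlan {
    let mut plan = UpdatePlan { fetch: vec![], remove: vec![], keep: vec![] };
    for (path, hash) in target {
        if installed.get(path) == Some(hash) {
            plan.keep.push(path.clone());
        } else {
            plan.fetch.push(path.clone());
        }
    }
    for path in installed.keys() {
        if !target.contains_key(path) {
            plan.remove.push(path.clone());
        }
    }
    plan
}

fn main() {
    // Placeholder hashes and paths, for illustration only.
    let installed = parse_manifest(
        "aaa bin/rustc\nbbb share/doc/index.html\nccc share/doc/old.html\n",
    );
    let target = parse_manifest(
        "aaa bin/rustc\nddd share/doc/index.html\neee share/doc/new.html\n",
    );
    let plan = plan_update(&installed, &target);
    println!("fetch:  {:?}", plan.fetch);
    println!("remove: {:?}", plan.remove);
    println!("keep:   {:?}", plan.keep);
}
```

With a layout like this the server only has to serve static files: the manifest plus individually addressable file contents. Whether files should be fetched one by one or in batches is a separate question the sketch does not address.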

rbtcollins commented 5 years ago

I don't have the time to take on building this at this point, but I do have the time to help someone interested work through technology choices, interactions and implications on different platforms, and likely constraints/desired capabilities for the components we install to make this really fly.

pickfire commented 5 years ago

I would like to take this. @rbtcollins What do I need to do?