Open rbtcollins opened 5 years ago
This does sound great to have. I believe our servers at the moment are very dumb, and I think we probably need to keep them that way, but if we can pre-calculate the information we need, that would be fine.
I don't have the time to take on building this at this point, but I do have the time to help someone interested work through technology choices, interactions and implications on different platforms, and likely constraints/desired capabilities for the components we install to make this really fly.
I would like to take this. @rbtcollins What do I need to do?
Describe the problem you are trying to solve
Toolchain updates are currently very inefficient. Particularly with documentation, most of the new content is identical to the previous content. This wastes internet bandwidth in low-bandwidth regions, and we currently churn on disk, removing the unaltered content and then replacing it, which has knock-on effects on search indices, for a net waste of CPU and battery.
Describe the solution you'd like
Installing an update that is 99% the same as the previous version would only take 1% of the resources: a small download and a small number of writes to disk.
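As a rough illustration of that idea, here is a minimal sketch that assumes each toolchain version can be described by a manifest mapping relative paths to content hashes. That manifest format and the name `plan_update` are hypothetical, not rustup's actual data model. The work for an update then reduces to the difference between two manifests:

```python
def plan_update(old, new):
    """Return (to_write, to_delete, unchanged) between two manifests.

    Each manifest is a dict of relative path -> content hash.
    This format is an assumption made for illustration only.
    """
    # Files that are new, or whose content hash differs, must be written.
    to_write = sorted(p for p, h in new.items() if old.get(p) != h)
    # Files that no longer exist in the new version must be removed.
    to_delete = sorted(p for p in old if p not in new)
    # Everything else can stay on disk untouched.
    unchanged = sorted(p for p, h in new.items() if old.get(p) == h)
    return to_write, to_delete, unchanged


# Made-up example manifests; the hashes are placeholders.
old = {"bin/rustc": "aaa", "share/doc/index.html": "bbb", "share/doc/old.html": "ccc"}
new = {"bin/rustc": "ddd", "share/doc/index.html": "bbb", "share/doc/new.html": "eee"}
to_write, to_delete, unchanged = plan_update(old, new)
print(to_write, to_delete, unchanged)
```

Only the paths in `to_write` would need downloading and writing, and files in `unchanged` would never be touched, so search indices would have nothing to re-scan for them.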
Notes
This was discussed in the rustup-wg meeting 23/4/19; it isn't currently resourced but it was considered an interesting idea.
I think we have broadly two basic performance-sensitive use cases:
1) First install: whether interactive or automated for a CI job. This needs to be decently fast, so we cannot ignore its performance. No atomicity is needed: Rust isn't working before the install begins, and it is understood that it won't work until the install completes.
2) Daily/ad hoc maintenance: almost always human-initiated, I suspect, but perhaps not? This also needs to be decently fast, but ideally it would be faster than the initial install (currently it is slower because we have to remove the old content). As Rust was working before the update was run, minimising the time period during which a given toolchain doesn't work is desirable.
I'm not sure if the rollback/transactional aspect really is important to preserve as-is; I rather suspect that a recoverable model would be better - more flexible for this sort of optimisation, and a better fit for reality (because reality doesn't guarantee processes can complete).
A rustup modified to work like this would look something like the following:
a
to known-statea'
There are some possible complications (e.g. do we have components nested within others, which many distribution systems are unlikely to like?), but these should be able to be worked through with some care.
If someone wants to test what the potential benefits might be, just measure updating .rustup/toolchains from another machine using some delta system (rsync/dropbox/bittorrent/etc.); key examples would be nightly and stable.
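For a quick local estimate without setting up a sync tool, something like the following sketch could compare two copies of a toolchain directory at whole-file granularity. The function names and the directory paths in the usage note are hypothetical. Whole-file comparison is cruder than what rsync does (rsync can transfer only the changed blocks within a file), so this would tend to understate the achievable savings.

```python
import hashlib
import os


def hash_tree(root):
    """Map each regular file's path (relative to root) to (sha256 hex, size)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # symlinks are skipped to keep the sketch simple
            digest = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in 1 MiB chunks so large binaries are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            rel = os.path.relpath(full, root)
            result[rel] = (digest.hexdigest(), os.path.getsize(full))
    return result


def delta_stats(old_root, new_root):
    """Estimate how much of new_root differs from old_root, file by file."""
    old, new = hash_tree(old_root), hash_tree(new_root)
    total = sum(size for _h, size in new.values())
    # Bytes in files that are new or whose content changed.
    changed = sum(
        size for path, (h, size) in new.items()
        if old.get(path, (None, 0))[0] != h
    )
    removed = sum(1 for path in old if path not in new)
    return {"total_bytes": total, "changed_bytes": changed, "removed_files": removed}
```

Usage would be along the lines of `delta_stats("toolchains/nightly-old", "toolchains/nightly-new")`, where both paths are placeholder copies of the same toolchain taken before and after an update; the ratio `changed_bytes / total_bytes` gives an upper bound on the fraction of content a file-level delta would need to fetch and write.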