rust-lang / rustup

The Rust toolchain installer
https://rust-lang.github.io/rustup/
Apache License 2.0

Remote Custom Toolchains #2696

Open XAMPPRocky opened 3 years ago

XAMPPRocky commented 3 years ago

Describe the problem you are trying to solve

It would be nice to be able to specify custom toolchains at a remote location, so that you can very easily download and install a custom toolchain using Rustup. Other distributors could then just provide URLs to their distribution, and not require users to manually download and link the toolchain themselves. Of course this remote location would have to follow the same conventions as Rust's CI, but that's not really an issue since most users of this feature would be distributing using x.py anyway, just to their own servers.

Describe the solution you'd like

rustup toolchain install --custom custom-channel https://custom.example.org

Unresolved Questions

ehuss commented 3 years ago

Does using the RUSTUP_DIST_SERVER environment variable address your use case? I think something like RUSTUP_DIST_SERVER=https://rust.example.com/ rustup install custom-channel should work.

XAMPPRocky commented 3 years ago

Well, I think it might technically fit, though I don't think I'm familiar enough with rustup's internals to say whether it completely would. For example, how would rustup update work for that toolchain without RUSTUP_DIST_SERVER being set? But in terms of usability, I'd like something a lot more user friendly, as environment variables aren't really as cross platform as having it in a config file. Ideally you would be able to use this toolchain just like any other channel.

kinnison commented 3 years ago

We've wanted to support alternative dist servers for a while now, so this is definitely something we'd like to explore. There may be issues I can link to this, but I don't have them to hand right now.

I have a rough sketch idea of how this might work, but it'll need fleshing out and exploring so if someone wanted to work on this, we should probably start with a zoom/gmeet/whatever to discuss the approach idea I have.

chansuke commented 3 years ago

I would like to try to work on this

kinnison commented 3 years ago

We should make a plan to have a realtime discussion about what this might entail then, so that you can work on a design document. This is quite a big task and I'd rather not run at it half-cocked. What timezone are you in @chansuke ?

chansuke commented 3 years ago

@kinnison Thank you for your comment. I'm in JST.

kinnison commented 3 years ago

JST is UTC+9. Will your clocks change next week? If so then you stay 9 hours ahead of me, if not, then you'll become only 8h ahead. Either way I'm guessing that unless you're a night owl, the best way for us to have a chat would be in my morning, your evening. I shall see what I can find in terms of available time either tomorrow or Friday, or else early next week, and then propose a set of options for you.

ZuseZ4 commented 3 years ago

@kinnison Do you know what the current state of this issue is? I would like to use rustup for downloading a toolchain instead of having my library pull things from third-party servers. If needed I can also contribute to this. However, I would have expected this to be a comparatively small change, while you indicated that this might be a larger task. So if you want we can set up an online meeting to figure things out. I'm in CEST, so it should be comparatively easy.

kinnison commented 3 years ago

@ZuseZ4 I'm afraid I've not seen any further work on this from anyone. If you're interested in helping us to work out the right way to do this and then implementing it, then that'd be cool. It is unlikely to be a particularly small change because there'll be a lot of Rustup which needs checking to ensure that assumptions hold properly. For example, currently dist toolchains (i.e. ones which can contain components, have a manifest file to lay them out, etc.) are always fetched from the RUSTUP_DIST_ROOT/RUSTUP_DIST_SERVER, so there'll need to be a mechanism to ensure we know which toolchains are related to which dist servers at minimum, along with adjusting all the plumbing to be able to cope with all of that. Then there're expected behaviours such as "setting the RUSTUP_DIST_* env-vars before rustup $command is expected to override the server for that toolchain (or all toolchains if not otherwise specified) but only for the duration of that command".
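The override behaviour described above can be sketched as a simple precedence rule. This is only an illustration under stated assumptions: `resolve_dist_server` and the idea of a per-toolchain recorded server are hypothetical and not part of rustup today; only the `RUSTUP_DIST_SERVER` variable and the default server URL come from the existing tool.

```rust
use std::env;

/// Default server used for official toolchains.
const DEFAULT_DIST_SERVER: &str = "https://static.rust-lang.org";

/// Hypothetical helper: pick the dist server for a single command invocation.
/// Precedence: environment override, then the server recorded for the
/// toolchain (an assumed new piece of state), then the default.
fn resolve_dist_server(env_override: Option<&str>, recorded: Option<&str>) -> String {
    env_override
        .filter(|s| !s.is_empty()) // treat an empty variable as unset
        .or(recorded)
        .unwrap_or(DEFAULT_DIST_SERVER)
        .trim_end_matches('/')
        .to_string()
}

fn main() {
    // The override is read per invocation and never stored anywhere.
    let from_env = env::var("RUSTUP_DIST_SERVER").ok();
    let server = resolve_dist_server(from_env.as_deref(), Some("https://custom.example.org/"));
    println!("fetching from {server}");
}
```

Because the environment value is only consulted while the command runs, the recorded server stays untouched once the command finishes.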

And then there's the question of how to parse toolchain names and decide what they are. Currently numbered toolchains, stable*, beta* and nightly* are considered to be distribution toolchains, whereas anything else is a custom toolchain. We'd likely end up needing some kind of registry of servers and the toolchain name patterns which are to be considered dist for that server, at which point the current static regular expressions for parsing toolchains will need adjusting to understand those other dist servers.
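A rough sketch of what such a registry lookup might look like follows. It is a deliberate simplification: rustup's real parsing uses regular expressions and handles more cases, and the `DistServer` entry type, the prefix matching rule, and the `resolve` function are all hypothetical names invented for this illustration.

```rust
/// Where a toolchain name resolves to.
#[derive(Debug, PartialEq)]
enum Resolved<'a> {
    Official,        // served by the default dist server
    Remote(&'a str), // served by a registered alternative dist server
    Custom,          // locally linked, no dist server
}

/// Hypothetical registry entry: names starting with `prefix` are
/// treated as dist toolchains hosted on `server`.
struct DistServer<'a> {
    prefix: &'a str,
    server: &'a str,
}

/// Simplified stand-in for the existing checks: stable/beta/nightly
/// channels and numbered versions such as 1.50 or 1.50.0.
fn is_official_channel(name: &str) -> bool {
    let channel = name.split('-').next().unwrap_or("");
    if matches!(channel, "stable" | "beta" | "nightly") {
        return true;
    }
    let parts: Vec<&str> = channel.split('.').collect();
    (2..=3).contains(&parts.len())
        && parts
            .iter()
            .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()))
}

fn resolve<'a>(name: &str, registry: &[DistServer<'a>]) -> Resolved<'a> {
    if is_official_channel(name) {
        return Resolved::Official;
    }
    registry
        .iter()
        .find(|entry| name.starts_with(entry.prefix))
        .map(|entry| Resolved::Remote(entry.server))
        .unwrap_or(Resolved::Custom)
}

fn main() {
    let registry = [DistServer { prefix: "enzyme", server: "https://custom.example.org" }];
    for name in ["nightly-2021-03-25", "1.50.0", "enzyme", "my-fork"] {
        println!("{name}: {:?}", resolve(name, &registry));
    }
}
```

Checking the official patterns first keeps existing names behaving as they do now; a registered prefix can only claim names that would otherwise have been custom.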

This really isn't a small job. I think a good first start would be to define what the UX would be like for this feature, i.e. what you want your users to be running, how would it interact with them and with the servers and $RUSTUP_HOME. Once we have a goal UX we can think about what it'd take internally to get toward that point.

ZuseZ4 commented 3 years ago

@kinnison First of all, thanks for the explanation. Sorry for the late answer. I had to get some other PRs done first and didn't notice how the time passed.

My job: Integrating Enzyme into Rust. Enzyme is an auto-diff tool and requires llvm-bc and debug info from the source code. It creates a library based on the new llvm-bc, which should be linked against the compiled source code at certain, marked positions.

My fallback solution / earlier plan: Letting users run cargo install enzyme and build enzyme/llvm/clang/rustc locally. Providing an executable cargo-enzyme which acts as a cargo wrapper and does the background steps mentioned above.

My UX goal: Users do something like rustup toolchain install --custom-remote https://enzyme.mit.edu/releases enzyme once. This downloads rustc/clang/libLLVM/Enzyme + whatever is needed from the server. A few weeks later they call rustup update, which looks for a newer enzyme stack and possibly replaces the older one. To run their code they use cargo +enzyme run --release. If people run cargo run --release without the Enzyme toolchain, we just take the library created during an earlier run and link it. If we can't find a library, we give a useful error message.

Generally: People don't install enzyme anymore, don't waste hours compiling, and only use it as a normal Cargo.toml dependency to mark sections on which enzyme should work. This is possibly over-simplified, so let me know if I dropped details which are relevant to you. I'm also not 100% convinced that we can get there completely.

How close is my user story to your expectations @kinnison ?

Implementation: I don't know much about the actual implementation; the following is just the first thought I had to simplify our further work. What do you think about having a first iteration where we do "nothing" but tag toolchains as official/local/custom-remote? The custom-remote one would be the one which we finally want to add. Then we can allow / reject certain operations directly based on the type of a toolchain. Keeping extra handling for official toolchains just based on some string-matching looks dangerous to me.

In pseudocode we would have something like:

`struct official { dist_server: Some(URL), local_name: Path }`
`struct local { dist_server: None, local_name: Path }`
`struct custom-remote { dist_server: Some(URL), local_name: Path }`

Possibly with an `Option` for packed download artifacts. The similar layout might still allow sharing most toolchain handling code between those.

In order to create them we have:

`rustup toolchain install ` for the official ones
`rustup toolchain install custom-remote Option()`
`rustup toolchain install custom-local ` // alias to link?

We simply cancel operations if the local-name is already taken by another toolchain. These three are simple to distinguish and you can't accidentally mix something up. The component / manifest layout would be expected to be identical among all three (except that remote ones might be packed before downloading).

In order to delete them we have `rustup toolchain uninstall `, which works for all. rustup check/update would do nothing for local toolchains, and it would handle the custom-remote ones identically to the official ones, except for using a different address. So logically it checks whether a higher version number is available at ``. If it finds one, it deletes `` and gets the new version with the higher version number.

For the beginning I would start by not allowing any usage of custom-remote toolchains except for (un)install, and just try to get the existing official / local ones working.

Pro: Limited scope of first changes. Probably easy to re-use logic between different toolchains. Clean separation of those three. RUSTUP_DIST_* could be checked at the beginning of a command and, if set, be used to create an in-memory clone of the struct with a changed URL, which won't be saved to a file and thus won't have further effects.

Con: We need to update every location where toolchains are used. I can imagine that this alone is sufficient to make it infeasible? This first iteration also wouldn't result in any improvement and would be nothing but an overly complex download tool.

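The tagging idea from the pseudocode could be written in Rust roughly as below. This is a sketch of the proposal only, not rustup's actual data model: the `Source` enum, the `Toolchain` struct, and the methods `can_update` and `with_dist_override` are hypothetical names chosen for illustration.

```rust
use std::path::PathBuf;

/// The three proposed kinds of toolchain.
#[derive(Debug, Clone, PartialEq)]
enum Source {
    Official { dist_server: String },
    Local,
    CustomRemote { dist_server: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Toolchain {
    local_name: PathBuf,
    source: Source,
}

impl Toolchain {
    /// Server to fetch from, if this kind of toolchain has one.
    fn dist_server(&self) -> Option<&str> {
        match &self.source {
            Source::Official { dist_server } | Source::CustomRemote { dist_server } => {
                Some(dist_server.as_str())
            }
            Source::Local => None,
        }
    }

    /// check/update only make sense when there is a server to ask.
    fn can_update(&self) -> bool {
        self.dist_server().is_some()
    }

    /// In-memory copy with a different server, e.g. for an env-var
    /// override; the copy is never written back to disk.
    fn with_dist_override(&self, url: &str) -> Toolchain {
        let mut copy = self.clone();
        if let Source::Official { dist_server } | Source::CustomRemote { dist_server } =
            &mut copy.source
        {
            *dist_server = url.to_string();
        }
        copy
    }
}

fn main() {
    let enzyme = Toolchain {
        local_name: PathBuf::from("enzyme"),
        source: Source::CustomRemote { dist_server: "https://custom.example.org".into() },
    };
    let linked = Toolchain { local_name: PathBuf::from("my-fork"), source: Source::Local };
    println!("enzyme updatable: {}", enzyme.can_update());
    println!("my-fork updatable: {}", linked.can_update());
    println!("{:?}", enzyme.with_dist_override("https://mirror.example.org").dist_server());
}
```

Using one enum instead of three separate structs keeps the shared handling code in one place while still letting operations be accepted or rejected by kind rather than by string matching on the name.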
MabezDev commented 3 years ago

> Does using the RUSTUP_DIST_SERVER environment variable address your use case? I think something like RUSTUP_DIST_SERVER=https://rust.example.com/ rustup install custom-channel should work.

@ehuss Could you, or anyone else point me to any docs/repos/examples on how https://static.rust-lang.org/ is hosted; specifically the structure and manifest files so that we can try and achieve what you're suggesting above?
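As a partial pointer on the structure: to the best of my knowledge, the official server publishes a channel manifest at a path like /dist/channel-rust-stable.toml, with dated copies under /dist/YYYY-MM-DD/. The sketch below only builds those URLs; `manifest_url` is a hypothetical helper for illustration, not a rustup function, and a custom server would presumably need to mirror the same layout.

```rust
/// Hypothetical helper: build the channel manifest URL for a dist server,
/// following the layout used by https://static.rust-lang.org/.
fn manifest_url(dist_server: &str, channel: &str, date: Option<&str>) -> String {
    let root = dist_server.trim_end_matches('/');
    match date {
        // Dated manifests live in a per-day directory.
        Some(day) => format!("{root}/dist/{day}/channel-rust-{channel}.toml"),
        // The undated path always points at the latest release of a channel.
        None => format!("{root}/dist/channel-rust-{channel}.toml"),
    }
}

fn main() {
    println!("{}", manifest_url("https://static.rust-lang.org", "stable", None));
    println!("{}", manifest_url("https://rust.example.com/", "nightly", Some("2021-03-25")));
}
```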

jessebraham commented 3 years ago

@mabezdev Some links that may be helpful: