Open konstin opened 8 months ago
For me it takes about 6ms without strace and about 10ms with strace. The span from the execve call up to prlimit64(0, RLIMIT_STACK, ...) (which is still before the main function executes) takes 9ms. After that, a tiny bit of time goes to initializing jemalloc. The time between the Rust main function being called and the process exiting is less than 1ms in total.
Python is only a 6.6MB executable with basically no dylib dependencies. rustc, on the other hand, has to load 263MB worth of dynamic libraries in addition to libc. Even just calling mprotect on the mapped dynamic libraries already takes 5ms.
I wonder if it would be possible to dlopen LLVM at runtime so that loading it can be delayed until codegen starts. Then only the rustc_driver shared object has to be opened unconditionally (and maybe even that can be dlopen-ed if argument parsing moves to the rustc-main binary?)
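The deferral idea can be illustrated with a minimal sketch. It does not call dlopen or touch LLVM: Driver, Backend, and the "simulated-llvm" name are hypothetical stand-ins, and std::cell::OnceCell only simulates a library that is loaded on first use, so that a --version invocation never pays for it.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for a heavy codegen backend such as LLVM.
struct Backend {
    name: &'static str,
}

/// Toy driver that loads its backend lazily.
struct Driver {
    backend: OnceCell<Backend>,
    /// Counts how often the (simulated) expensive load actually ran.
    loads: Cell<u32>,
}

impl Driver {
    fn new() -> Self {
        Driver {
            backend: OnceCell::new(),
            loads: Cell::new(0),
        }
    }

    /// Returns the backend, performing the simulated load on first use only.
    /// A real implementation would dlopen a shared object here instead.
    fn backend(&self) -> &Backend {
        self.backend.get_or_init(|| {
            self.loads.set(self.loads.get() + 1);
            Backend { name: "simulated-llvm" }
        })
    }

    /// Cheap flags are answered before the backend is ever touched.
    fn run(&self, args: &[&str]) -> String {
        if args.contains(&"--version") {
            return String::from("toy-compiler 0.0.0");
        }
        format!("compiled {} input(s) with {}", args.len(), self.backend().name)
    }
}

fn main() {
    let driver = Driver::new();
    println!("{}", driver.run(&["--version"]));
    println!("backend loads so far: {}", driver.loads.get());
    println!("{}", driver.run(&["main.rs"]));
    println!("backend loads so far: {}", driver.loads.get());
}
```

In this toy model the version query returns without initializing the backend, and later compile requests share a single load. Whether the same split is practical for rustc depends on the codegen backend constraints discussed in this thread.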
We used to dlopen librustc_codegen_llvm.so (to support separate LLVM versions for emscripten and for regular use, no longer necessary as emscripten now uses the upstream wasm backend rather than the asm.js fastcomp backend), but it was merged into librustc_driver.so for perf reasons.
The performance wins that https://github.com/rust-lang/rust/pull/97154 would provide (if we could do that without breaking codegen backends) seem likely to help substantially with this. That might be worth revisiting.
Maybe related to rustc --version doing more than it is supposed to do? https://github.com/rust-lang/rust/issues/127649
That is performed by the rustup wrapper, not by rustc directly, so it is not related to this issue.
Yeah, I wasn't even really aware of the rustup wrapper behaving as a proxy in the first place.
Problem Description
Running rustc --version without the rustup wrapper takes 11ms on my Linux machine (see https://github.com/rust-lang/rustup/issues/2626 for the rustup side of this). This is an issue for uv, as we've been asked to include the output of rustc --version in our user agent when making requests to the Python package index so the Python ecosystem gets usage stats. A minimal resolution with a network request (revalidation request) takes ~100ms on my machine, so 20ms extra before the first network request is noticeable. I'd also be happy to read the default rustc version from another location, provided that this also works with alternative ways of installation.
Benchmarks
The benchmark runs from my user home on Ubuntu, and I've included rustc with and without rustup, Python without a shim, and Node with and without the volta shim for comparison. Tested with rustc 1.76.0 (07dca489a 2024-02-04).
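For anyone who wants to reproduce a rough number without a dedicated benchmarking tool, a minimal sketch follows. It is not the method used for the measurements in this issue; the helper names best_wall_time and parse_rustc_version are made up for illustration, and the parser only assumes output shaped like the version string quoted above.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Runs `program args...` `runs` times (at least once) and returns the
/// fastest wall-clock time, including process spawn and exit.
fn best_wall_time(program: &str, args: &[&str], runs: u32) -> io::Result<Duration> {
    let mut best = Duration::MAX;
    for _ in 0..runs.max(1) {
        let start = Instant::now();
        Command::new(program)
            .args(args)
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()?;
        best = best.min(start.elapsed());
    }
    Ok(best)
}

/// Extracts the version number from output shaped like
/// "rustc 1.76.0 (07dca489a 2024-02-04)".
fn parse_rustc_version(output: &str) -> Option<&str> {
    let mut words = output.split_whitespace();
    if words.next()? != "rustc" {
        return None;
    }
    words.next()
}

fn main() {
    // Whichever `rustc` is first on PATH is measured; if that is the rustup
    // proxy, the proxy's overhead is included in the result.
    match best_wall_time("rustc", &["--version"], 20) {
        Ok(best) => println!("rustc --version: best of 20 runs = {:?}", best),
        Err(err) => eprintln!("could not run rustc: {err}"),
    }
    let sample = "rustc 1.76.0 (07dca489a 2024-02-04)";
    println!("parsed version: {:?}", parse_rustc_version(sample));
}
```

Taking the best of several runs reduces noise from cold caches and scheduling, but it also hides the cold-start cost that a first invocation would see, so the figure is a lower bound rather than a typical value.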
On a low-end server and on a shared server, the contrast to Python becomes even more stark: