rust-mobile / cargo-subcommand

Library for implementing cargo subcommands
6 stars 6 forks source link

Calling `rustc -vV` up front is slow #19

Closed Shnatsel closed 2 years ago

Shnatsel commented 2 years ago

Right now the following line alone takes 90ms on my machine:

cargo_subcommand::Subcommand::new(std::env::args(), "testing", |a, b| Ok(false)).unwrap();

This does not sound like much, but it is really noticeable because CLI tooling is usually very responsive.

I've profiled it, and all the time is spent in various external rust and cargo commands.

I think it would be a lot better to call these methods on demand instead of calling all of them up front and caching the results in the struct. Right now this eager caching slows down my tool instead of speeding it up, since I don't use most of the fields that it eagerly resolves.

MarijnS95 commented 2 years ago

Searching for Command::new, the only thing that seems to be invoked by Subcommand::new is rustc -vV?

$ time rustc -vV
rustc 1.59.0 (9d1b2106e 2022-02-23)
binary: rustc
commit-hash: 9d1b2106e23b1abd32fce1f17267604a5102f57a
commit-date: 2022-02-23
host: x86_64-unknown-linux-gnu
release: 1.59.0
LLVM version: 13.0.0
rustc -vV  0.04s user 0.01s system 98% cpu 0.057 total

Does this align with what you're seeing?

Shnatsel commented 2 years ago

Yes, I can confirm that this is the only command invoked - I've checked with strace.

On my machine this is slow because I have rustc installed through rustup, and it adds some sort of shim that takes a while to execute.

$ hyperfine 'rustc -vV'
Benchmark 1: rustc -vV
  Time (mean ± σ):      83.2 ms ±   3.0 ms    [User: 70.4 ms, System: 12.1 ms]
  Range (min … max):    81.0 ms …  94.1 ms    34 runs
MarijnS95 commented 2 years ago

@Shnatsel As such, would you be open to change the title and description of this issue to be solely about a single rustc -vV invocation for populating _just the host_triple field_? This seems like one isolated case as confirmed by your benchmarks, making "Resolving all subcommand fields" largely overexaggerated to say the least.

Shnatsel commented 2 years ago

Sure, done. My apologies for the false alarm.

Shnatsel commented 2 years ago

I would prefer for the other fields also be resolved lazily, or made into standalone functions, for two reasons:

  1. The file I/O may be slow on some platforms, even though it's not the case on my machine (e.g. on networked filesystems)
  2. If the subcommand only accesses the files it really needs, sandboxing it is a lot simpler.

But neither of that is a pressing issue for me.

dvc94ch commented 2 years ago

we probably don't need to call rustc at all. we can instead rely on the platform cargo subcommand was compiled for by doing something like #[cfg(all(target_os = "..", target_arch = ".."))].

Shnatsel commented 2 years ago

The platform crate used to provide such a facility, until my rewrite for 3.0. I dropped it because there wasn't a clear use case for it, and it was extra work. But I guess I can restore it?

Shnatsel commented 2 years ago

I've just checked, and unfortunately it is infeasible to reconstruct the target triple based on cfg values. There are platforms with different target triples but identical cfg values, here's a code snippet that finds them: https://github.com/rustsec/rustsec/pull/516/commits/5107570ce159acddc480e321308499610fceec96

Even taking into account all cfg values in existence, not just the ones recognized by the platforms crate, it is still impossible. Run this script on Linux in an empty directory for proof:

rustc --print=target-list | parallel 'rustc --print=cfg --target={} > ./{}'
fdupes

You will see that fdupes prints a lot of duplicate files, i.e. targets with completely identical cfg values.

MarijnS95 commented 2 years ago

One thing at the time. We've already voiced some interest in cleaning up manifest parsing as the current util functions might unnecessarily load and parse the same file many times. Let's focus on this high-grossing issue first before "profiling" again (on IO-constrained systems).

Tbh I wouldn't mind getting rid of this functionality entirely. It doesn't seem used by cargo-subcommand nor cargo-apk for that matter, so we might as well get rid of it.

Alternatively this functionality could be moved to the pub fn host_triple() getter entirely, with a comment explaining potential overhead/slowness if so inclined.

Shnatsel commented 2 years ago

I do use that functionality in cargo auditable. But I thought I might get away with querying it only in some cases, and not every time. And yeah, a standalone getter sounds nice.

MarijnS95 commented 2 years ago

@Shnatsel Would you rather have this logic in your own crate, or do you see a place for it in cargo-subcommand?

@dvc94ch If you're not using it either cargo-subcommand doesn't seem the right place for it, and we probably don't want to take the "maintenance burden" for it.

dvc94ch commented 2 years ago

x isn't actually using cargo-subcommand yet. and 0.1.0 won't use cargo-subcommand. once 0.1.0 is out I might go through the trouble of migrating to cargo-subcommand.

MarijnS95 commented 2 years ago

@dvc94ch Any other tool(s) we should keep in mind? Otherwise I'm happy to just get rid of this :)

Shnatsel commented 2 years ago

I think this is useful for cargo subcommand. I would be very happy to have a host_platform() function or some such. This is somewhat commonly needed in subcommands.

Shnatsel commented 2 years ago

You know what, nevermind. I'm probably going to make a separate crate that encodes the platform at compile time.

Shnatsel commented 2 years ago

Okay, I got compile-time platform detection working. It seems to be fast and reliable, and works with rustc as old as 1.19. However, my employer insists on pre-reviewing anything I publish, so it will take a couple of weeks for me to actually publish the code.

Shnatsel commented 2 years ago

I've released compile time-detection as a crate: https://github.com/Shnatsel/current_platform

dvc94ch commented 2 years ago

That's quite neat, I like it