"Fetching module extension" taking a long time to complete

Hey folks,

We have a large monorepo comprising 357 crates. We recently switched to bzlmod to avoid very frequent Cargo.bazel.lock merge conflicts, and in that regard the transition went great.

However, we're now seeing the "Fetching module extension crate in @@rules_rust~//crate_universe:extension.bzl; starting" step take anywhere from 20s to upwards of two minutes. It runs on any bazel build or bazel mod deps --lockfile_mode=update whenever one or more of the Cargo.toml files declared in the manifests key of crate.from_cargo is touched, which happens very frequently.

Is there a way we could speed up this step in our configuration?

This is our MODULE.bazel file:

bazel_dep(name = "rules_rust", version = "0.50.1")

rust = use_extension("@rules_rust//rust:extensions.bzl", "rust")
rust.toolchain(
    edition = "2021",
    extra_target_triples = [
        "aarch64-apple-ios-sim",
        "aarch64-apple-ios",
        "aarch64-linux-android",
        "x86_64-unknown-linux-gnu",
        "x86_64-apple-ios",
        "wasm32-unknown-unknown",
    ],
    versions = ["nightly/2024-08-23"],
)
use_repo(
    rust,
    "rust_toolchains",
)

register_toolchains("@rust_toolchains//:all")

crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.from_cargo(
    name = "crates",
    cargo_lockfile = "//rs:Cargo.lock",
    manifests = [
        "//rs:Cargo.toml", # Workspace Cargo.toml
        # ... 300+ crates
    ],
)
crate.annotation(
    crate = "rdkafka-sys",
    gen_build_script = "off",
)
crate.annotation(
    crate = "log",
    # Needed to completely disable debug logs in release mode
    crate_features = ["max_level_trace", "release_max_level_info"],
)
use_repo(crate, "crates")

A big part of that time is spent in execute_cargo_tree, which triggers a cargo tree call for each target triple (of which there are 34 by default). All or part of these calls end up queued because they require exclusive access to the package cache.

The simplest solution for us at this point is to restrict supported_platform_triples to the triples we're actually interested in building for. This reduces the time it takes to re-run on an unchanged project from 29s to 12s.
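
For anyone wanting to try the same thing, here is a rough sketch of the change in MODULE.bazel. The triples listed are only an illustrative subset, not our exact set; the list should cover every platform you build for, which presumably includes the host platforms used by developers and CI.

```starlark
crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.from_cargo(
    name = "crates",
    cargo_lockfile = "//rs:Cargo.lock",
    manifests = [
        "//rs:Cargo.toml",  # Workspace Cargo.toml
        # ... 300+ crates
    ],
    # Illustrative subset (an assumption, adjust to your project):
    # list only the triples you actually build for, instead of
    # relying on the 34 default triples.
    supported_platform_triples = [
        "aarch64-apple-ios",
        "x86_64-unknown-linux-gnu",
        "wasm32-unknown-unknown",
    ],
)
use_repo(crate, "crates")
```

Since one cargo tree call is made per triple, each triple dropped from this list should remove one of the queued calls.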

Considering that a single cargo tree call takes 1s to complete, it would be great to find a way to run all of these calls in parallel.

However, even with a single platform specified, the step still takes a total of 9 seconds, so there's more to optimize elsewhere as well.

bazelbuild / rules_rust

"Fetching module extension" taking a long time to complete #2876