This PR adds documentation on how to cache a local crate index when working with
workspaces that have large git dependencies. I devised this method after I
noticed that the cargo chef cook step was cloning a non-target git dependency
(namely, aptos-core) during a cargo-chef build, since compilation requires a
complete local crate index.
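The documentation in the PR goes into detail about the mechanisms at play, and
below I'm including an example for additional illustrative purposes. The
example is a workspace whose top-level Cargo.toml defines two packages,
my_package and another_package.
The Dockerfile is identical to the template proposed in this PR: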
```dockerfile
FROM lukemathwalker/cargo-chef:latest-rust-1 AS chef
WORKDIR /app

FROM chef AS planner
ARG BIN
COPY . .
# Prepare recipe one directory up to simplify local crate index caching.
RUN cargo chef prepare --bin "$BIN" --recipe-path ../recipe.json
# Delete everything not required to build complete local crate index, to avoid
# invalidating local crate index cache on code changes or recipe updates.
RUN find -type f \! \( -name 'Cargo.toml' -o -name 'Cargo.lock' \) -delete && \
    find -type d -empty -delete

# Invoke a dry run lockfile update against the manifest skeleton, thereby
# caching a complete local crate index.
FROM chef AS indexer
COPY --from=planner /app .
RUN cargo update --dry-run

FROM chef AS builder
ARG BIN PACKAGE
COPY --from=planner /recipe.json recipe.json
# Copy cached crate index.
COPY --from=indexer $CARGO_HOME $CARGO_HOME
# Build in locked mode to prevent local crate index cache invalidation, thereby
# downloading only the necessary dependencies for the binary.
RUN cargo chef cook --bin "$BIN" --locked --package "$PACKAGE" --release
COPY . .
# Build offline solely from cached crate index and downloaded dependencies.
RUN cargo build --bin "$BIN" --frozen --package "$PACKAGE" --release
# Rename executable for ease of copying.
RUN mv "/app/target/release/$BIN" /app/executable;

FROM debian:bookworm-slim AS runtime
COPY --from=builder /app/executable /usr/local/bin
ENTRYPOINT ["/usr/local/bin/executable"]
```
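To see what the pruning step in the planner stage leaves behind, the sketch below replays the same find commands against a throwaway directory. The directory layout, the docs/notes.md and README.md files, and the contents of the top-level manifest are assumptions made for illustration; only the find expressions come from the Dockerfile. An explicit `.` path is added so the commands also run with non-GNU find.

```shell
#!/bin/sh
set -eu

# Build a throwaway workspace. Layout and file contents are hypothetical.
workspace="$(mktemp -d)"
trap 'rm -rf "$workspace"' EXIT
mkdir -p "$workspace/my_package" "$workspace/another_package" "$workspace/docs"
printf '[workspace]\nmembers = ["another_package", "my_package"]\n' \
    > "$workspace/Cargo.toml"
touch "$workspace/Cargo.lock" "$workspace/README.md" "$workspace/docs/notes.md"
touch "$workspace/my_package/Cargo.toml" "$workspace/my_package/my_bin.rs"
touch "$workspace/another_package/Cargo.toml" \
    "$workspace/another_package/another_bin.rs"

# Same pruning expressions as the planner stage, run from the workspace root.
cd "$workspace"
find . -type f \! \( -name 'Cargo.toml' -o -name 'Cargo.lock' \) -delete
find . -type d -empty -delete

# List what survived: only the manifests and the lockfile.
remaining="$(find . -type f | LC_ALL=C sort)"
echo "$remaining"
```

Source files and any directory left empty by the deletion (docs/ in this sketch) are gone, so edits to them cannot change what the indexer stage copies.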
The Cargo.toml for my_package has no special dependencies:
```toml
[[bin]]
name = "my-bin"
path = "my_bin.rs"

[package]
edition = "2021"
name = "my_package"
version = "1.0.0"
```
And my_bin.rs declares a simple "Hello, world!" statement:
```rust
fn main() {
    println!("Hello, world!");
}
```
However, the Cargo.toml for another_package has a git dependency on aptos-core
(note that per aptos-core #8984 there is no plan to support package management
on crates.io):
```toml
[[bin]]
name = "another-bin"
path = "another_bin.rs"

[dependencies.move-core-types]
git = "https://github.com/aptos-labs/aptos-core"
tag = "aptos-node-v1.15.2"

[package]
edition = "2021"
name = "another_package"
version = "1.0.0"
```
Note that another_bin.rs has a modified "Hello, world!" statement, which
relies on a random account address generated via the move-core-types
dependency:
```rust
use move_core_types::account_address::AccountAddress;

fn main() {
    println!("Hello, {}!", AccountAddress::random());
}
```
Cache hit dynamics
To follow along, replicate the above workspace and generate a lockfile. Then
build and run my-bin via cargo-chef.
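The commands for these steps are not reproduced here. A minimal sketch, assuming the Dockerfile sits at the workspace root and each image is tagged after its binary; the function names, the tag choice, and the DOCKER override are illustrative and not part of the PR:

```shell
#!/bin/sh
set -eu

# Container CLI to use; can be overridden (for example with podman).
DOCKER="${DOCKER:-docker}"

# Generate Cargo.lock for the workspace without building anything.
generate_lockfile() {
    cargo generate-lockfile
}

# Build the image for one binary, passing the BIN and PACKAGE build arguments
# that the Dockerfile declares via ARG.
build_image() {
    bin="$1"
    package="$2"
    "$DOCKER" build --build-arg "BIN=$bin" --build-arg "PACKAGE=$package" \
        --tag "$bin" .
}

# Run the image built above and remove the container afterwards.
run_image() {
    "$DOCKER" run --rm "$1"
}
```

With these defined, the first build would be generate_lockfile followed by build_image my-bin my_package and run_image my-bin; the second binary would use build_image another-bin another_package and run_image another-bin.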
Note that the first build downloads the entire aptos-core repository during the
--dry-run step, since a local crate index is required for the eventual
cargo chef cook operation:

```
=> [indexer 2/2] RUN cargo update --dry-run
```
However, if my_bin.rs is modified to instead print Hello, chef!, the repository
does not need to be downloaded again when rebuilding the image, since the crate
index for the aptos-core git dependency is already cached.
The same holds when building and running another-bin: the local image cache
preserves the output of the --dry-run crate index generation step, since the
Cargo.toml manifest skeleton is common to both builds in the workspace.
Moreover, updating another_bin.rs to print Goodbye, ... results in another
cache hit since there are no new dependencies.
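One way to see why source edits hit the cache: the indexer stage copies only the pruned manifest skeleton, and Docker can reuse a COPY layer when the copied content is unchanged. The sketch below mimics this with cksum on a throwaway directory. The skeleton_digest helper and the file layout are hypothetical, and the digest is only a stand-in for Docker's own content checksum.

```shell
#!/bin/sh
set -eu

# Digest of the manifest skeleton: path and contents of every Cargo.toml and
# Cargo.lock under the given directory, in a stable order. Hypothetical helper.
skeleton_digest() {
    (
        cd "$1"
        find . -type f \( -name 'Cargo.toml' -o -name 'Cargo.lock' \) |
            LC_ALL=C sort |
            while read -r file; do
                echo "$file"
                cat "$file"
            done |
            cksum
    )
}

workspace="$(mktemp -d)"
trap 'rm -rf "$workspace"' EXIT
mkdir -p "$workspace/my_package"
printf '[package]\nname = "my_package"\n' > "$workspace/my_package/Cargo.toml"
echo 'fn main() { println!("Hello, world!"); }' \
    > "$workspace/my_package/my_bin.rs"

before="$(skeleton_digest "$workspace")"

# A source-only change leaves the skeleton digest untouched (cache hit).
echo 'fn main() { println!("Hello, chef!"); }' \
    > "$workspace/my_package/my_bin.rs"
after_source_edit="$(skeleton_digest "$workspace")"

# A manifest change alters the digest (the index layer would be rebuilt).
printf '\n[dependencies]\n' >> "$workspace/my_package/Cargo.toml"
after_manifest_edit="$(skeleton_digest "$workspace")"

[ "$before" = "$after_source_edit" ] && echo "source edit: skeleton unchanged"
[ "$before" != "$after_manifest_edit" ] && echo "manifest edit: skeleton changed"
```

Only the manifest edit changes the digest, which matches the behavior described above: code changes reuse the cached crate index, while dependency changes regenerate it.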
Cache miss dynamics
The local crate index cache can be disabled by commenting out the following
line in the Dockerfile:

```dockerfile
COPY --from=indexer $CARGO_HOME $CARGO_HOME
```

In this case, the cargo chef cook command has no access to a cached local crate
index and must regenerate it whenever the recipe changes. Notably, this
involves re-downloading aptos-core even for changes to my_package that have
nothing to do with the dependency.