
Option to ignore missing workspace members when building #14566

Open jephthia opened 1 day ago

jephthia commented 1 day ago

Problem

Given the following virtual workspace:

[workspace]
resolver = "2"

members = [
  "apps/bin1",
  "apps/bin2",
  "apps/bin3",
  "libs/libA",
  "libs/libB",
  ...
]

Building bin1 with cargo build -p bin1 should be allowed even if bin2 is missing from the file system.

Use case: In Docker, when building a binary, only the relevant files are sent to the build context. For example, consider the following ignore file for building bin1:

# Exclude everything
*

# Include relevant files only
!apps/bin1
!libs
!Cargo.toml
!Cargo.lock

The binary is then built with: cargo build -p bin1 --locked --release

This fails with the error "failed to load manifest for workspace member /code/apps/bin2", because the other binaries that are members of the workspace aren't present, even though they are irrelevant to this build and should not be needed.

Proposed Solution

An option to prevent a build from failing because of a missing member. This could be done either:

through a flag: cargo build -p bin1 --ignore-missing-members (with a better flag name)

or directly in the manifest?

[workspace]
ignore-missing-members = true


weihanglo commented 23 hours ago

Every member is relevant and contributes to the workspace's Cargo.lock. Removing any of them may affect the dependency resolution.
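
As a hypothetical illustration (the crate name and versions here are made up), suppose two members constrain the same dependency differently:

# apps/bin1/Cargo.toml
[dependencies]
serde = "1"          # accepts any 1.x

# apps/bin2/Cargo.toml
[dependencies]
serde = "=1.0.100"   # exact pin

With both members present the two requirements unify, and Cargo.lock records serde 1.0.100. Remove bin2 from the workspace and a fresh resolution is free to pick the newest 1.x instead, so the lockfile changes.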

Similar feature requests have been made, and #6179 is the major one tracking them.

That being said, this docker use case is slightly different from the others. Could you expand a bit on your workflow? Also, I wonder where the ignore file came from.

epage commented 20 hours ago

Most of those conditional ones are for dealing with targets, cfgs, etc. This one is related to docker caching, from what I understood of the text. Docker fingerprints the inputs to a layer and rebuilds it if they change. If you have a layer that is meant for only a subset of your dependency tree, then you only want that subset fingerprinted. This is more similar to #2644, but within the workspace rather than dealing with dependencies.

https://crates.io/crates/cargo-chef is the primary place for docker layer caching experiments. I'd recommend seeing if it meets your needs or if it could be made to meet your needs.
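
For reference, cargo-chef's approach looks roughly like this (a sketch adapted from its README; the base image tag and the bin1 package name are placeholders):

FROM lukemathwalker/cargo-chef:latest-rust-1 AS chef
WORKDIR /app

FROM chef AS planner
COPY . .
# Produce a dependency-only "recipe"; it only changes when dependencies change
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Build dependencies only; this layer stays cached until the recipe changes
RUN cargo chef cook --release --recipe-path recipe.json
# Now copy the full source and build the actual binary
COPY . .
RUN cargo build --release -p bin1

Note that the planner stage still COPYs the full workspace, so all members must be present in the build context.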

jephthia commented 17 hours ago

> That being said, this docker use case is slightly different from the others. Could you expand a bit on your workflow? Also, I wonder where the ignore file came from.

Let's say we have a virtual workspace with just two binaries, server1 and server2

members = [
  "apps/server1",
  "apps/server2",
]

and the following file structure:

project/
├── apps/
│   ├── server1/
│   │   ├── src/
│   │   │   └── main.rs
│   │   └── Cargo.toml
│   └── server2/
│       ├── src/
│       │   └── main.rs
│       └── Cargo.toml
├── config/
│   └── docker/
│       ├── dev/
│       └── prod/
│           ├── Server1.Dockerfile
│           ├── Server1.Dockerfile.dockerignore
│           ├── Server2.Dockerfile
│           └── Server2.Dockerfile.dockerignore
├── Cargo.toml
├── Cargo.lock
└── docker-compose.yaml

Docker is used to build and run these binaries. The first workflow is for production and the second is for local development (both suffer from the same issue).

To build for production, the following is used.

File: config/docker/prod/Server1.Dockerfile

FROM rust:1.80.1-bullseye AS build

RUN --mount=type=bind,source=apps/server1,target=apps/server1,rw \
    --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
    --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
    --mount=type=cache,target=target/ \
    --mount=type=cache,target=/usr/local/cargo/registry/ \
    <<EOF
set -e
cargo build -p server1 --locked --release  # <-- fails: "missing server2 member"
EOF

...

File: config/docker/prod/Server1.Dockerfile.dockerignore

# Ignore everything
*

# Include relevant files only
!apps/server1
!Cargo.toml
!Cargo.lock

Attempting to build this with docker build -t server1 -f config/docker/prod/Server1.Dockerfile . will fail, since the sources of server2 aren't included in the build context.

For the local development workflow, it's similar but with docker-compose.yaml being used instead:

  server1:
    build:
      context: .
      dockerfile: config/docker/dev/Server1.Dockerfile
    working_dir: /code
    volumes:
      - ./apps/server1:/code/apps/server1
      - ./Cargo.toml:/code/Cargo.toml
      - ./Cargo.lock:/code/Cargo.lock
      - server1_cargo_target:/code/target/
      - server1_cargo:/usr/local/cargo
    ports:
      - "8080:8080"
    command: ["cargo", "watch", "-x", "run", "-p", "server1"] <--- Fails: missing server2 member

Running docker compose up server1 fails.


> This is related to docker caching

Ah, caching isn't the concern here (we've long given up on getting caching to work correctly in CI 😅, at least for now). We can accept slower builds as long as it at least builds, but with this issue the builds aren't even possible.

This came up because our current setup does not make use of cargo workspaces and instead treats each binary separately, with its own target/ and Cargo.lock. As the monorepo grew this became less than ideal, so we decided to switch to cargo workspaces, but hit this issue.

Are there issues we're not considering that would come from building a specific binary in a workspace without the other binaries present? Our thinking was that this wouldn't cause problems, since each binary is unrelated to the others.

> https://crates.io/crates/cargo-chef is the primary place for docker layer caching experiments. I'd recommend seeing if it meets your needs or if it could be made to meet your needs.

Since caching isn't the issue here, we're a bit hesitant to take on a dependency to solve this. As a workaround, we're currently thinking of having a config/cargos/ directory that stores a different Cargo.toml for each necessary binary (e.g. Cargo-server1.toml, Cargo-server2.toml); during the mount, we'd pick the specific Cargo-xyz.toml needed for the build. But this extra maintenance would feel like a hack that goes against the point of a workspace.
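
A minimal sketch of that workaround, reusing the mounts from the production Dockerfile above: config/cargos/Cargo-server1.toml would be a trimmed workspace manifest,

# config/cargos/Cargo-server1.toml
[workspace]
resolver = "2"
members = ["apps/server1"]

and the build would mount it in place of the real root manifest:

RUN --mount=type=bind,source=apps/server1,target=apps/server1,rw \
    --mount=type=bind,source=config/cargos/Cargo-server1.toml,target=Cargo.toml \
    --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
    --mount=type=cache,target=target/ \
    --mount=type=cache,target=/usr/local/cargo/registry/ \
    <<EOF
set -e
cargo build -p server1 --locked --release
EOF

One caveat: with --locked, Cargo may reject the trimmed workspace if the shared Cargo.lock lists dependencies pulled in only by the omitted members, since dropping them would require rewriting the lockfile.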

epage commented 16 hours ago

So this would also be a problem for caching iiuc, even if that isn't your problem.

> This came up because our current setup does not make use of cargo workspaces and instead treats each binary separately, with its own target/ and Cargo.lock. As the monorepo grew this became less than ideal, so we decided to switch to cargo workspaces, but hit this issue.

Maybe I missed it, but could you expand on why you don't mount the entire source?

jephthia commented 15 hours ago

> So this would also be a problem for caching iiuc, even if that isn't your problem.

Hmm yes, most likely

> Maybe I missed it, but could you expand on why you don't mount the entire source?

This is a recommended docker best practice as far as I know. We have a monorepo which contains many unrelated things: an Android app, an iOS app, non-Rust libraries, many Rust binaries, etc. Our common workflow is to exclude everything and include only the project being built.

From: https://docs.docker.com/build/concepts/context/#dockerignore-files

> This helps avoid sending unwanted files and directories to the builder, improving build speed, especially when using a remote builder.