rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.58k stars 2.39k forks source link

Walk `path` repository until dependency is found rather than requiring it to be at the repository root as done for `git` #13360

Open jpedrick opened 7 months ago

jpedrick commented 7 months ago

Problem

Currently, packages specified by git repositories will search the git project for sub-packages. The following will correctly find bevy_math

[dependencies]
bevy = { path = "/Users/jpedrick/Development/bevy", default-features = false }
bevy_math = { git = "https://github.com/bevyengine/bevy.git" }

However, with path specified packages this doesn't work:

[dependencies]
bevy = { path = "/Users/jpedrick/Development/bevy", default-features = true }
bevy_math = { path = "/Users/jpedrick/Development/bevy" }

Instead, bevy_math needs to be explicitly targeted with the path

[dependencies]
bevy_math = { path = "/Users/jpedrick/Development/bevy/crates/bevy_math/" }

I found this a bit confusing, as I expected crate imports with path to work the same as git.

Proposed Solution

The following would find the bevy_math package in /Users/jpedrick/Development/bevy/crates/bevy_math/ by searching for the Cargo.toml with package.name == "bevy_math"

[dependency]
bevy_math = { path = "/Users/jpedrick/Development/bevy" }

Notes

No response

epage commented 7 months ago

As this was a conscious choice (the Git Source is implemented as a wrapper around Path Source), I would very much want to understand why it was originally designed this way before making changes.

One potential reason is that path dependencies are expected to be used more often and recursively walking can be a performance problem.

For brainstorming purposes, as cargo is intended to be opinionated, I wonder if we could speed up that walking by enforcing good practices by (1) checking <path>/Cargo.toml and then (2) checking <path>/**/<package-name>/Cargo.toml. This would require a file system walk but wouldn't require us to read every manifest to extract package.name to see if its one we care about (we could also implement a partial manifest parser to extract the name but that duplicates knowledge, making it harder to maintain).

weihanglo commented 7 months ago

As this was a conscious choice (the Git Source is implemented as a wrapper around Path Source), I would very much want to understand why it was originally designed this way before making changes.

My guess is that the number of files in a Git repository is under control, while an arbitrary path might lead to an unbound walk through the filesystem.

To support this feature, Cargo needs to address the duplicate packages issue (such as https://github.com/rust-lang/cargo/issues/10752), which hasn't really got resolved for git dependency and is poorly documented.

https://github.com/rust-lang/cargo/issues/9624 is a relevant issue as well.

jpedrick commented 7 months ago

As this was a conscious choice (the Git Source is implemented as a wrapper around Path Source), I would very much want to understand why it was originally designed this way before making changes.

One potential reason is that path dependencies are expected to be used more often and recursively walking can be a performance problem.

For brainstorming purposes, as cargo is intended to be opinionated, I wonder if we could speed up that walking by enforcing good practices by (1) checking <path>/Cargo.toml and then (2) checking <path>/**/<package-name>/Cargo.toml. This would require a file system walk but wouldn't require us to read every manifest to extract package.name to see if its one we care about (we could also implement a partial manifest parser to extract the name but that duplicates knowledge, making it harder to maintain).

Thanks @epage for the background here. It makes sense to me that there is a simple path source.

I think I'd be good with a few different solutions, in order of increasing consequences:

1) A better error

As a newer user, I found this error message confusing/unhelpful when moving from a git source to a path source:

cargo build
error: no matching package named `bevy_math` found
location searched: /.../Development/bevy
required by package `bevy_xpbd_2d v0.3.2 (/.../Development/bevy_xpbd/crates/bevy_xpbd_2d)`

from the following dependency node in my Cargo.toml

[dependencies]
examples_common_2d = { path = "../examples_common_2d" }
bevy_math = { path = "/Users/jpedrick/Development/bevy", features = ["approx"] }

Something like the following might be more user friendly:

cargo build
error: no matching package named `bevy_math` found
location searched: /.../Development/bevy
required by package `bevy_xpbd_2d v0.3.2 (/.../Development/bevy_xpbd/crates/bevy_xpbd_2d)`
/.../Development/bevy is a cargo package, but "path" sources do not support searching for sub-packages. 
Consider trying: bevy_math = { path = "/.../Development/bevy/crates/bevy_math" }

2) Add a find_subpackage flag to dependency search nodes or provide the subpackage path directly:

[dependencies]
bevy_math = { path/git = ".../path/to/bevy", find_subpackage=true }

OR:

[dependencies]
bevy_math = { path/git = ".../path/to/bevy", subpackage_path="crates/bevy_math" }

And then deprecate with warnings the current default search behavior for git.

3) Highly opinionated sub-package structure: Require root level Cargo.toml. Sub-packages must be in <project_root>/crates/<subpackage_name>.

When defining dependencies you could validate the import by requiring the full path in the dependency node name:

[dependencies]
bevy::bevy_math = { path/git = ".../path/to/bevy" }

Alternatively, a less opinionated option would be to have the root Cargo.toml could have a mapping of sub-packages to paths:

[subpackages]
bevy_math = "crates/bevy_math"
bevy_winit = "crates/bevy_winit"

OR Supply a subpackage_root, which defaults to crates and assume the subdirectory names will match the package name:

[package]
subpackage_root = "my_crates"
jpedrick commented 7 months ago

@weihanglo seems to me walking the file tree or finding the wrong subpackage due to duplicate crate names could be completely avoided by having an opinionated structure or by requiring users to provide the subpackage mapping, whether in the source crate Cargo.toml or as an attribute on the dependency node.

As this was a conscious choice (the Git Source is implemented as a wrapper around Path Source), I would very much want to understand why it was originally designed this way before making changes.

My guess is that the number of files in a Git repository is under control, while an arbitrary path might lead to an unbound walk through the filesystem.

To support this feature, Cargo needs to address the duplicate packages issue (such as #10752), which hasn't really got resolved for git dependency and is poorly documented.

9624 is a relevant issue as well.

epage commented 7 months ago

A better error

I think this could be worthwhile to do so people have an improved experience until and if we decide to do any of the other steps

btw the other steps remind me of some of the discussions around https://github.com/rust-lang/rfcs/pull/3529.

jpedrick commented 7 months ago

A better error

I think this could be worthwhile to do so people have an improved experience until and if we decide to do any of the other steps

btw the other steps remind me of some of the discussions around rust-lang/rfcs#3529.

I agree Perhaps we create a issue just for that item