rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

Want config option for definitively specifying local crate paths #6713

Open ijackson opened 5 years ago

ijackson commented 5 years ago

Problem

It is very usual when doing development to have local unpublished dependencies. Also, when cargo is being run as part of some larger build system, the larger build system typically wants to control what dependencies are used.

cargo does not offer a good way to handle this situation. The config option paths looks like it would be good for this. However, it does not work if the crate is not in the registry (or if the metadata in the registry is not appropriate for the local version).
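For readers who have not used it, here is a minimal sketch of the paths option referred to above, as it would appear in a Cargo config file. The directory is a made-up example.

```toml
# .cargo/config.toml (older setups use .cargo/config)
# Hypothetical location of a local checkout of a crate.
paths = ["/home/user/src/my-local-crate"]
```

As described above, this only swaps in a local copy of a package that cargo can already resolve from elsewhere, so it does not help with a crate that has never been published.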

It is of course possible to edit Cargo.toml in each crate to specify the local pathnames of all these local crates. But this is not good:

This last problem is quite severe. It interferes with proper use of version control. The workaround for that involves committing paths on the local computer to git. This is of course quite unclean and tends to cause these local paths to leak out of their appropriate context (which was the computer they started on).

I guess right now many people commit these local paths to git, presumably meaning to strip them out later. AIUI this has even resulted in a workaround on crates.io which according to some sources now strips path entries out of Cargo.toml during the publication process.

Proposed solution

What I think is needed is an option like paths but which is processed before the registry is consulted, before the dependency graph is processed, and so on. The information in the new config option should completely override anything in the registry or in any Cargo.tomls.

This should work even if the registry information is unparseable - that way this can be used as a workaround for corrupted information in the registry.

Ideally the new option should permit specifying the fallback behaviour on a per-crate basis (ie, what happens if for any particular purpose none of the new_path crates are suitable: either an error, or falling back to the registry).

For reference, here is the most useful stackexchange question which seems to discuss this: https://stackoverflow.com/questions/33025887/how-to-use-a-local-unpublished-crate

Nightmarish workaround

In the meantime, I have worked around this problem with an absolutely horrific shell script.

My shell script is a wrapper for cargo. On each invocation it reads a file containing a list of crate names and paths, and edits (using sed!) all the Cargo.toml's to specify the right path. It then runs cargo. When cargo is done, it puts everything back, so your git trees are all left clean (although they were dirtied during the build).

I have attached the script, in case it is useful to anyone else. It is also here: https://www.chiark.greenend.org.uk/ucgi/~ianmdlvl/git?p=nailing-cargo.git;a=blob;f=nailing-cargo;h=9bf28b57d21212edeefebddb52b2d637f18fe1f3;hb=HEAD

The seddery is a particularly bad feature of this script. It worked for me with the crates I cared about. Also the script is full of clone-and-hack and its fd handling is too ad-hoc. I would love to throw this thing away.

nailing-cargo.txt

joshtriplett commented 5 years ago

(Note: edited this comment to expand it and make it clearer.)

Another alternative here would be the [replace] section (deprecated, use [patch] instead) in your crate, which you can use to replace bits of the dependency graph, and you'd only have to edit the top-level crate, not any of the dependencies. That section automatically gets stripped out when you publish your crate to crates.io.
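A minimal sketch of the [patch] form of this suggestion, written in the top-level crate's Cargo.toml. The crate name, version and path are placeholders, not taken from this thread.

```toml
# Cargo.toml of the top-level crate
[dependencies]
some-dep = "1.0"    # hypothetical crate name and version requirement

[patch.crates-io]
# Use a local checkout instead of the copy from crates.io.
# The package found at this path must still have a version that
# satisfies the "1.0" requirement above, or the patch goes unused.
some-dep = { path = "../some-dep" }
```

Because this lives in Cargo.toml, it is still a tracked file that contains a local path.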

As another alternative, you could create your own directory registry: make a directory containing a symlink farm to the directories in /usr/share/cargo/registry, and then add symlinks in that directory to your local crates. As far as cargo is concerned, those local crates will then be in the registry.

Please let me know if one or the other of those works for you.

(The eventual thing you'll want is alternative/custom registries, which will let you have a local registry with completely different packages that gets consulted before crates.io. That isn't quite there yet, but it's close.)

ijackson commented 5 years ago

Hi, thanks for looking at my feature request.

Firstly I think I want to reframe this conversation. I get the impression that you are trying to help me solve my practical problem. Thank you. That is kind.

However, my purpose for filing this feature request was not to get help. It was to improve the experience for other users of cargo.

I think these linked use cases "local unpublished crate" and "override crate from crates.io" are very important and that doing both or either of these things should be super convenient and easy.

(I come at this from a Free Software background. I expect that a Free Software project will make it as easy as possible for me to locally modify the software I am using; also being able to work locally is important for a lot of practical reasons.)

I think that using a local unpublished crate, or overriding a crate from crates.io, should be easily possible:

This is not a problem that is unique to me. My attempts to solve it for myself involved a lot of reading and I found that this is a common question on online fora.

The usual answer is to modify appropriate Cargo.toml's to add path overrides. This meets all the other requirements. But it is undesirable for the reasons I have explained. We should not be creating a situation where users are expected (and encouraged) to commit, to their crates' git branches, directory paths on their own computer.

I don't think any of your suggestions below meet these requirements. A new configuration parameter (a table, probably) is needed.

Josh Triplett writes ("Re: [rust-lang/cargo] Want config option for definitively specifying local crate paths (#6713)"):

I think you can do this using the [replace] section in your crate ... you'd only have to edit the top-level crate, not any of the dependencies.

Ah, I didn't know about that. Using that would somewhat simplify my workaround script. However, it is still needed because I don't want local paths in any git branch.

And, you are assuming that there is one top-level crate, which is not really true. For example, in a recent project I found I had to fix a bug in one of my dependencies so I not only needed to make my own crate point to the local copy of the dependency, but also build the dependency itself for its tests etc. before submitting an MR.

With your scheme that would have meant committing [replace] to the dependency's Cargo.toml, and working there with a branch containing both commit(s) for the crate's upstream, and those edits to make it build in my environment. Quite a nuisance. It risks me embarrassing myself, by sending my local nonsense commit upstream.

That section automatically gets stripped out when you publish your crate to crates.io.

I think the need for this in crates.io just demonstrates why this is a bad idea. Ideally crates.io would have rejected such things rather than laundering them, since their presence is clearly a bug. People who get the actual source code from gitlab or wherever should not get these wrong overrides either.

The problem is that introducing this kind of bug in one's Cargo.toml is by far the easiest way of handling this very common use case.

As another alternative, you could create your own directory registry: make a directory containing a symlink farm to the directories in /usr/share/cargo/registry, and then add symlinks in that directory to your local crates.

I considered that this might be possible but I decided it was probably too complex. Also, the best documentation I could find for how to do something like that is this:

https://doc.rust-lang.org/cargo/reference/source-replacement.html

| A directory source is just a directory containing a number of other
| directories which contain the source code for crates (the unpacked
| version of *.crate files).

But of course our user has no idea what a *.crate file is like. I can't seem to find the documentation for that in the Cargo Reference. I do find this in the section on Directory Sources:

| Each crate in a directory source also has an associated metadata
| file indicating the checksum of each file in the crate to protect
| against accidental modifications.

That is obviously going to be a nuisance. The user wants to be able to modify any of their local crates' source code and rebuild, quickly and easily, just by running `cargo build'.

Stepping back slightly, this `directory source' looks like quite a serious and complicated thing to be doing. But our user is trying to solve a very common problem. Like me, our user will probably reject this as looking too complicated.

Please let me know if one or the other of those works for you.

Well, as I say, this is not really about me. If it were just about me I would have just kept using my wrapper script and got on with my life.

This is about all the users who want to do this apparently simple thing. And no, for the reasons I have explained, those solutions would not work for me (and I did already consider and reject many of them), and I don't think these solutions work for others either.

Ian.

-- Ian Jackson ijackson@chiark.greenend.org.uk These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is a private address which bypasses my fierce spamfilter.

mattheww commented 5 years ago

I agree with the submitter that it's worth providing a way to override Cargo.toml without editing it.

I've been using git smudge/clean filters to avoid putting local configuration in the committed Cargo.toml, and I don't think that sort of thing ought to be needed for something as common as working with a not-yet-released library.

fanzeyi commented 5 years ago

We are now facing a very similar problem as described in this issue. We need to override a local unpublished crate due to the use of different build systems.

The background is that we are building Rust with three different build systems (Cargo itself, one that calls into Cargo, and one that doesn't use Cargo at all -- it uses rustc directly), and the one that uses Cargo under the hood rearranges the directory structure for other purposes before invoking Cargo. This causes a problem because it creates a different directory structure from the one we have when we build directly with Cargo. In the Cargo.toml, we specify the location of that crate with a relative path. Since it is an unpublished crate and specified in the form crate = { path = "../../../path/to/crate" }, we are unable to override its path other than by making the build system edit the Cargo.toml directly -- which isn't a sustainable solution and is something we'd love to avoid. (Cross-platform compatibility is also something we need, so it couldn't be a shell script doing the editing :/)

I tried to use [patch], [replace], paths in .cargo/config, and making a local registry as suggested in the comment above, but none of these work because it is a local non-published crate.

I want to know if the Rust/Cargo team would accept such changes to allow overriding paths to local unpublished crates. I will be at RustConf this week in Portland, and we definitely want to chat about this in person if possible.

ijackson commented 5 years ago

Zeyi Fan writes ("Re: [rust-lang/cargo] Want config option for definitively specifying local crate paths (#6713)"):

I want to know if the Rust/Cargo team would accept such changes to allow overriding paths to local unpublished crates. I will be at RustConf this week in Portland, and we definitely want to chat about this in person if possible.

I won't be at RustConf. But I would be very happy if RustConf could be an opportunity to find a way to improve this area.

Ian.


kankri commented 4 years ago

I'm trying to find a good workflow for working with either unpublished or locally modified crates. cargo-edit-locally seems to be (partially) trying to solve a similar issue, but it is also making temporary edits to Cargo.toml.

What if cargo would support Cargo.local.toml which you could put in your .gitignore? This file would contain local/temporary overrides to Cargo.toml edited by hand or by a tool like cargo-edit-locally.

xpepermint commented 4 years ago

I imagine this feature to work as npm link.

Eh2406 commented 4 years ago

The eventual thing you'll want is alternative/custom registries

Note that this is stable now.
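As a rough illustration of what using an alternative registry involves, the sketch below shows the two pieces of configuration. The registry name, index URL and crate name are all hypothetical.

```toml
# .cargo/config.toml: declare the registry (name and URL are placeholders)
[registries.my-registry]
index = "https://example.com/git/index"

# Cargo.toml: depend on a crate published to that registry
[dependencies]
some-dep = { version = "1.0", registry = "my-registry" }
```

Note that a crate still has to be published to the registry before it can be used this way.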

AndreKR commented 4 years ago

The eventual thing you'll want is alternative/custom registries

Note that this is stable now.

Private registries are great (actually absolutely essential IMO) but I'm not quite sure how to use them in the situation where you want to have unpublished changes in one crate (which can be either completely unpublished yet or be published but without the changes) and you want to try out those changes in another project. Wouldn't you have to publish (to your local registry), change a few characters, publish again, change something again, and so on?

joshtriplett commented 4 years ago

@AndreKR Directory registries let you just point Cargo to a directory full of (potentially modified) packages and have Cargo use them; that directory could be your local development directory.

@fanzeyi Can you please clarify how you tried to use [patch] and [replace], and how that didn't work?

AndreKR commented 4 years ago

@joshtriplett That would be perfect. (That's how it works with Composer and it used to work with Go (before modules).) How would one go about configuring this? I only found https://doc.rust-lang.org/cargo/reference/registries.html and according to that you need a Git repository with an index file and you have to actually publish to that "registry" to get the packages there.

joshtriplett commented 4 years ago

@joshtriplett That would be perfect. (That's how it works with Composer and it used to work with Go (before modules).) How would one go about configuring this?

https://doc.rust-lang.org/cargo/reference/source-replacement.html#directory-sources
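A minimal sketch of the configuration that page describes, assuming a directory of unpacked crates already exists. The source name and path are placeholders.

```toml
# .cargo/config.toml
[source.crates-io]
replace-with = "local-dir"    # "local-dir" is an arbitrary, hypothetical name

[source.local-dir]
directory = "/path/to/crate/directories"    # hypothetical path
```

This only shows the config side. As far as I know, `cargo vendor` is the usual way to produce such a directory, and it writes a `.cargo-checksum.json` file into each crate directory.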

ijackson commented 3 years ago

As I mentioned in https://github.com/rust-lang/cargo/issues/6713#issuecomment-469696481 the docs for directory registries say this:

Each crate in a directory source also has an associated metadata file indicating the checksum of each file in the crate to protect against accidental modifications.

That does seem to suggest that

Wouldn't you have to publish (to your local registry), change a few characters, publish again, change something again, and so on?

Is that wrong? In that comment I also explained some other problems that I see with trying to use directory registries to solve this problem.

joshtriplett commented 3 years ago

You don't actually have to use the checksum. It's quite possible to have a directory for all your dependencies, use that directory as a directory registry, and modify the sources in it.

While that's not something that should be used for large-scale package forking for many people (e.g. distributions of packages), it's absolutely usable for local experimental patching (e.g. "see if something works before it goes upstream").

WeirdConstructor commented 3 years ago

I got the same problem here, but with crates that I only published to some git repository. Currently my crates are not ready for publishing, and so I specify dependencies by their git repository (and branch). I even got a few git dependencies where I work on a fork of some other crate. But I work with path = ..., because those paths contain the most recent changes without running cargo update. Changes on the main project often need new features in my dependencies, and switching tabs in an editor should not be interrupted by a git commit (with potentially broken changes) and a cargo update.

So I end up manually commenting in and out dependencies in my Cargo.toml, which leads to me pushing a Cargo.toml with local paths in it, just to eventually push a change to fix those dependencies.

This case is not covered by a local repository.

I would much rather have a config file somewhere that does not get pushed, that lets me override this. A Cargo.local.toml or ~/.cargo* would be fine for overriding where to find a dependency.
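For what it is worth, a sketch of how a git dependency might be redirected to a local checkout from an uncommitted config file. It assumes a Cargo version that supports [patch] tables in config files. The URL, crate name and path are made up, and I have not checked this against the exact workflow described above.

```toml
# Cargo.toml (committed): the dependency is named by its git repository
[dependencies]
some-dep = { git = "https://example.com/me/some-dep.git", branch = "main" }

# .cargo/config.toml (not committed): patch that git source with a local path
[patch."https://example.com/me/some-dep.git"]
some-dep = { path = "/home/user/src/some-dep" }
```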

Dessix commented 3 years ago

Rust doesn't really have a story for local configurations of contents- this leads to us not actually shipping a Cargo.toml for the root workspace, instead providing a template and generator for setting up your local environment a la CMake, then running into the various issues that this solution precipitates.

Having files that are explicitly "not to be committed" for local configuration has been popularized for its utility by the DotNet, Python, and JS ecosystems, the majority of IDEs, and even git itself. Cargo currently lacks it, but it doesn't seem to present anything as a viable alternative- this isn't a case of "this other process is better", but a vacuum where it feels a solution should exist.

I shouldn't have to modify checked-in files and --assume-unchanged to alter my local tree, nor should I be unable to alter the shape of the tree that is resolved from my project's perspective without pushing binaries to a website.

w4ll3 commented 2 years ago

I'm facing a similar issue. #8747 would actually have solved my problem, but I understand its implications. One suggestion I have would be something like a flag to fall back, e.g.

[dependency.example]
fallback=true
path="../example"
version="0.1"
git="https://github.com/example"

This way it would follow the locations in order (path, crates, github).

The repository I wanted for this behavior to happen is DIDKit where we depend on another crate SSI, for developing new features we usually need to make changes at SSI but for the end-user it's supposed to just build DIDKit, that simply doesn't work without having both repositories cloned.

ijackson commented 2 years ago

You don't actually have to use the checksum. It's quite possible to have a directory for all your dependencies, use that directory as a directory registry, and modify the sources in it.

I don't understand what you mean, when you say "You don't actually have to use the checksum". The checksum is surely used by cargo, not by me. I am presumably supposed to provide it. So maybe you meant "You don't actually have to provide the checksum". But I already quoted the documentation which says "Each crate in a directory source also has an associated metadata file indicating the checksum of each file in the crate to protect against accidental modifications.". That doesn't sound optional. But I already quoted this documentation and you haven't said it's wrong despite me explicitly asking that question. So I feel this conversation is going round in circles.

Also the docs say "Cargo has a core assumption about source replacement that the source code is exactly the same from both sources. Note that this also means that a replacement source is not allowed to have crates which are not present in the original source." and you say "all your dependencies" (emph. mine). But of course one wants to mix crates.io with one's own local changes, which it seems to me would violate that documented assumption and not fit into your "all".

As an aside, the docs keep talking about ".crate files" and "the unpacked version of *.crate files". I assume that "the unpacked version of [a] .crate file" is just like the normal source tree for a cargo project. I looked for documentation for this (searching for "crate format" in the cargo book, for example) but it doesn't seem to be written down anywhere.

While that's not something that should be used for large-scale package forking for many people (e.g. distributions of packages), it's absolutely usable for local experimental patching (e.g. "see if something works before it goes upstream").

As I have explained, as far as I can tell from the documentation source replacement cannot (or should not) be used for this, in the general case (which includes adding new crates, and modifying existing crates, and combining those with unmodified crates from crates.io).

Profpatsch commented 1 year ago

I need a similar feature, I want to provide the crates I depend on via the nix package manager.

rustc gives me the ability to add crates to the library search path via -L, but cargo seems to ignore any such flags and will nonetheless try to download the crate from the internet. This is unacceptable.

Compare the Haskell package manager cabal: It will query ghc-pkg for any libraries that already exist in the package database, and if it can find the right library there it will not use its registry.

epage commented 10 months ago

Tools discussed

Related issues


For the original issue, it was later clarified in a followup comment that this is meant to be a user experience report to guide improvement on common workflows:

without having to commit changes.

[replace] was brought up but had the following problems

What's being discussed sure sounds like using [patch] tables in config files. If replace was almost there, then patch in config seems like it fills the last of the gap. @ijackson does this solve the need? Of course having tools to manage all of this would be great. See some of the related issues for further ideas on improving it.

Depending on the feedback, I feel like this can be closed as resolved and further improvements can be handled in the related issues.
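To make the [patch]-in-config idea above concrete, here is a minimal sketch. It assumes a Cargo version that supports [patch] tables in config files. The crate name and path are placeholders.

```toml
# .cargo/config.toml in the project, a parent directory, or $CARGO_HOME.
# Unlike Cargo.toml, this file does not need to be committed.
[patch.crates-io]
some-dep = { path = "/home/user/src/some-dep" }    # hypothetical name and path
```

The committed Cargo.toml keeps its ordinary `some-dep = "1.0"` style dependency and never mentions the local path.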


Another user commented about a build system shuffling packages around that use path dependencies. This falls flat with many of the solutions because they don't work on the textual level but on being able to resolve the dependencies and then remap them.

This sounds like a very different use case, something that would likely need its own issue. However, it seems specialized enough that it would be strange to shift the burden of that build system's limitation onto cargo rather than having the build system resolve it.


Another user commented on wanting a solution to help with nix. Again, this sounds different enough from OP that it likely should be discussed on its own if something above does not resolve it. It sounds like they are wanting to arbitrarily, implicitly override the registry within a user's account. To me, this sounds like it runs counter to wanting to verify packages are what is intended and I've seen a lot of downsides to projects intermingling their concerns in a user-wide database.

codedcosmos commented 6 months ago

I'm facing a similar issue, except that I don't publish to crates.io at all, only to private git repositories. My imagined approach involves some statekeeping on cargo's part. I think it would be useful if the Cargo.toml kept the two sources for the dependency but a cargo command could be used to switch the state locally. For example:

(Cargo.toml file)

[dependencies]
some_repo = { git = "ssh://git@onlinegit.com/codedcosmos/some_repo.git" }

[replace]
some_repo = { path = "../some_repo" }

Or maybe if you prefer this kind of syntax

[dependencies.some_repo]
git = "ssh://git@onlinegit.com/codedcosmos/some_repo.git" 
replace = "../some_repo" 

# [dependencies]
# some_repo = { git = "ssh://git@onlinegit.com/codedcosmos/some_repo.git" , replace = "../some_repo" }

By default cargo run/test/etc would use the git repo. But if you run the following command:

cargo replace some_repo

Cargo would use the replace value instead. Note that this only applies to your local repo. Running that command wouldn't make any changes that would end up being pushed to git.

cargo reset some_repo

Could possibly be used to restore original behavior.