rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.63k stars 2.4k forks source link

`cargo vendor` misses some files #14034

Open Samasaur1 opened 3 months ago

Samasaur1 commented 3 months ago

Problem

I'm trying to build this Cargo project with vendored deps. I'm able to build it normally (cargo build), and running cargo vendor exits without an error, but building using the vendored deps fails (cargo build after modifying .cargo/config.toml).

It's pretty clear why it fails to build — cargo vendor hasn't copied the src/generate/templates directory of the tree-sitter-cli dependency, so I see a bunch of errors due to include_str! trying to include nonexistent files.

I'm not a Cargo expert, so I might be missing something, but I didn't see any reason why vendoring the dependencies would miss this subfolder. Hopefully you can point me in the right direction if there's something obvious I missed.

Steps

  1. git clone https://github.com/andrewtbiehl/kramdown-syntax_tree_sitter
  2. cd kramdown-syntax_tree_sitter/ext/tree_sitter_adapter
  3. RUBY=/path/to/bin/ruby cargo build (observe that this succeeds)
  4. cargo vendor (observe that this appears to succeed)
  5. Modify .cargo/config.toml as directed by the output of cargo vendor
  6. cargo build (observe that this fails)
  7. There will also be no folder vendor/tree-sitter-cli/src/generate/templates, although it does exist in the upstream repo

Possible Solution(s)

No response

Notes

No response

Version

cargo 1.76.0 (c84b36747 2024-01-18)
release: 1.76.0
commit-hash: c84b367471a2db61d2c2c6aab605b14130b8a31b
commit-date: 2024-01-18
host: aarch64-apple-darwin
libgit2: 1.7.1 (sys:0.18.1 vendored)
libcurl: 8.6.0 (sys:0.4.70+curl-8.5.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Mac OS 14.5.0 [64-bit]
epage commented 3 months ago

As a quick sanity check

kramdown-syntax_tree_sitter is locked to tree-sitter-cli@0.22.6 https://github.com/andrewtbiehl/kramdown-syntax_tree_sitter/blob/3dd8be0b6fa22a7297247f2aec08149ad18c6423/ext/tree_sitter_adapter/Cargo.lock#L945-L947

docs.rs shows that tree-sitter-cli@0.22.6 does include that folder https://docs.rs/crate/tree-sitter-cli/latest/source/src/generate/templates/

epage commented 3 months ago

I just tried to reproduce this with cargo 1.78.0 (54d8815d0 2024-03-26) and vendor/tree-sitter-cli/src/generate/templates/ exists.

Samasaur1 commented 3 months ago

ack, I tried cargo 1.76 and 1.77. Let me try 1.78

Samasaur1 commented 3 months ago

@epage Still fails for me. Here's my new cargo version --verbose

cargo 1.78.0 (54d8815d0 2024-03-26)
release: 1.78.0
commit-hash: 54d8815d04fa3816edc207bbc4dd36bf18014dbc
commit-date: 2024-03-26
host: aarch64-apple-darwin
libgit2: 1.7.2 (sys:0.18.2 vendored)
libcurl: 8.6.0 (sys:0.4.72+curl-8.6.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Mac OS 14.5.0 [64-bit]
Samasaur1 commented 3 months ago

This is still failing for me on Cargo 1.79:

cargo 1.79.0 (ffa9cf99a 2024-06-03)
release: 1.79.0
commit-hash: ffa9cf99a594e59032757403d4c780b46dc2c43a
commit-date: 2024-06-03
host: aarch64-apple-darwin
libgit2: 1.7.2 (sys:0.18.3 vendored)
libcurl: 8.6.0 (sys:0.4.72+curl-8.6.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Mac OS 14.5.0 [64-bit]
ehuss commented 3 months ago

The problem is that src/generate/templates/cargo.toml exists (notice the lowercase c). Cargo will usually not include nested packages when packaging or publishing, and cargo vendor uses the same logic as cargo publish to determine which files to include. Cargo uses the presence of Cargo.toml to detect these nested packages. However, that detection depends on the case-sensitivity of the local filesystem. Presumably tree-sitter-cli was published by someone on a case-sensitive filesystem (like Linux) where the nested detection didn't work. However, when you run cargo vendor on a case-insensitive filesystem, it won't copy those files.

There are a few issues here:

cargo vendor doesn't exactly duplicate the .crate contents

I've long since felt that it would be best for cargo vendor to just extract the .crate file instead of trying to recompute what the file listing should be. I recall having discussions with someone about doing this (@weihanglo does that ring a bell)? I recall having a discussion about trying to add some kind of API to the Source trait to make that possible. But I can't find the PR or issue (I swear someone started working on it). There are some notes about this at https://github.com/rust-lang/cargo/issues/9575#issuecomment-860928843, and #9555 is kinda similar.

Cargo's logic depends on filesystem behavior

Generally, things can get messy when switching between filesystems that don't have the same case sensitivity. Cargo is often inconsistent on whether or not it is case-sensitive or not. For example, depending on using API's like Path("Cargo.toml").exists() which depends on the filesystem versus filename == "Cargo.toml" which is always case-sensitive.

I think it is unlikely that we can be 100% consistent in all places. However, it might be good to try to think about how we should approach this problem. Should cargo always try to be case-sensitive to avoid these cross-platform issues? Should this package-listing code be case-sensitive or not? If we change the behavior, how do we do it without breaking lots of projects?

ehuss commented 3 months ago

Oh, I remember now. It was https://github.com/rust-lang/cargo/pull/12509#issuecomment-1732415990. Unfortunately based on that comment, it sounds like it might not be easy.