geiger-rs / cargo-geiger

Detects usage of unsafe Rust in a Rust crate and its dependencies.
https://crates.io/crates/cargo-geiger
1.4k stars 65 forks source link

panic in .toml parser on some repos #83

Closed emilk closed 2 years ago

emilk commented 4 years ago

Version: cargo-geiger 0.9.0

$ cargo geiger
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ()', src/libcore/result.rs:1165:5
stack backtrace:
...
  14: core::result::unwrap_failed
             at src/libcore/result.rs:1165
  15: cargo::util::canonical_url::CanonicalUrl::new
  16: cargo::core::source::source_id::SourceId::new
  17: cargo::util::toml::DetailedTomlDependency::to_dependency
  18: cargo::util::toml::TomlDependency::to_dependency
  19: cargo::util::toml::TomlManifest::to_real_manifest::process_dependencies
  20: cargo::util::toml::TomlManifest::to_real_manifest
  21: cargo::util::toml::read_manifest
  22: cargo::core::workspace::Packages::load
  23: cargo::core::workspace::Workspace::find_root
  24: cargo::core::workspace::Workspace::new
  25: cargo_geiger::cli::get_workspace
  26: cargo_geiger::main
  27: std::rt::lang_start::{{closure}}
  28: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:48
  29: std::panicking::try::do_call
             at src/libstd/panicking.rs:287
  30: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:78
  31: std::panicking::try
             at src/libstd/panicking.rs:265
  32: std::panic::catch_unwind
             at src/libstd/panic.rs:396
  33: std::rt::lang_start_internal
             at src/libstd/rt.rs:47
  34: main
  35: __libc_start_main
  36: _start

it would be useful to get at least the line number in the .toml file that causes the problem.

EDIT: I tried eliminating lines in my Cargo.toml to find the offending code, but even with my whole Cargo.toml commented I got the above error. Weird. geiger works fine in some other repos.

anderejd commented 4 years ago

Thank you for the error report!

I'm guessing there's some new feature in cargo 0.41 that is needed to parse these files. My time is super limited at the moment, would you mind trying updating the cargo dependency to the latest version running the same test again?

emilk commented 4 years ago

On latest master of cargo-geiger I get a better error message: "couldn't get the path to cargo executable". I am trying to run cargo-geiger on a workspace with multiple subcrates and binaries. Running it on a subcrate with just one binary still results in the above error message.

The line is let rs_files_used = resolve_rs_file_deps(&copt, &ws).unwrap(); in run_scan_mode_default in cargo-geiger/src/cli.rs.

I attempted updating the cargo dependency (https://github.com/anderejd/cargo-geiger/pull/85) yet I still get the same error.

anderejd commented 4 years ago

Can you provide a minimal example project that reproduce the error?

emilk commented 4 years ago

What is the best way to approach that task? I just discovered that changing Cargo.toml and re-running cargo geiger does nothing if I don't also delete the Cargo.lock file. It took me way too long to discover that :/

anderejd commented 4 years ago

What is the best way to approach that task?

I'm guessing that you have a private project that you can't share, but if that assumption is incorrect and you can share the problematic project, that would be perfect. To create a minimal example would involve spending time to remove things piece by piece , one dependency and related code at a time, until the error goes away and then go back one step and try to remove other dependencies. So probably quite time consuming and I don't expect you to have that time, but it would be helpful to find this specific error.

I just discovered that changing Cargo.toml and re-running cargo geiger does nothing if I don't also delete the Cargo.lock file. It took me way too long to discover that :/

This is probably expected behavior from the cargo library. The lock file is expected to override the Cargo.toml, as far as I know.

Shnatsel commented 4 years ago

If I change Cargo.toml and then run cargo build, the Cargo.lock file will be automatically discarded and regenerated. I'd expect the same from running cargo geiger.

anderejd commented 4 years ago

This sounds like new issue, created one here: #91

emilk commented 4 years ago

I think the problem might be having a dependency on a private repo:

some_crate = { git = "ssh://git@github.com/secret/private.git", rev = "765dad2a94394c1f9b27a001ab1f0d301ab1f0d3" }

EDIT: yes! If I check out the repo above and do this: some_crate = { path = "/home/me/secret/private" } then cargo geiger suddenly works!

So there are probably three separate things that should be fixed:

1) cargo geiger probably fails to check out the private repo (even when cargo build succeeds). 2) The error message is missing ('called Result::unwrap() on an Err value: ()' should be replaced with Failed to read "ssh://git@github.com/secret/private.git") 3) Failure to read one dependency should not lead to geiger giving up. Instead it should print a (helpful) warning and move on.

pinkforest commented 2 years ago

Different errors here - cargo geiger 0.11.2

No reproduction anymore sadly (?) as of today -

I tried many combinations including the described and some weird ones including expected errors with various run combinations -

The errors today are pretty informative and verbose and I never hit the situation with panic and stack trace on my face.

We've bumped interfaces and cargo (to 0.58.0) quite a bit so it's quite different situation compared to a year ago

Also more important a lot of the panics / error messages have been cleaned up back in end of February that touched metadata stuff

Will close and re-open if this can be reproduced and we can define the underlying issue if there is still one.

Thanks for the effort :purple_heart:

pinkforest commented 2 years ago

@emilk - I think I found the exact code in Cargo where this failed and.. yes.. it's related to Git

Stackstrace had immediate ref to this cargo::util::canonical_url::CanonicalUrl::new

CanonicalUrl in turn uses url::Url

Cargo today - 0.58.0 or latest - does not panic anymore with Git Urls https://docs.rs/cargo/latest/src/cargo/util/canonical_url.rs.html

So doing a bit digging what historically caused this

They were working around https://github.com/servo/rust-url/issues/577 under PR https://github.com/rust-lang/cargo/pull/7787

Someone else noticed this error as well https://github.com/xd009642/tarpaulin/issues/319

There is quite bit of unwrap() and the underlying dep url :: set_scheme changed behaviour between 2.1.0 and 2.1.1 https://github.com/servo/rust-url/issues/577#issue-547714982

The unwrap() on cargo can be found that panic()'s https://docs.rs/cargo/0.41.0/src/cargo/util/canonical_url.rs.html#42

Canonical_url.rs was originally added on

Commit e54541225142b509f5f3b7468897b2c31a7635dc
Date:   Mon Sep 16 12:35:03 2019 -0700

However..

I suspect it is because Url::url::set_scheme() 2.1.1 changed to Err that was not handled in Cargo/canonical_url.rs https://docs.rs/url/2.1.0/src/url/lib.rs.html#1978-2008 https://docs.rs/url/2.1.1/src/url/lib.rs.html#2055-2104

I think the git:// change to https:// was now suddenly considered "special" to throw Err() by https://github.com/servo/rust-url/commit/7efdc53193adfdfd65c1d39bc7ad4762dd4c272b

        // If url’s scheme is a special scheme and buffer is not a special scheme, then return.
        if new_scheme_type.is_special() && !old_scheme_type.is_special() ||

So what did Cargo do? Well they removed Url::url::set_scheme and replaced it with url = format!("https{}", &url[url::Position::AfterScheme..])

Change here showing removal of set_scheme that was suddenly returning Err for unwrap() causing the panic on Git URL: https://docs.rs/cargo/0.41.0/src/cargo/util/canonical_url.rs.html#42 https://docs.rs/cargo/latest/src/cargo/util/canonical_url.rs.html#42`

And this was worked around in Cargo 0.43.0 - sadly it did not end up in Cargo 0.42.0 https://docs.rs/cargo/0.43.0/src/cargo/util/canonical_url.rs.html#42

Tracing the commit where it was fixed

Commit dde27346685e09166967616581aac383918b2c04
Date:   Fri Jan 10 08:00:35 2020 -0800

    Fix tests with `url` crate update

    Works around servo/rust-url#577

diff --git a/src/cargo/util/canonical_url.rs b/src/cargo/util/canonical_url.rs
index 37060bd54..c6f305279 100644
--- a/src/cargo/util/canonical_url.rs
+++ b/src/cargo/util/canonical_url.rs
@@ -39,7 +39,9 @@ impl CanonicalUrl {
         // almost certainly not using the same case conversion rules that GitHub
         // does. (See issue #84)
         if url.host_str() == Some("github.com") {
-            url.set_scheme("https").unwrap();
+            url = format!("https{}", &url[url::Position::AfterScheme..])
+                .parse()
+                .unwrap();
             let path = url.path().to_lowercase();
             url.set_path(&path);
         }
pinkforest commented 2 years ago

Future reminder to myself:

I need to go thru ecosystem the amount of Err(()) and then try to cross-map unwrap() calls to potential Err(())'s And if there are tons of references to stuff that Err's and then unwrap() somewhere on library level it might be valuable to keep track of list of this kind to reduce confusing panic() across the ecosystem and give visibility to devs as to amount of potential panic that can be caused with changes.