rust-secure-code / wg

Coordination repository for the Secure Code Working Group
149 stars 10 forks source link

Safety-oriented static analysis tooling #22

Open Shnatsel opened 5 years ago

Shnatsel commented 5 years ago

Clippy is great heuristic tool, but does not have many safety or security analysis lints yet. We should extend it with anti-patterns we discover. This is tracked as #24 #27.

However, there is also a need for safety-oriented sound (as opposed to heuristic) tooling. There are currently two projects working in that direction (ignoring formal proof systems that are too cumbersome to use):

  1. Prusti for proving properties such as absence of overflows on unmodified Rust code
  2. MIRAI for static analysis of unsafe code based on abstract interpretation

Both are not usable on real-world code yet and could use some help.

vakaras commented 5 years ago

Thanks for referencing the tools! I would like to clarify their aims and capabilities:

Prusti for proving properties such as absence of overflows on unmodified Rust code

Prusti (project page) is a verifier that aims to support verification of functional properties such as that function max(a, b) returns a value that is not smaller than any of its arguments. While it can be used for proving the absence of overflows, an abstract interpretation based analyser such as MIRAI would be probably more suitable because it has smaller overhead for its user.

MIRAI for static analysis of unsafe code based on abstract interpretation

MIRAI cannot handle unsafe code, and it is unclear how to extend MIRAI to handle it. On the other hand, Viper–the back-end used by Prusti–is designed for verifying concurrent code that manually manipulates memory. Therefore, Prusti would be more suitable than MIRAI for verifying unsafe code.

tarcieri commented 5 years ago

Clippy is great, but is not really geared towards safety or security analysis.

clippy is a great thing to lean on though. In addition to improving clippy's existing security/restriction lints, @manishearth says he'll accept clippy security PRs for widely used ecosystem crates. I'd look to something like gosec for security lints which could be implemented in terms of clippy.

Something I've daydreamed about is a security-oriented static analysis integration suite. Right now I think there are a lot of small tools which do one thing and do it well, but it'd be really nice if there were a way to drive several small security lint tools in unison with an uncomplicated setup process and centralized configuration.

I'm thinking of something along the lines of brakeman in the security space, or gometalinter as a more general lint suite.

I think it'd be particularly useful for the purposes of auditing usages of unsafe. You can imagine a central policy of what unsafe behaviors dependencies (and their transitive dependencies) are allowed to have, driving a combination of cargo-geiger, cargo-crev, etc.

Manishearth commented 5 years ago

@shnatsel where are you getting "clippy isn't oriented towards security analyses" from? We do multiple kinds and don't have an aversion to security analyses.

Shnatsel commented 5 years ago

@Manishearth frankly mostly because I've never seen it yell at me that I was doing something insecure. It might be a completely wrong impression. Is it the case that Clippy is cool with security-related lints, but doesn't have lots of them yet?

Thanks for calling me out on this by the way, I'll update the post.

Manishearth commented 5 years ago

It has a couple security-ish lints, and a lot of the restriction (opt in) lints are useful for security purposes, but in general security practices aren't yet crystallized for rust yet so there's not much it can do there. Like, this WG deciding on things is a prerequisite, because there's a limit to what can be linked (e.g. we can't do global analyses) so the lints have to be evolved in tandem with the crates.

Manishearth commented 5 years ago

Chatted with @bascule out of band about this.

From Clippy's point of view, we're willing to add lints about popular crates (e.g. there's a limit about passing invalid regexes to the regex crate). So if there's thing that's almost unambiguously (false positives are okay, just not too many) insecure with Diesel, we can lint that. If you have a lot of false positives it can be turned into an opt-in lint.

If y'all standardize on crates that help provide security boundaries (e.g. an Untrusted<T> wrapper type that's annoying to unwrap, or a Trusted<T> type that's annoying to construct) we're happy to add lints that deal with them.

Instead of standardizing on crates we can also standardize on attributes, e.g. #[clippy::lint_on_not_const] can be used with Trusted::new() to enforce that its arguments must be consts. Others can use it too.

tarcieri commented 5 years ago

The untrusted crate contains some wrapper types for this purpose, namely a wrapper for &'a [u8] (i.e. untrusted::Input) which might be useful for certain cases of this sort of thing. Namely this crate also provides a set of panic-free operations for acting on / parsing these slices as well.

Not sure what @briansmith thinks of using this crate for general purpose marker types though. My understanding is due to some ergonomics issues with the wrapper types, he's moving away from using this way in ring, but perhaps he has thoughts.

tarcieri commented 5 years ago

I opened #27 to discuss adding security-oriented restriction lints to clippy (which unfortunately appears to be a dup of #24, whoops)

tarcieri commented 5 years ago

Regarding this:

Something I've daydreamed about is a security-oriented static analysis integration suite. Right now I think there are a lot of small tools which do one thing and do it well, but it'd be really nice if there were a way to drive several small security lint tools in unison with an uncomplicated setup process and centralized configuration.

I'm thinking of something along the lines of brakeman in the security space, or gometalinter as a more general lint suite.

What would people think of expanding cargo audit to do more than just lint Cargo.lock against the RustSec Advisory DB?

It could potentially run additional static analysis tools (in parallel!), like the clippy security lints, cargo-geiger, Prusti/MIRAI , etc.

alex commented 5 years ago

cargo audit as a frontend to running cargo clippy --security seems ok to me, but I think the functionality should live in clippy.

tarcieri commented 5 years ago

Absolutely! The idea wouldn't be to duplicate these tools, but make it easy to run several of them on a single codebase (in parallel, and with a unified configuration).

And yes, for clippy I think it would just shell out and collect the results via stdout, ideally in some structured format

Shnatsel commented 5 years ago

I feel a single tool doing everything at once would be overwhelming. It would spit out so many warnings that people would just ignore all of them. The fact that some tools have greater false positive rates than others doesn't help.

That said, we do have a problem with discoverability of tooling. This is tracked as #25

Manishearth commented 5 years ago

That's what lint groups are for, clippy can turn off a bunch of these by default and your tool can turn them on wholesale easily.

Like, this is a very normal problem clippy has already, and we've solved this already.

You really don't want to try maintaining a separate linting library unless you can also get it to be included in the rust build process (unlikely).

tarcieri commented 5 years ago

I feel a single tool doing everything at once would be overwhelming.

I mentioned a few "security suite" static analysis tools I've used in the past with great success:

Brakeman

A popular(!) security-oriented static analysis tool for Ruby/Rails, which was further developed into a commercial offering (which it seems was recently acquired).

In the same way I looked to RubySec when creating RustSec, I think Brakeman serves as a successful and popular example of this sort of tool, and does pretty much what I was describing before, which is its own set of clippy-like AST-based scans, and also checking the project against the RubySec advisory database.

The important features I think keep Brakeman from being overwhelming:

gometalinter

Go has famously fast compile times, but to achieve that it doesn't check many program properties which could usefully prevent bugs in practice. Lots of "do one thing and do it well" tools popped up for many interesting cases (e.g. the equivalents of things like #[must_use] and exhaustiveness checking in switch statements).

gometalinter provides a convenient way to drive multiple static analysis tools for Go projects. It lets you select which tools you want to use, manages the workflow of using them, and runs them concurrently for better performance.

While not a security-oriented tool per se, I think it's another example of a successful "static analysis suite"

tarcieri commented 5 years ago

You really don't want to try maintaining a separate linting library unless you can also get it to be included in the rust build process (unlikely).

I sure don't, and everyone already has clippy! I'd definitely rather lean on clippy for that sort of thing, and then just shell out to it to drive it.

briansmith commented 5 years ago

Not sure what @briansmith thinks of using this crate for general purpose marker types though. My understanding is due to some ergonomics issues with the wrapper types, he's moving away from using this way in ring, but perhaps he has thoughts.

"untrusted" vs "trusted" is too simplistic. We need additional notions like "unparsed" vs "syntactically valid" vs "syntactically valid and with a claim of provenance that hasn't been verified" vs. "trusted for a given set of uses." In addition, even when a binary distinction makes sense, it is problematic to use untrusted::Input as the input type, because if you have 5 input parameters, then all give would have the same type (untrusted::Input), and it's likely that somebody will mix them up. As I refactor my crates' APIs, I'm replacing the use of untrusted::Input with more specific types that take into account the roles and more nuanced characterizations of the inputs' validity. But I'm still using untrusted for parsing variable-length (binary) inputs.