rust-gamedev / wg

Coordination repository of the Game Development Working Group

Make sure key crates have `crev` code reviews #46

Open icefoxen opened 5 years ago

icefoxen commented 5 years ago

I recently learned about the crev tool, which produces cryptographically signed code reviews to build some manner of web of trust. The main innovation is that code reviews are just text files in git, and each ID includes a list of other IDs whose reviews it also trusts. While it's still a young idea, it appeals to me, and since the WG is a place for people to get together and talk about gamedev ecosystem stuff, it seems like a possible place to coordinate efforts to get trusted reviews for a lot of baseline gamedev-related crates.

I'm sleepy and explaining this poorly. Anyway. Does this interest anyone else?

Ralith commented 5 years ago

Empirically, foundational gamedev crates seem to often be unsound, so this is both important and unlikely to go smoothly (if done right).

Lokathor commented 5 years ago

Seems okay.

One problem is that the reviews are for particular versions (right?) so we'd need these foundational crates to settle a bit more perhaps.

distransient commented 5 years ago

I think this would be great for foundational crates and the cores of engines, but less so for the more feature-y stuff that's likely to be built from scratch at a lot of studios anyway.

icefoxen commented 5 years ago

After messing with crev even a little, I will say, you turn up some surprising stuff in weird places.

One problem is that the reviews are for particular versions (right?)

Correct

...so we'd need these foundational crates to settle a bit more perhaps.

One side-effect I hope for is that more widespread crev use would encourage this settling a little. It's a bit absurd that rand is not at 1.0, tbh.

icefoxen commented 4 years ago

Okay, how do we define "foundational" crates? Besides, naturally, "crates that ggez uses".

In my own crev endeavors (crevdevors?), MOST crates fall into a few categories:

  1. plain safe Rust crates with no unsafe code,
  2. crates that use unsafe code internally,
  3. big system-interface crates that wrap some particular underlying system or external library.

The first type are usually the easiest to review: no malicious build.rs, no unsafe code; go through the code looking for anything obviously fishy, and you're probably in decent shape for at least a minimal review. The second type can be easy or hard, but you need to know a bit about unsafe code to be able to tackle them. The last are hard 'cause they tend to be big, and you need the domain knowledge of that particular system to know if it's being abused.

Edit:

That said, to start with I'm pretty happy with a crate having just two crev reviews, if one is by the author or a major contributor (basically just saying "this is my crate and I did indeed release this version of it on purpose"). And once a crate is reviewed, reviewing a patchlevel update to it is usually pretty easy. And while the system interface crates are big and hairy, there are also generally quite few of them, used very widely, so reviewing them has very high value.

So let me propose a core list of, at least, system crates for gamedev. This list is necessarily not complete, of course.

Core "unsafe stuff abstraction" crates:

Lokathor commented 4 years ago

I'm adding a tag that we want a guide for this one.

I know that you wrote a guide, Icefox, but it was very long. We should try to come up with a guide that is as terse as possible while explaining what to do. I couldn't even remember how to use cargo crev when I tried to do it just now.

icefoxen commented 4 years ago

Yeah, a good HOWTO would be useful. Crev actually has a pretty good one, last I checked. Its command-line flags have mutated a little over the last version or two though, so we should make sure the guides are up to date and such.
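
For the impatient, the basic flow is roughly this (a sketch from memory using the newer subcommand layout; older releases spell some of these as cargo crev review / cargo crev publish, and the fork URL is just a placeholder):

    # one-time setup: install, then create an ID backed by your own fork
    # of the crev-proofs template repo
    cargo install cargo-crev
    cargo crev id new --url https://github.com/YOUR_USERNAME/crev-proofs

    # review a dependency, then push the signed proof to your proofs repo
    cargo crev crate review some-crate
    cargo crev repo publish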

Osspial commented 4 years ago

I think Winit should also get some proper crev reviews as well, given how widely used it is and how it's pretty much all unsafe code by nature. That's a bit of a difficult project for a single person to tackle though, since it's going to be hard to review code that you can't personally test.

Lokathor commented 4 years ago

I would argue that a code review shouldn't generally involve executing the code (other than perhaps running the tests, but I'm assuming those get run by the maintainer)

Osspial commented 4 years ago

I'd argue that it's important to be able to run code when reviewing it for unsafe code in particular, since:

a) being able to check code with Valgrind or similar memory-safety tools can give insights a standard review wouldn't, and
b) it's more difficult to verify that code is actually unsound if you can't modify it and add debug code when you notice something suspicious.
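
As a concrete (and admittedly rough) sketch of what I mean, assuming you've grabbed the crate's source locally and have the usual nightly tooling set up (exact flags and targets may differ):

    # run the crate's own tests under Miri to catch UB in unsafe code
    rustup +nightly component add miri
    cargo +nightly miri test

    # or build the tests with AddressSanitizer enabled (nightly only)
    RUSTFLAGS="-Zsanitizer=address" cargo +nightly test --target x86_64-unknown-linux-gnu

    # or build the test binary and run it under valgrind
    cargo test --no-run
    valgrind ./target/debug/deps/<test-binary>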

AlexEne commented 4 years ago

I kept seeing this thread pop up again and again, and today I decided to investigate this more seriously.

As a spoiler, I am still slightly confused about the purpose of this tool :D. It overlaps with cargo-deny and clippy, and I also don't really understand what having a "web of trusted reviewers", with code reviews being files in git, actually does for me as a crate maintainer.

Even their page states that:

This tool helps Rust users evaluate the quality and trustworthiness of their package dependencies.

Ok, that's great. Then it goes on to state:

cargo-crev can already:

warn you about untrustworthy crates and security vulnerabilities,
display useful metrics about your dependencies,
help you identify dependency-bloat,
allow you to review most suspicious dependencies and publish your findings,
use reviews produced by other users,
increase trustworthiness of your own code,
build a web of trust of other reputable users to help verify the code you use,

Isn't cargo-deny (a newer option built specifically to deny/ban and warn you about shady crates) a better fit for that, rather than getting it as a side-effect of a code review tool?

Then I went on to the getting started guide and I will show you an example here:

date: "2018-12-19T22:00:24.644210896-08:00" from: id-type: crev id: FYlr8YoYGVvDwHQxqEIs89reKKDy-oWisoO0qXXEfHE url: "https://github.com/dpc/crev-proofs" package: source: "https://crates.io" name: either version: 1.5.0 digest: uBbgCVotv_8z4SEOjremFmvMG4JPhUROC19OLjPPLNE review: thoroughness: medium understanding: high rating: strong

So it seems to keep track of thoroughness / understanding, and these fields are entered by humans. Ok sure, so each crate has a rating of qualityA and qualityB. Now, these things are obviously subjective measures; even the docs are encouraging people to be truthful.

Some info and guidelines that confuse me even more:

While crev does not directly expose you to any harm from entities you trust, adding untrustworthy entities into your Web of Trust might lower your overall security and/or reputation.

On the other hand, the more trustworthy entities in your Web of Trust, the broader its reach and the more data it can find.

Your Proofs are cryptographically signed and will circulate in the ecosystem. While there is no explicit or implicit legal responsibility attached to using the crev system, other people will most probably use it to judge you, your other work, etc.

This semi-gatekeepy vibe that I get from reading the docs, coupled with the fact that it seems to be one of those devices with 1000 features, has led me to conclude that I should just stop trying to understand it.

After having spent an hour actually trying to understand this concept, I feel just as confused as when I started, and I still don't understand how it helps game development in particular.

Honestly, cargo-deny is a way better tool for things that matter, in my opinion. Speaking of trust, I probably trust that tool as it came out of Embark and I highly regard the people there :).

In any case, if anyone can explain the purpose of this thing and how it relates to gamedev, I would super appreciate it.

Lokathor commented 4 years ago

cargo-deny is useful, but it's almost entirely based around deciding whether crates belong in your dependency graph or not based on their license. There isn't any system there that really lets you rate "should you trust this crate?".

cargo-crev is a code review system that doesn't care about licensing. Instead it lets you build up a web of reviews for a crate, and it will evaluate your web to give you a listing of how much you should trust a crate to properly do what it claims to do. If a reviewer cares to use license as a review criterion they could do that, I suppose, but generally a crev review is focused on the security/safety of the crate, not licensing.

This is not a tool that is specific to just gamedev, but crate quality is a general concern that we should make sure gamedev crates are handling properly, the same as we're generally obligated to make sure that all gamedev crates follow SemVer, provide good and truthful docs, and meet other crate quality concerns.

AlexEne commented 4 years ago

I do get that, and usually the license is kind of a make-or-break thing for a crate if you want to actually ship something that's not just another open source crate.

But for example, let me be more specific with the bits I don't understand:

  1. I maintain a crate that has some dependencies on other things; what incentive do I have to use cargo-crev?
  2. Do these crate reviews, based on the 3 qualities provided, differ in any way from popularity? I am really curious about this. E.g. if I instead rate a crate by GitHub stars or by the number of crates that depend on it based on crates.io data, is it less/the same/more "trusted" than what the stats from cargo crev would say? Do these things ever differ?
  3. How does it practically enforce security? As far as I can tell from the docs it's a human that enters that rating. To me this part actually makes no sense.
  4. Is there a practical example somewhere of how this is/was useful?

bitshifter commented 4 years ago

cargo deny recently added support for denying based on security advisories and abandoned crates, which might overlap with what crev offers? https://github.com/EmbarkStudios/cargo-deny/blob/master/README.md#crate-advisories---cargo-deny-check-advisories
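
If I'm reading the README right, it's driven by a deny.toml in the project and run as its own check, roughly (untested sketch; see the linked README for the real details):

    cargo install cargo-deny
    cargo deny init                # generates a deny.toml template to fill in
    cargo deny check advisories    # checks dependencies against the RustSec advisory database
    cargo deny check licenses      # the existing license checks remain a separate check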

Lokathor commented 4 years ago

@bitshifter That is neat. I don't think it makes crev any less valuable.


@AlexEne I love the numbers because then I can also use numbers to sort my reply:

1) If you care about the dependencies you use then you have a natural incentive to check on them with various tools, this among them.

2) I have never nor would I ever use Github stars as a measure of anything useful. Of my 34 stars, two are for things in Rust. I simply do not star repos after I've examined them as a way to mark them as good or bad. crev is not for measuring popularity. It is a tool for helping approximate how much you should trust the code of a crate to work properly.

3) All security is human based in the end. You're trusting people who have written reviews or who have transitively decided to trust the reviews of others. If there is a rustsec security report and you accept what it claims, that's also you choosing to trust someone.

4) The initiative is mostly small right now. I and others have found it useful, but if you aren't the kind of person who carefully inspects their dependency tree, maybe you don't think it's useful.

icefoxen commented 4 years ago

crev and cargo-deny have some overlap, but to me they seem like complementary tools: cargo-deny is there to let you forbid crates based on license or your own opinion, and crev is there to help you decide which crates to forbid based on other people's opinions. You choose whose opinion you care about with crev: as you already said, "I probably trust that tool as it came out of Embark and highly regard the people there". The goal of crev is to let you formalize and automate that.
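
In practice that "formalize and automate" is just a handful of commands (approximate; subcommand names have shifted between versions, and the URL is a placeholder for whoever's proof repo you decide to trust):

    # pull in a reviewer's proof repo and add them to your web of trust
    cargo crev repo fetch url https://github.com/SOMEONE_YOU_TRUST/crev-proofs
    cargo crev trust https://github.com/SOMEONE_YOU_TRUST/crev-proofs

    # see which IDs crev knows about and which ones you currently trust
    cargo crev id query all
    cargo crev id query trusted

    # then re-evaluate your dependency tree against that web of trust
    cargo crev crate verify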

AlexEne commented 4 years ago

GitHub stars was meant as a funny example, but let's take a crate's usage numbers and its crev rating; I just want to know if they ever differ :). I don't know how to check that now.

For security I'd strongly suggest people go with what cargo-deny does, or tools like address sanitizer, or other automated checkers, rather than "some person with a reputation of X took a look at it and said it's good enough" :D. This is not how you assess security-related things.

I do see value in automated things like clippy / cargo deny or other tools like thread/address sanitizer, but to me, this is at best a different kind of popularity contest, and I am still not sure of the usefulness.

For the people that mentioned they found it useful: what actions did you take based on this tool's output? Did you replace a dependency? Get a confirmation that your dependency is good, etc.? How exactly do you use it? Do you use it to review things / check the quality of the dependencies you have? From what I see, the status on each of the "quality" markers can be low/medium/high. What does this even mean? What does medium mean? It's at best as useful as Uber ratings from what I see.

I guess my ultimate point here is: is this really a thing we (the gamedev working group) should encourage people to use (I don't see it promoted by the other WGs)? For example, is this a thing we expect a game studio to use?

Lokathor commented 4 years ago

I don't understand in what way you think there's some sort of popularity contest, and until that's cleared up I don't think any of the rest of your points need to be addressed because I think they all stem from the same confusion.

AlexEne commented 4 years ago

I think my questions were clear and targeted, regardless of my first impression of this tool. I will repeat them here:

For the people that mentioned they found it useful: what actions did you take based on this tool's output? Did you replace a dependency? Get a confirmation that your dependency is good, etc.? How exactly do you use it? Do you use it to review things / check the quality of the dependencies you have? From what I see, the status on each of the "quality" markers can be low/medium/high. What does this even mean? What does medium mean? It's at best as useful as Uber ratings from what I see.

I guess my ultimate point here is: is this really a thing we (the gamedev working group) should encourage people to use (I don't see it promoted by the other WGs)? For example, is this a thing we expect a game studio to use?

Lokathor commented 4 years ago

I chose to trust only a very small number of people and so none of the crates that I bothered to check had any existing reviews. Instead I found myself reviewing crates.

MaulingMonkey commented 4 years ago

GitHub stars was meant as a funny example, but let's take a crate's usage numbers and its crev rating; I just want to know if they ever differ :). I don't know how to check that now.

Plenty of common crates are chock full of unsafe if not outright unsound code. Popularity is a poor approximation of quality.

For security I'd strongly suggest people go with what cargo-deny does, or tools like address sanitizer, or other automated checkers, rather than "some person with a reputation of X took a look at it and said it's good enough" :D. This is not how you assess security-related things.

Automatic checks are useful but insufficient. Part of a proper security assessment/review of a crate is considering whether it has good miri/asan/fuzz/test coverage, and whether those are employed in a way that's sufficient to catch the kinds of errors I'd expect the crate to run across.

For the people that mentioned they found it useful: what actions did you take based on this tool's output? Did you replace a dependency? Get a confirmation that your dependency is good, etc.? How exactly do you use it? Do you use it to review things / check the quality of the dependencies you have? From what I see, the status on each of the "quality" markers can be low/medium/high. What does this even mean

It has triggered manual source code review of various crates for me, resulting in several issues filed and fixed against crates due to unsound APIs. Quality markers are admittedly a bit fuzzy - I define how I'm using them in the readme of my crev-proofs repository, and then further go on to enumerate my exact rationale in the comment section of my review... which I've been told is handy by some of the people who read said reviews, and is also handy when re-reviewing a crate on account of a new release. I've chosen to avoid or replace some dependencies as a result as well.

XAMPPRocky commented 4 years ago

I'm not sure how this is related to the gamedev WG, or rather what actions the gamedev WG is supposed to take. Yes, these crates should be dependable and sound, but the same is true for a lot of crates. I think it would be more productive if gamedev collaborated with the Unsafe Code Guidelines working group and the Secure Code working group (who already perform security audits and publish crev reviews) on improving these crates, rather than trying to take this work on entirely themselves.

Lokathor commented 4 years ago

The UCG-wg is not about doing code reviews, and they almost for sure would not be interested in doing any of this work.

The Secure Code wg are actually already listing crev as a tool to use, and one of their members was the one to push people to using crev more often on the Community Discord (Shnatsel, actually; you can see them as the person who merged PR #2 on that repo). Their cajoling eventually led to this issue here. However, there is simply far too much ground for the Secure Code wg to cover all of it; each Rust domain (web, gamedev, embedded, etc.) needs to pick up some of the burden for the good of the ecosystem.

Lokathor commented 4 years ago

Clarification

It came up in the wg Discord channel from Alex that there has been some miscommunication about who does this work.

The process of doing reviews is not an extra burden for library authors, and absolutely does not place any additional work requirements on them.

If they choose to put out a review of their crate that's fine, but if they choose to not participate that's also fine. A review of a crate can be done by anyone at any time, and each person's personal web of which reviews they choose to trust is accounted for by crev.

icefoxen commented 4 years ago

IMO crev mainly adds another layer of protection against supply chain attacks -- someone steals your crates.io credentials and updates one of your popular crates with malware in it. crev can notice it automatically and maybe tell something like cargo-deny to not use that version until someone has actually looked at it. But that only works if you have a baseline of existing reviews to go off of.

However, "what actions did you take off this tool's output" is really a question that hits the heart of the matter, so thanks. Personally I've used its feedback to replace some dependencies with simpler ones (rand), make some (fairly minor) bugfixes to certain deps, and start replacing deps only to realize upon digging deeper that they're actually pretty reasonable (image). Plus I try to add reviews for my own crates so that in the case of the sort of supply chain attack above an attacker would have to compromise two accounts instead of one, for whatever that's worth. ...also, rummaging through weird crates to see what lurks in the bowels of the dependency tree is fun.

As for review quality guidelines, I find the crev guidelines pretty pragmatic and understandable. They're in the template for creating a crev review so if you're creating a review you'll have a hard time not seeing them. It may be possible to make it better, but I'm nowhere near as patient as MaulingMonkey is.

@XAMPPRocky I made this issue here because in my experience there's a decent but not huge assortment of crates that are very heavily used in gamedev and maybe don't get much use outside of it, so we have a real ecosystem to work with that is specialized enough to be useful and small enough to be feasibly tackled. We definitely should be working with other groups too, but I didn't even know the rust-secure-code WG existed. So I guess this WG is already doing its job. Thanks!

So, maybe this issue should be broader: "how should the gamedev WG make sure key crates are high quality?"

AlexEne commented 4 years ago

Thanks, things have been clarified, so thanks all for the input on this. I think that's a great question to tackle; crev is one of the tools that can offer some benefit in this quality space.

I am glad people posted the actions they took based on it, as examples of usage are really lacking in the official docs. This really helps me (and I hope others) understand how things should flow: who does the review (I really thought for the longest time that the ask was that I, as the hypothetical crate provider, have to provide reviews for my dependencies and publish them), how that review is used, and what actions people take based on them.

In the end, it does have the implicit assumption that people are using this in the first place to assess the quality of something, or that the reviews out there are discoverable (or valuable, and here comes the trust part), so there are still rough parts in this space. E.g. how do I find @Lokathor's reviews, etc.? I know, I probably should go read the docs again, but as a user I do expect things to be easier than they are today, so it still has a lot of rough edges. Is it based on GitHub username? Is it expected for me to ask him what his reviewer id / repo is, etc.? (I really don't remember how this should work, so please let's not dwell on this comment.) It's just meant to be an example of a rough edge that might be unclear to users trying this thing out.

Next steps I propose

I think for the particular issue here, I propose we have a way to track these reviews that we (some people of this WG) have started doing. As I said, for a long time it was unclear to me even who does these reviews. Maybe let's unify them in a repo here? (Just an idea.) The second thing (a bit more important, and meant to address the "how" in the question above) that may be useful is to put in place some guidelines on how to review something (e.g. to avoid the issue of what a "medium" means in an assessment). That way we also keep a high and consistent bar when reviewing these things.

MaulingMonkey commented 4 years ago

So, maybe this issue should be broader: "how should the gamedev WG make sure key crates are high quality?"

I think something goal oriented like this is very reasonable. Starting with the very basics: I want to discover, use, and to be able to recommend, high quality crates. "High quality" here includes safe and sound - no security vulnerabilities, not a crashy mess, and sound APIs so my code is also safe and sound for the same reasons.

Why use crev?

To find what crates I'm already using, and if I've audited the code before. To find and share reviews with an admittedly currently very niche audience - other crev users.

To Alex's point - crev reviews are not particularly readable, nor easily discovered by the wider audience of Rust users. To this end I've been aiming to duplicate my reviews in a github repository, so you only need a browser to read: MaulingMonkey/rust-reviews/reviews/arrayvec.md is a good example of what I'm striving for. The root index is a work in progress and partially automated - I'm working on improving that too; I need to reimplement categorization.

In the past I've had long rambling comments in my crev reviews, but going forward those may instead become links to reviews on github - where they're more easily readable. Crev would still be useful for coordination and verification and stuff.

Why review crates?

There's a bunch of sites out there for discovering crates, like arewegameyet, arewewebyet, areweguiyet, crates.io, lib.rs. But I want something more curated... I want to know the pros/cons of alternatives, their strengths/weaknesses, opinions as to if someone other than the author can recommend the crate to me.

There's not a lot of that kind of curation going on, so I'm resorting to doing so for myself, for my own benefit. I share my curation list to make it easier to recommend things to others (I don't have to repeat or recall what I liked about a crate).

Why audit code?

There's a lot of unsafe and unsound Rust code out there, including in some of the crates I'm using. If I'm being particularly paranoid, I'd like to catch malicious code as well. But just not wanting to miss a project deadline because I had to spend two weeks hunting down a heisenbug crash every couple months is enough reason to want to start auditing code.

What should the WG do, specifically?

So I just listed a whole bunch of personal goals and actions. What needs more than just personal action? Well, from this thread, coordination and discoverability seem like priorities.

Coordination could perhaps be done through existing channels like Rust Safety Dance, Rust Secure Code WG, RPL Discord's #wg-gamedev channel, the RPL Community Server Discord's #games-and-graphics channel, issues here in this working group, etc.

Discoverability - I don't know of a great way to boost the signal for crev directly. But perhaps when I audit/review a crate and it gets into a state I like, I can suggest cross-linking the review? E.g. arrayvec could add a line to their Readme.md?

It feels a bit too self-aggrandizing to just add my reviews to other people's Readmes, but maybe if there's a few of us...? Or established precedent? Could also do https://shields.io/ badges, link crev itself to boost the signal of that specifically, ...

(Mock-ups of what such a Readme line or badge could look like: "Reviewed by: MaulingMonkey (crev), Lokathor, icefoxen, ..." -- or shields.io-style badges reading "crev | MaulingMonkey, Lokathor, icefoxen" and "crev | MaulingMonkey, Lokathor, icefoxen, Ralith".)

XAMPPRocky commented 4 years ago

So, maybe this issue should be broader: "how should the gamedev WG make sure key crates are high quality?"

I think framing the quality of these crates purely in terms of crev reviews is a very narrow perspective. Reviewing code is just one part of getting these crates to a high quality.

I made this issue here because in my experience there's a decent but not huge assortment of crates that are very heavily used in gamedev and maybe don't get much use outside of it

I'm sure that's true; however, I would say that none of the crates you listed above fit that description. I think it would be more helpful to mention which specific crates you think need improvement (or are missing) and need to be stable for gamedev specifically, rather than taking a generalist approach.