I read through this and it looks great! Comments on the sections that I had thoughts about are below, interleaved with quotes to make it clear what I'm referring to. Some are just noting my strong agreement with a particular point (or that you changed my mind), some are nits, some are providing a counterpoint to the position in the document (which you may have considered and rejected already).

Preconditions

All new code should abide by cargo fmt, cargo clippy, and cargo clippy --tests. If your crate uses features, be careful to ensure that clippy is definitely being run on all of your code.

Not 100% necessary but you could mention --all-features here.

Cosmetic discipline

Spacing

let x = foo();
if !x.is_valid() {
    return Err(Error::Invalid);
}
println!(“{x}”);

let y = baz();
if !y.is_valid() {
    return Err(Error::Invalid);
}
return Ok(y);

A couple of the code snippets use Unicode curly quotes (single and double). It might be better to replace them with ASCII quotes so the code will compile when pasted elsewhere.

Grouping

This is particularly significant where closures are used—if a closure is defined half-way through a function, does not capture anything and then is only used at the end, the reader will have to keep more things in mind for no good reason.

Might be missing context on my part but this sentence read strangely to me because I almost never read or write code that binds a name to a closure. Typically the closure is just passed as a literal to the function that wants it (Iterator::map, thread::scope, etc.). That said, in situations where you do bind a name to a closure (maybe to re-use it?) this seems like good advice.

If no captures are required, consider defining them at the top of the highest possible scope to make it obvious that no closures are needed.

Just a thought: if there are no captures and you're binding a name to the closure, should you consider just defining a fn instead? (If visibility is a concern I personally think fn within fn is fine when used tastefully.)
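To make the suggestion concrete, here is a minimal sketch of the fn-within-fn form. The names clean_all and normalize are made up for illustration and are not from the document.

```rust
// Hypothetical example: `clean_all` and `normalize` are invented names.
fn clean_all(inputs: &[&str]) -> Vec<String> {
    // A nested fn cannot capture local variables, so the "no captures"
    // property is checked by the compiler rather than by the reader.
    fn normalize(s: &str) -> String {
        s.trim().to_lowercase()
    }

    inputs.iter().map(|s| normalize(s)).collect()
}
```

The helper stays invisible outside clean_all, so visibility is no worse than with a closure.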

Pattern match variable naming

To reduce cognitive load, pattern-matched variables should be named consistently, that is, the new variable must be either:

The same as the variable/field it comes from

The first letter of the variable/field it comes from

When matching structs and struct-like enum variants, try to use the original field names.

This initially struck me as too prescriptive but after thinking about it for a bit I can see the argument. If you use this style consistently the total number of distinct identifiers in your program goes down and you can form a stronger association between identifiers and meanings, so there's more room in your working memory when reading code. Nice!

Lifetime parameter naming

I like this section!

✅ Do this:

struct ASTQueryMatch<'cursor, 'tree> { .. }

struct Value<'h> { .. }

This brings up a question: for acronyms in type names, do we use ASTQueryMatch or AstQueryMatch? The standard library seems to prefer the latter; see TcpStream.

Import discipline

The rule for using * from enums is slightly different. Here, it is acceptable to import * to bring all variants of an enum into scope. However, this should not be done at the top level, only locally to improve the readability of long match statements. There, they should be placed as close as possible to the relevant match, preferably on the line immediately preceding it.

I was slightly surprised to discover that if you use SomeEnum::* in the middle of a function, the bare variant names are visible throughout the function, not just after the use. (Playground demo.) That makes me feel a little torn about this advice---I agree that sometimes you really need this kind of glob import to make your matches readable, but if use is going to affect an entire block I want to see it at the top of that block.
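In case the playground link goes stale, here is a small sketch of the behaviour I mean. Shape and describe are made-up names; the point is only that the bare variant resolves before the use line.

```rust
// Hypothetical enum and function, for illustration only.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle,
    Square,
}

fn describe(shape: Shape) -> &'static str {
    // `Circle` resolves here even though the `use` appears further down,
    // because a `use` declaration is an item scoped to the whole block.
    if shape == Circle {
        return "round";
    }

    use Shape::*;
    match shape {
        Circle => "round",
        Square => "boxy",
    }
}
```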

✅ Do this:

use some_crate::{SpecificItem1, SpecificItem2};
use some_other_crate::SpecificItem3;

// ...

fn some_fn(some_enum: SomeEnum) -> {
    // ...

    use SomeEnum::*;
        Variant2 => {...},
    }
}

Missing the match in this snippet.

Pattern matching discipline

Exhaustively match to draw attention

Pattern matching is an excellent way to ensure that all items of data in internal structures have been considered, not only by the author of the current change, but also by the authors of any future changes. When using internal interfaces, always consider using pattern-matching to force useful compiler errors in case important, possibly new, parts of a structure haven’t been considered. This in turn will draw the attention of the next maintainer and help them correctly do what they need.

✅ Do this:
impl Ord for MyStruct {
    fn cmp(&self, other: &Self) -> Ordering {
        let Self {
            my,
            thing,
            with,
            some,
            unused: _,
            fields: _,
        } = self;
        (my, thing, with, some)
            .cmp(&(other.my, other.thing, other.with, other.some))
    }
}
⚠️ Avoid this:
impl Ord for MyStruct {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.my, self.thing, self.with, self.some)
            .cmp(&(other.my, other.type, other.with, other.some))
    }
}

TIL Clippy can enforce this one: https://rust-lang.github.io/rust-clippy/rust-1.56.0/index.html#rest_pat_in_fully_bound_structs

Don’t pattern-match pointers

It is possible to pattern-match the pointer to a Copy type to obtain the value at the other end. Although it may seem convenient, it ultimately harms readability—it is clearer to explicitly dereference the pointer we are given.

✅ Do this:
    .map(|x| *x)
⚠️ Avoid this:
    .map(|&x| x)

Nit: should this say "reference" instead of "pointer"?

Pattern-matched parameters

Using pattern matching in fn parameters adds extra noise to a function’s signature by duplicating definitions held elsewhere. Indeed, the fact that a particular parameter is to be pattern-matched inside of the function is not important to the user—it is an unwelcome implementation detail and should be hidden as such.

If parameters are to be unpacked, do this at the first line of a particular function.

Note that this guidance does not apply to closures, which are commonly used as small, locally-scoped helper functions, whose types are inferred.

+1, this seems like a good set of distinctions.

✅ Do this:

impl Server {
    fn new(config: ServerConfig) -> Result<Self> {
        let Config { db_path, working_path } = config;
        // ...
    }
}

Typo: Config vs. ServerConfig.

Code discipline

When not to use Self

Do not use Self when constructing associated types.

Nit: would "...when naming associated types in expressions" be clearer? (Strong +1 to the actual guideline here.)

The only exception is for trait items which return a Result<_, Self::Err>, where Err is set to the crate’s Error type. In this case, it is okay to use the crate’s Result type alias instead.

✅ Do this:
impl Responder for MyType {
    type Response = SomeStruct;
    type Err = Error;

    fn respond(&self, _input: Input) -> Result<Self::Response> {
        Ok(SomeStruct{
            some: ...,
            fields: ...,
        })
    }
}

I'm not sure I understand how this exception relates to the guideline. In the example below that uses a Result alias, the associated type is named in the function signature, not in an expression.

Struct population

Big fan of this section, no notes :)

Prefer collect when interacting with FromIterator

The FromIterator trait defines a method from_iter which is called by Iterator::collect. We therefore have two methods of collecting into an iterator, Foo::from_iter and collect() with appropriate type bounds. Prefer the latter as this makes the order of operations the same as what is read from top to bottom.

✅ Do this:
let my_vec: Vec<_> = collection.into_iter()
    .filter(...)
    .collect();
⚠️ Avoid this:
let my_vec = Vec::from_iter(collection.into_iter().filter(...))

In the "good" snippet, should we prefer the type annotation to a turbofish? (I go back and forth on this myself.)
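For comparison, the two spellings side by side in a sketch (function names are made up). Both produce the same Vec; the difference is only where the reader finds the target type.

```rust
// Hypothetical helpers showing the two ways of naming the target collection.
fn evens_annotated(values: &[i32]) -> Vec<i32> {
    // Target type stated up front, on the binding.
    let evens: Vec<_> = values.iter().copied().filter(|v| v % 2 == 0).collect();
    evens
}

fn evens_turbofish(values: &[i32]) -> Vec<i32> {
    // Target type stated at the end of the chain, next to `collect`.
    values
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .collect::<Vec<_>>()
}
```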

Empty Vec construction

Consider also that vec![expr; n] where n is zero will still evaluate (and then immediately drop) expr.

Good point, hadn't considered this!
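A quick sketch that makes the evaluation visible. The helper names are made up, and the Cell counter exists only to observe the side effect.

```rust
use std::cell::Cell;

// Hypothetical stand-in for an expensive or side-effecting expression.
fn expensive_default(calls: &Cell<u32>) -> String {
    calls.set(calls.get() + 1);
    String::from("default")
}

fn empty_via_macro(calls: &Cell<u32>) -> Vec<String> {
    // The element expression is evaluated once and then dropped, even
    // though the requested length is zero.
    vec![expensive_default(calls); 0]
}

fn empty_via_new(_calls: &Cell<u32>) -> Vec<String> {
    // Nothing is evaluated here.
    Vec::new()
}
```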

Avoid loosely-scoped let mut

In many cases, mutability is used to create a given structure which is then used immutably for the remainder of its lifetime. Whenever this happens, scope the mutable declarations to just where they are needed, thus forcing a compiler error if this condition is broken in future. Doing this also makes code simpler to read as there are fewer things which can mutate at any one point.

As an alternative to introducing a nested scope, what about shadowing?

let mut thing = 0;
poke(&mut thing);
let thing = thing;
// no more mutability here

Reference scope

In many cases, the compiler is smart enough to create temporary storage locations to store variables which are given the value &expr, however, when these are passed to functions, it becomes slightly harder to follow which type is being used, especially when handling &T where T is !Copy. In this case, it is only a single character in the variable declaration, possibly many lines away which shows that the value T is not being moved, only its reference.

Instead of relying on temporary storage locations, store the value explicitly and take a reference where needed. This way, the transfer of ownership is much more explicit. As a rule of thumb, only use & at the start of the value of a let declaration when either indexing or slicing.

Sometimes it's not a temporary storage location---rustc can in some cases implicitly put the thing being referenced in static storage and give you a 'static reference back. This works with e.g. integer literals but also sometimes with more complicated stuff, and it can be handy, for example this code in std. See also this thread by a bunch of Rust wizards that digs into the nuances.
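A minimal sketch of the static-promotion case, with made-up function names. Both functions compile because the referenced values are constant expressions that get promoted to static storage, so neither returns a reference to a temporary.

```rust
// Hypothetical functions illustrating constant promotion.
fn answer() -> &'static i32 {
    // The literal is promoted to static storage, so this reference
    // does not dangle.
    &42
}

fn small_primes() -> &'static [u32] {
    // The same promotion applies to an array of literals.
    &[2, 3, 5, 7]
}
```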

Shadowing

However, if this is being used to effectively mutate a value during construction with no other values being affected, instead use the scoped-mutability pattern—
let thing = {
    let mut my_thing = ...;
    // Mutate `my_thing` to construct it...
    my_thing
};

Ah, you anticipated the thing I wrote above :) I think I would personally prefer shadowing when using this idiom just because an extra level of nesting (and two additional lines) feels like a stiff price to pay, but I can appreciate that the explicit scope helps out too.

Type annotations

This section answers another of the questions I wrote above! Agree with the guidance, thanks for spelling it out.

Avoid explicit drop calls

If |_| ... is used and the ignored parameter is an Error, we should highlight that an error is intentionally being ignored, for example by using .ok() on a Result being handled.

I've always been mildly averse to .ok() as an idiom for ignoring errors, but this made me rethink that position. After all, it's pretty much the same as .map_err(|_| ()), in the sense that Option<T> is isomorphic to Result<T, ()>. So +1 on this.
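Spelled out as a sketch (function names are made up): the two forms below discard the same information, and ok_or(()) converts one into the other.

```rust
// Hypothetical parsers that both discard the parse error.
fn parse_with_ok(text: &str) -> Option<u16> {
    text.parse::<u16>().ok()
}

fn parse_with_unit_err(text: &str) -> Result<u16, ()> {
    text.parse::<u16>().map_err(|_| ())
}
```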

If converting from some Result<T> to a Result<()> at the end of the last expression of a function, instead of the ignore marker, use the ? operator and an explicit Ok(()). This highlights that we care only about side-effects, and that no information is returned in the successful case.

Agree, one additional benefit of Ok(()) is that some diffs are cleaner when you use it, e.g.

  bar()?;
+ baz()?;
  Ok(())

vs.

- bar()
+ bar()?;
+ baz()

✅ Do this:

async fn log(&self, message: String) -> Result<()> {
    {
        let mut file = OpenOptions::new()
            .append(true)
            .open(&self.log_file_path)?;
        file.write_all(message.as_bytes())?;
    }
    self.transmit_log(message).await?;
    Ok(())
}

⚠️ Avoid this:

async fn log(&self, message: String) -> Result<()> {
    let mut file = OpenOptions::new()
        .append(true)
        .open(&self.log_file_path)?;
    file.write_all(message.as_bytes())?;
    drop(file);
    Ok(self.transmit_log(message)
        .await
        .map(drop))
}

Micro-nit: the snippets look like they're doing blocking file I/O in an async function, which is best avoided and might distract from the point they're intended to illustrate.

Prefer constructors

When exposing structs, prefer to expose constructor functions or a builder rather than exposing public fields. There are several benefits here:

They format more nicely in call chains

They allow parameter type conversions to occur implicitly

They allow some fields to be computed in terms of others using implementation-specific details

Structs with all-public fields cannot benefit from any of the above and moreover, if it is later decided that any of these properties is beneficial, we face either a breaking change to fix it or extra complication to work around it.

Just for counterpoint, some disadvantages I can think of to using field getters instead of public fields:

Loss of pattern matching
Getters that return references don't support borrow splitting
Extra code

And I could be wrong, but I think struct initialization supports all the same coercions as function calls (reference).
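Here is a small sketch of the borrow-splitting point, using a made-up Pair type that offers both public fields and getters so the two styles can be compared directly.

```rust
// Hypothetical type for illustration; it exposes fields and getters
// only so that both access styles can be shown.
pub struct Pair {
    pub left: Vec<u32>,
    pub right: Vec<u32>,
}

impl Pair {
    pub fn left_mut(&mut self) -> &mut Vec<u32> {
        &mut self.left
    }

    pub fn right_mut(&mut self) -> &mut Vec<u32> {
        &mut self.right
    }
}

fn move_last_with_fields(pair: &mut Pair) {
    // Two live mutable borrows of disjoint fields are accepted.
    let left = &mut pair.left;
    let right = &mut pair.right;
    if let Some(value) = left.pop() {
        right.push(value);
    }
}

fn move_last_with_getters(pair: &mut Pair) {
    // Holding `pair.left_mut()` and `pair.right_mut()` at the same time is
    // rejected, because each call mutably borrows all of `pair`. The first
    // borrow has to end before the second one starts.
    let popped = pair.left_mut().pop();
    if let Some(value) = popped {
        pair.right_mut().push(value);
    }
}
```

With fields, the caller could also write let Pair { left, right } = pair; which has no getter equivalent.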

The exception here is for ‘config’ structs, which are passed to single function and which configure its behaviour (e.g. FooConfig may be passed to fn foo). As the purpose of these structs is only to pass data to another part of the codebase, simplifying construction is beneficial. To defend against the future addition of new fields causing breaking changes, consider marking them as #[non_exhaustive] and adding a Default implementation.

I might consider broadening this. I think there are other cases where "all fields public" is the right choice, with the common thread being that these types are just containers for data that don't have invariants. This blog post shaped my thinking on this.

API-specific serde implementations

Data-formats returned from remote APIs should not govern internal representations without good reason, as unpredictable remote API changes could lead to large breakages. To avoid this, it is good practice to define local (de)serialisation types which closely match the expected form of the remote API and then map data from those types into our internal ones.

Define these (de)serialisation types in the functions which implement the necessary API calls. Let’s say we have a function called get_image_info, which makes a web-request to get information associated with given container image name (e.g. author, description, latest version). To nicely transfer data from some remote format into one we govern, say ImageInfo, add an explicit return at the end of get_image_info and below this, create a new type called Response, which implements Deserialize. Add a comment which says // serde types. to let the reader know that everything beyond this point only relates to modelling the remote API. Add as many new local types as are necessary to maintain a 1:1 relationship between Rust types and the remote’s format—
async fn get_image_info(&self, name: &str) -> Result<ImageInfo> {
    let response: Response = serde_json::from_str(get_response(...).await?.text());
    let info = ImageInfo {
        name,
        version: response.metadata.version,
        authors: response.metadata.authors,
        latest_release: response.releases.last()
            .map(|release| ...),
    }
    return Ok(info);

    // serde types.
    #[derive(Deserialize)]
    struct Response {
        metadata: Metadata,
        releases: Vec<Release>,
    }

    #[derive(Deserialize)]
    struct Metadata {
        version: String,
        authors: Vec<String>,
    }

    #[derive(Deserialize)]
    struct Release {
        ...
    }
}
For consistency, we try to always call incoming data Response and outgoing data Message. Name shadowing is okay here as the scope is small and the shadowed type will likely only appear once in a very predictable place and with a predictable name.

Scoping the (de)serialisation in this way is extremely good practice for several reasons:

Firstly, it minimises the blast radius of incoming remote API changes. If a remote API is changed, we need only update the deserialisation structs and their unpacking into our internal ones—the core of our program/library remains untouched.

Secondly, it minimises the amount of code which must be read—if the interaction with the remote API is functioning correctly but someone wishes to know how this function works, they know that they can stop reading past the // serde structs. marker. Conversely, if the API interaction is broken due to a data format ‘surprise,’ that same comment draws the maintainer’s eye to the place they need.

Thirdly, it is often simpler to implement the Serde traits on these types! As we model the remote structure before unpacking into internal structures, less serde-wrangling is required.

Finally, it reduces the amount of clutter in file-level scopes. As the Response types are locally-scoped, the reader knows exactly where they are used and hence does not need to keep them in mind alongside the rest of the codebase. They may be safely forgotten until needed.

I really like this section, but it feels a little out of place here because it's so specific. Maybe it could be generalized to a broad guideline for when to define functions and types within a fn body?

Method calls on closing curly braces

Control structures and struct literals should not have methods called upon them as the formatter moves method calls onto the line below. This adds an unwelcome surprise as the scope of what the reader is currently looking at will appear to increase, adding to cognitive load and potential confusion. To avoid this, use a binding (let some_var = ...; some_var.foo()).

Strong +1.

Error and panic discipline

Error types

All reasonable types which implement Error fall into one of three categories:

Those which erase the underlying types

Those which preserve them, for example by enumeration

Those which preserve them opaquely

Errors which use type-erasure (e.g. Box<dyn Error> and anyhow::Error) are often easier to use when writing code, however things become very problematic later on when attempting to inspect errors—with less help from the compiler comes far more places for subtle breakages to occur, both now and in future. Type-erased errors should only be used in prototypes where maintenance will never be a concern, otherwise, use concrete types. As a general rule, type erased errors must not be used in library crates.

This reads to me as ambiguous on whether it's okay to use anyhow::Error in a binary crate.

Type erasure is a very strong opinion and one which may not be shared by a crate’s dependants and the process of converting from erased errors back to a concrete one is unreliable and unpleasant, and hence will irritate consumers.

Errors which preserve types (e.g those annotated with #[derive(thiserror::Error)]) give Rust a unique advantage—not only can the golden path receive first-class support, but so too can the error path, thus allowing an even higher level of quality to be attained. In particular, the process of responding to particular errors is far more robust with enumerated errors.

I totally agree about the advantages of enumerated errors, but I want to offer a counterpoint to this. If your library exposes all the underlying errors from its public Error type, you're making internal details of your library a public API promise. Sometimes this is what you want, but not always. For example, you might have a library that makes some HTTP requests using reqwest as an implementation detail. In that case you probably don't want to expose reqwest::Error from your error type, because that prevents you from switching to a different HTTP client (or even a new incompatible version of reqwest) without bumping your library's major version.

This also applies to impl From<DependencyError> for MyError, since that becomes part of the public API in the same way (handy as it is for ?).

So I think a balance has to be struck between transparency and opacity by weighing the tradeoffs in each case.
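One possible shape for the opaque end of that spectrum, as a std-only sketch. Everything here is hypothetical: ErrorKind, kind(), and parse_retry_count are invented names, and ParseIntError stands in for a dependency's error type such as the HTTP client's.

```rust
use std::error::Error as StdError;
use std::fmt;

// Hypothetical library error: callers can branch on `kind()`, but the
// dependency's error type never appears in the public API.
#[derive(Debug)]
pub struct Error {
    kind: ErrorKind,
    source: Box<dyn StdError + Send + Sync + 'static>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum ErrorKind {
    InvalidInput,
    Transport,
}

impl Error {
    pub fn kind(&self) -> ErrorKind {
        self.kind
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.kind {
            ErrorKind::InvalidInput => write!(f, "invalid input"),
            ErrorKind::Transport => write!(f, "transport failure"),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        // The cause is still reachable for logging, but only as a trait object.
        let source: &(dyn StdError + 'static) = &*self.source;
        Some(source)
    }
}

// The conversion is a private helper rather than a public `From` impl, so
// swapping the underlying dependency is not a breaking change.
fn parse_retry_count(text: &str) -> Result<u32, Error> {
    text.trim().parse::<u32>().map_err(|e| Error {
        kind: ErrorKind::InvalidInput,
        source: Box::new(e),
    })
}
```

The cost is that ? no longer converts automatically, so each call site needs an explicit map_err.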

Errors which preserve types but which represent unrecoverable errors should represent their error condition as a contained &‘static str or String which is assigned where the error is constructed. When constructing these errors, special care must be taken to ensure that the message is consistent with other errors in the codebase. The field used to hold the reason for the error in these cases should be named reason.

An example might be good for this one.
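Something along these lines is what I was picturing, as a sketch only. The variant names, the message text, and check_invariant are all made up; the only part taken from the guideline is the field named reason holding a &'static str assigned at the construction site.

```rust
use std::fmt;

// Hypothetical crate error with one recoverable and one unrecoverable variant.
#[derive(Debug)]
pub enum Error {
    NotFound { path: String },
    // Unrecoverable condition: the message lives in a field named `reason`.
    Internal { reason: &'static str },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::NotFound { path } => write!(f, "not found: {path}"),
            Error::Internal { reason } => write!(f, "internal error: {reason}"),
        }
    }
}

impl std::error::Error for Error {}

// Hypothetical check that reports a broken invariant.
fn check_invariant(queue_len: usize, capacity: usize) -> Result<(), Error> {
    if queue_len > capacity {
        return Err(Error::Internal {
            reason: "queue length exceeds capacity",
        });
    }
    Ok(())
}
```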

Panic calmly

Panics must only be used if a program enters an unrecoverable state. Further, that unrecoverable state must not be as a result of user input—a program which a user can easily crash is not a good program.

In Rust, panics are very aggressive. Not only are they unrecoverable within the same thread, but also if the default panic strategy is overridden (e.g. by a user who wants a smaller binary and hence sets profile.release.panic = "abort" in their Cargo.toml), we have no guarantee that the usual cleanup is performed.

Panics are recoverable within the same thread using catch_unwind, but the point stands because that goes out the window if your crate is built with panic = "abort" (or if you panic while another panic is unwinding the stack).
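A minimal sketch of same-thread recovery, assuming the default unwind strategy. guarded_div is a made-up helper name.

```rust
use std::panic;

// Hypothetical helper: returns `None` if the division panics.
fn guarded_div(dividend: i32, divisor: i32) -> Option<i32> {
    // Integer division by zero panics. With the default `unwind` strategy the
    // panic is caught here, on the same thread. Under `panic = "abort"` the
    // process would terminate instead and this function would never return.
    panic::catch_unwind(move || dividend / divisor).ok()
}
```

The panic message is still printed by the default panic hook even though the panic is caught.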

As a rule of thumb, .unwrap() should only be used in code in tiny scopes, where errors can only possibly originate from the programmer—e.g. in Regex::new("...").unwrap(), where a panic can only occur if the raw regex constant is invalid.

Good criterion!

Function discipline

Hide generic type parameters

Generic type parameters add complication to an API. Where possible, hide generic parameters either through elision, syntactic sugar (impl Trait) or by leaving them unbound.

I don't have a super strong opinion on this but there's an argument to be made that generic type parameters at least are important for understanding an API (because they trigger you to think about monomorphization and using the same parameter name in analogous contexts helps tie the code together).

On impl blocks, only introduce strictly-necessary type-constraints. Not only will this reduce the cognitive overhead of understanding large blocks, it will also help make code more easily applicable in new scenarios.

Definitely +1.

Note that although it is possible to omit the unnamed lifetime (i.e. it may be possible to write MyRef<'_> as MyRef), this should never be done. A type without lifetime parameters looks completely self-contained and hence as though may be freely passed around. If a lifetime is present, always communicate that fact (i.e. always prefer MyRef<'_> to MyRef).

Shout it from the rooftops! I always enforce this in my code with #![deny(elided_lifetimes_in_paths)].
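For anyone who hasn't used that lint, a small sketch with made-up names. The attribute is applied to a single function here only to keep the snippet self-contained; at the crate root it would be the inner-attribute form mentioned above.

```rust
// Hypothetical borrowed wrapper type.
struct Word<'text> {
    inner: &'text str,
}

// Writing the return type as plain `Word` would be rejected under this lint;
// `Word<'_>` makes the borrow from `text` visible in the signature.
#[deny(elided_lifetimes_in_paths)]
fn first_word(text: &str) -> Word<'_> {
    Word {
        inner: text.split_whitespace().next().unwrap_or(""),
    }
}
```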

Unused parameters default implementations

Missing "in"?

Ordering discipline

Struct field ordering

The more public a field, the more likely a user will to want to know more about it and understand it. Therefore, we should put the items they are most likely to care about nearer the top of our code, avoiding them having to skip over parts uninteresting to them. Specifically, this means that we should place:

pub fields first,

pub(crate) fields next,

private fields last.

I already linked this blog post above but it's relevant here too. For structs that do have a mix of field visibilities though, this is a good guideline.

If the reason for ordering fields in a different way to the above is due to a derivation such as Ord or PartialOrd, this is not a good reason for deviation from the norm. Maintaining consistency is of a higher priority than a single derivation, hence the relevant implementations should be written out by hand.

I was going to object to this by saying that the handwritten Ord/PartialOrd implementation gets nasty when you have more than a few fields, but then I realized you had an example of how to do it concisely above :)
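For reference, a compiling sketch of that concise hand-written form. Job and its fields are made-up names; the comparison takes references on both sides so nothing is moved out of self or other.

```rust
use std::cmp::Ordering;

// Hypothetical struct: only `priority` and `name` take part in ordering.
#[derive(Debug)]
struct Job {
    priority: u8,
    name: String,
    scratch: Vec<u8>,
}

impl Ord for Job {
    fn cmp(&self, other: &Self) -> Ordering {
        // No `..` in the pattern: adding a field to `Job` is a compile error
        // here until someone decides whether it belongs in the comparison.
        let Self {
            priority,
            name,
            scratch: _,
        } = self;
        (priority, name).cmp(&(&other.priority, &other.name))
    }
}

impl PartialOrd for Job {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is defined through `cmp` so that it stays consistent with `Ord`.
impl PartialEq for Job {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Job {}
```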

Unsafe discipline

Minimise unsafe

Rust’s unsafe keyword turns off a small but important number of the compiler’s checks.

Nit: I would frame it as enabling additional powers rather than turning off checks, as the book does.

In effect, it is a ‘hold my beer’ marker—you tell the compiler to trust you and just watch whilst you do something either incredibly impressive or incredibly harmful. But the compiler is not the only entity whose trust we require, when we use unsafe, we also ask our users to trust that we know exactly what we are doing. Under no circumstance do we want to break that trust.

Well put!

Structural discipline

How to structure mod.rs

Files named mod.rs must only be used to specify the structure of a project, if definitions are added, they quickly become messy, ultimately detracting from their core purpose of declaring sub-modules and the current module’s interface.

Another item that I was going to object to, and then on thinking about it realized that it was making a good point :)

Note that these guidelines also hold for lib.rs, with the one exception that a crate’s Error and Result types are permitted in lib.rs, given their central importance.

Might there be other items in this category, like a central trait that ties the library together?

Use mod.rs to declare a module-root

I'm glad that you also stan mod.rs :)

Define Error and Result in a standard location

On the topic of crate-specific Result aliases, would it make sense to recommend always importing them as somecrate::Result, so that std::result::Result is not shadowed?
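A sketch of what that looks like at a call site. The module mycrate stands in for an external crate, and all names in it are hypothetical.

```rust
// `mycrate` stands in for a library crate with its own `Result` alias.
mod mycrate {
    #[derive(Debug, PartialEq)]
    pub struct Error(pub String);

    pub type Result<T> = std::result::Result<T, Error>;

    pub fn checked_half(n: u32) -> Result<u32> {
        if n % 2 == 0 {
            Ok(n / 2)
        } else {
            Err(Error(format!("{n} is odd")))
        }
    }
}

// No `use mycrate::Result;` here, so the bare name is still the prelude's
// two-parameter `Result`.
fn parse(text: &str) -> Result<u32, std::num::ParseIntError> {
    text.parse()
}

// The alias is always written with its crate prefix.
fn half_of(text: &str) -> mycrate::Result<u32> {
    let n = parse(text).map_err(|e| mycrate::Error(e.to_string()))?;
    mycrate::checked_half(n)
}
```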

canonical / rust-best-practices

Feedback #1
