Open kbknapp opened 6 years ago
As version 1.0 does not seem to be ready yet, could you consider releasing a new v0.3.x version with some updated dependencies (especially winapi-0.4, which allows dropping kernel32-sys)? Thanks!
I took the liberty of checking all the checkboxes because all of them were either done or irrelevant. @kbknapp can we just release the 1.0?
In an effort to get 1.x out, this will be the summary issue linking to all tracking issues. Many of the issues that will come out of this will be excellent "first time" issues that I'd be willing to mentor!
I will be updating this summary periodically (in addition to working on all the other issues in the queue), and linking to the individual tracking issues as I create them.
From Rust API Guidelines
Rust API guidelines
Crate conformance checklist
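Ad-hoc conversions follow as_, to_, into_ conventions (C-CONV)
Methods on collections that produce iterators follow iter, iter_mut, into_iter (C-ITER)
Ownership suffixes use _mut and _ref (C-OWN-SUFFIX)
Types eagerly implement common traits (C-COMMON-TRAITS): Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Debug, Display, Default
Conversions use the standard traits From, AsRef, AsMut (C-CONV-TRAITS)
Collections implement FromIterator and Extend (C-COLLECT)
Data structures implement Serde's Serialize, Deserialize (C-SERDE)
Crate has a "serde" cfg option that enables Serde (C-SERDE-CFG)
Types are Send and Sync where possible (C-SEND-SYNC)
Error types are Send and Sync (C-SEND-SYNC-ERR)
Error types are meaningful, not () (C-MEANINGFUL-ERR)
Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)
Examples use ?, not try!, not unwrap (C-QUESTION-MARK)
Only smart pointers implement Deref and DerefMut (C-DEREF)
Deref and DerefMut never fail (C-DEREF-FAIL)
Arguments convey meaning through types, not bool or Option (C-CUSTOM-TYPE)
Types for a set of flags are bitflags, not enums (C-BITFLAG)
All public types implement Debug (C-DEBUG)
Debug representation is never empty (C-DEBUG-NONEMPTY)
Organization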
Crate root re-exports common functionality (C-REEXPORT)
Crates pub use the most common types for convenience, so that clients do not have to remember or write the crate's module hierarchy to use these types.
Re-exporting is covered in more detail in The Rust Programming Language under Crates and Modules.
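As a minimal sketch of what such a re-export can look like, the following uses a made-up crate layout; the module names value and de and the items Value and from_str are chosen purely for illustration and are not the actual serde_json source.

```rust
// Hypothetical crate layout; all names here are illustrative only.
pub mod value {
    /// The crate's most commonly used type, defined deep in the hierarchy.
    #[derive(Debug, PartialEq)]
    pub struct Value(pub i64);
}

pub mod de {
    use super::value::Value;

    /// A commonly used function that lives in a submodule.
    pub fn from_str(s: &str) -> Option<Value> {
        s.trim().parse::<i64>().ok().map(Value)
    }
}

// Re-export the common items at the crate root so that clients can write
// `my_crate::Value` instead of `my_crate::value::Value`.
pub use de::from_str;
pub use value::Value;
```

Less common items stay reachable through their full module path, which keeps the crate root uncluttered.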
Examples from serde_json
The serde_json::Value type is the most commonly used type from serde_json. It is a re-export of a type that lives elsewhere in the module hierarchy, at serde_json::value::Value. The serde_json::value module defines other JSON-value-related things that are not re-exported. For example, serde_json::value::Index is the trait that defines types that can be used to index into a Value using square bracket indexing notation. The Index trait is not re-exported at the crate root because it would be comparatively rare for a client crate to need to refer to it.
In addition to types, functions can be re-exported as well. In serde_json, the serde_json::from_str function is a re-export of a function from the serde_json::de deserialization module, which contains other less common deserialization-related functionality that is not re-exported.
Modules provide a sensible API hierarchy (C-HIERARCHY)
Examples from Serde
The serde crate is two independent frameworks in one crate: a serialization half and a deserialization half. The crate is divided accordingly into serde::ser and serde::de. Part of the deserialization framework is isolated under serde::de::value because it is a relatively large API surface that is relatively unimportant, and it would crowd the more common, more important functionality located in serde::de if it were to share the same namespace.
Naming
Casing conforms to RFC 430 (C-CASE)
Basic Rust naming conventions are described in RFC 430.
In general, Rust tends to use CamelCase for "type-level" constructs (types and traits) and snake_case for "value-level" constructs. More precisely:
Modules: snake_case
Types: CamelCase
Traits: CamelCase
Enum variants: CamelCase
Functions: snake_case
Methods: snake_case
General constructors: new or with_more_details
Conversion constructors: from_some_other_type
Local variables: snake_case
Static variables: SCREAMING_SNAKE_CASE
Constant variables: SCREAMING_SNAKE_CASE
Type parameters: concise CamelCase, usually single uppercase letter: T
Lifetimes: short lowercase, usually a single letter: 'a, 'de, 'src
In CamelCase, acronyms count as one word: use Uuid rather than UUID. In snake_case, acronyms are lower-cased: is_xid_start.
In snake_case or SCREAMING_SNAKE_CASE, a "word" should never consist of a single letter unless it is the last "word". So, we have btree_map rather than b_tree_map, but PI_2 rather than PI2.
Examples from the standard library
The whole standard library. This guideline should be easy!
Ad-hoc conversions follow as_, to_, into_ conventions (C-CONV)
Conversions should be provided as methods, with names prefixed as follows:
as_: free; does not consume the value
to_: expensive; does not consume the value
into_: variable cost; usually consumes the value
For example:
str::as_bytes() gives a &[u8] view into a &str, which is free.
str::to_owned() copies a &str to a new String, which may require memory allocation.
String::into_bytes() takes ownership of a String and yields the underlying Vec<u8>, which is free.
BufReader::into_inner() takes ownership of a buffered reader and extracts the underlying reader, which is free. Data in the buffer is discarded.
BufWriter::into_inner() takes ownership of a buffered writer and extracts the underlying writer, which requires a potentially expensive flush of any buffered data.
Conversions prefixed as_ and into_ typically decrease abstraction, either exposing a view into the underlying representation (as) or deconstructing data into its underlying representation (into). Conversions prefixed to_, on the other hand, typically stay at the same level of abstraction but do some work to change one representation into another.
More examples from the standard library
Result::as_ref
RefCell::as_ptr
Path::to_str
slice::to_vec
Option::into_iter
AtomicBool::into_inner
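To make the three prefixes concrete, here is a small sketch on a hypothetical Name wrapper type. The type and its methods are invented for illustration; only the naming pattern is taken from the guideline.

```rust
/// Hypothetical wrapper around a heap-allocated name (illustrative only).
pub struct Name {
    inner: String,
}

impl Name {
    pub fn new(s: &str) -> Self {
        Name { inner: s.to_owned() }
    }

    /// as_: a free, borrowed view into the underlying representation.
    pub fn as_str(&self) -> &str {
        &self.inner
    }

    /// to_: does real work (allocates a new String) and leaves self intact.
    pub fn to_uppercase(&self) -> String {
        self.inner.to_uppercase()
    }

    /// into_: consumes self and hands back the underlying data.
    pub fn into_string(self) -> String {
        self.inner
    }
}
```

After calling into_string the Name can no longer be used, which is exactly what the into_ prefix signals to the caller.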
Methods on collections that produce iterators follow iter, iter_mut, into_iter (C-ITER)
Per RFC 199.
For a container with elements of type U, iterator methods should be named iter, iter_mut, and into_iter.
This guideline applies to data structures that are conceptually homogeneous collections. As a counterexample, the str type is a slice of bytes that are guaranteed to be valid UTF-8. This is conceptually more nuanced than a homogeneous collection, so rather than providing the iter/iter_mut/into_iter group of iterator methods, it provides str::bytes to iterate as bytes and str::chars to iterate as chars.
This guideline applies to methods only, not functions. For example, percent_encode from the url crate returns an iterator over percent-encoded string fragments. There would be no clarity to be had by using an iter/iter_mut/into_iter convention.
Examples from the standard library
Vec::iter
Vec::iter_mut
Vec::into_iter
BTreeMap::iter
BTreeMap::iter_mut
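For a user-defined collection, the same trio might be provided as in the sketch below. Scores is a hypothetical type that simply delegates to an inner Vec; a real collection would often define its own iterator types instead of reusing the ones from std.

```rust
use std::slice;
use std::vec;

/// Hypothetical homogeneous collection of u32 scores (illustrative only).
pub struct Scores {
    items: Vec<u32>,
}

impl Scores {
    pub fn new(items: Vec<u32>) -> Self {
        Scores { items }
    }

    /// Borrows the collection and yields &u32.
    pub fn iter(&self) -> slice::Iter<'_, u32> {
        self.items.iter()
    }

    /// Mutably borrows the collection and yields &mut u32.
    pub fn iter_mut(&mut self) -> slice::IterMut<'_, u32> {
        self.items.iter_mut()
    }
}

/// into_iter comes from the IntoIterator trait and yields owned u32 values.
impl IntoIterator for Scores {
    type Item = u32;
    type IntoIter = vec::IntoIter<u32>;

    fn into_iter(self) -> Self::IntoIter {
        self.items.into_iter()
    }
}
```

Implementing IntoIterator rather than an inherent into_iter method also lets the collection be used directly in a for loop.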
Iterator type names match the methods that produce them (C-ITER-TY)
A method called into_iter() should return a type called IntoIter, and similarly for all other methods that return iterators.
This guideline applies chiefly to methods, but often makes sense for functions as well. For example, the percent_encode function from the url crate returns an iterator type called PercentEncode.
These type names make the most sense when prefixed with their owning module, for example vec::IntoIter.
Examples from the standard library
Vec::iter returns Iter
Vec::iter_mut returns IterMut
Vec::into_iter returns IntoIter
BTreeMap::keys returns Keys
BTreeMap::values returns Values
Ownership suffixes use _mut, _ref (C-OWN-SUFFIX)
Functions often come in multiple variants: immutably borrowed, mutably borrowed, and owned.
The right default depends on the function in question. Variants should be marked through suffixes.
Exceptions
In the case of iterators, the moving variant can also be understood as an into conversion, into_iter, and for x in v.into_iter() arguably reads better than for x in v.iter_move(), so the convention is into_iter.
For mutably borrowed variants, if the mut qualifier is part of a type name, it should appear as it would appear in the type. For example, Vec::as_mut_slice returns a mut slice; it does what it says.
Immutably borrowed by default
If foo uses/produces an immutable borrow by default, use:
The _mut suffix (e.g. foo_mut) for the mutably borrowed variant.
The _move suffix (e.g. foo_move) for the owned variant.
Examples from the standard library
TODO rust-api-guidelines#37
Owned by default
If foo uses/produces owned data by default, use:
The _ref suffix (e.g. foo_ref) for the immutably borrowed variant.
The _mut suffix (e.g. foo_mut) for the mutably borrowed variant.
Examples from the standard library
std::io::BufReader::get_ref
std::io::BufReader::get_mut
Single-element containers implement appropriate getters (C-GETTERS)
Single-element containers where accessing the element cannot fail should implement get and get_mut.
Single-element containers where the element is Copy (e.g. Cell-like containers) should instead return the value directly, and not implement a mutable accessor. TODO rust-api-guidelines#44
For getters that do runtime validation, consider adding unsafe _unchecked variants.
Examples from the standard library
std::io::Cursor::get_mut
std::ptr::Unique::get_mut
std::sync::PoisonError::get_mut
std::sync::atomic::AtomicBool::get_mut
std::collections::hash_map::OccupiedEntry::get_mut
<[_]>::get_unchecked
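A sketch of the getter pair for a single-element container follows. Holder is a hypothetical type, and the signatures shown are one reasonable reading of the guideline rather than a quotation of it.

```rust
/// Hypothetical single-element container (illustrative only).
pub struct Holder<T> {
    value: T,
}

impl<T> Holder<T> {
    pub fn new(value: T) -> Self {
        Holder { value }
    }

    /// Shared access to the element; cannot fail.
    pub fn get(&self) -> &T {
        &self.value
    }

    /// Exclusive access to the element; cannot fail.
    pub fn get_mut(&mut self) -> &mut T {
        &mut self.value
    }
}
```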
Interoperability
Types eagerly implement common traits (C-COMMON-TRAITS)
Rust's trait system does not allow orphans: roughly, every impl must live either in the crate that defines the trait or the implementing type. Consequently, crates that define new types should eagerly implement all applicable, common traits.
To see why, consider the following situation:
std defines trait Display.
url defines type Url, without implementing Display.
webapp imports from both std and url.
There is no way for webapp to add Display to Url, since it defines neither. (Note: the newtype pattern can provide an efficient, but inconvenient workaround.)
The most important common traits to implement from std are:
Copy
Clone
Eq
PartialEq
Ord
PartialOrd
Hash
Debug
Display
Default
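Most of these traits can be derived, so implementing them eagerly is usually cheap. The sketch below uses a hypothetical Version type; Display cannot be derived and is written by hand.

```rust
use std::fmt;

/// Hypothetical version type (illustrative only).
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
}

// Display has no derive, so it is implemented manually.
impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}", self.major, self.minor)
    }
}
```

With these in place, downstream crates can print, compare, hash, and default-construct the type without resorting to a newtype wrapper.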
Conversions use the standard traits From, AsRef, AsMut (C-CONV-TRAITS)
The following conversion traits should be implemented where it makes sense:
From
TryFrom
AsRef
AsMut
The following conversion traits should never be implemented:
Into
TryInto
These traits have a blanket impl based on From and TryFrom. Implement those instead.
Examples from the standard library
From<u16> is implemented for u32 because a smaller integer can always be converted to a bigger integer.
From<u32> is not implemented for u16 because the conversion may not be possible if the integer is too big.
TryFrom<u32> is implemented for u16 and returns an error if the integer is too big to fit in u16.
From<Ipv6Addr> is implemented for IpAddr, which is a type that can represent both v4 and v6 IP addresses.
Collections implement FromIterator and Extend (C-COLLECT)
FromIterator and Extend enable collections to be used conveniently with the following iterator methods:
Iterator::collect
Iterator::partition
Iterator::unzip
FromIterator is for creating a new collection containing items from an iterator, and Extend is for adding items from an iterator onto an existing collection.
Examples from the standard library
Vec<T> implements both FromIterator<T> and Extend<T>.
Data structures implement Serde's Serialize, Deserialize (C-SERDE)
Types that play the role of a data structure should implement Serialize and Deserialize.
An example of a type that plays the role of a data structure is linked_hash_map::LinkedHashMap.
An example of a type that does not play the role of a data structure is byteorder::LittleEndian.
Crate has a "serde" cfg option that enables Serde (C-SERDE-CFG)
If the crate relies on serde_derive to provide Serde impls, the name of the cfg can still be simply "serde" by using this workaround. Do not use a different name for the cfg like "serde_impls" or "serde_serialization".
Types are Send and Sync where possible (C-SEND-SYNC)
Send and Sync are automatically implemented when the compiler determines it is appropriate.
In types that manipulate raw pointers, be vigilant that the Send and Sync status of your type accurately reflects its thread safety characteristics. A test that passes the type to a helper function bounded by T: Send or T: Sync can help catch unintentional regressions in whether the type implements Send or Sync.
Error types are Send and Sync (C-SEND-SYNC-ERR)
An error that is not Send cannot be returned by a thread run with thread::spawn. An error that is not Sync cannot be passed across threads using an Arc. These are common requirements for basic error handling in a multithreaded application.
Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)
std::fmt::UpperHex
std::fmt::LowerHex
std::fmt::Octal
std::fmt::Binary
These traits control the representation of a type under the {:X}, {:x}, {:o}, and {:b} format specifiers.
Implement these traits for any number type on which you would consider doing bitwise manipulations like | or &. This is especially appropriate for bitflag types. Numeric quantity types like struct Nanoseconds(u64) probably do not need these.
Error types are meaningful, not () (C-MEANINGFUL-ERR)
When defining functions that return Result, and the error carries no useful additional information, do not use () as the error type. () does not implement std::error::Error, and this causes problems for callers that expect to be able to convert errors to Error. Common error handling libraries like error-chain expect errors to implement Error.
Instead, define a meaningful error type specific to your crate.
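As an illustration, even an error with no payload can be a named unit struct that implements Error. The names EmptyInputError, first_word, and first_word_len below are hypothetical.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type carrying no extra data (illustrative only).
#[derive(Debug, PartialEq)]
pub struct EmptyInputError;

impl fmt::Display for EmptyInputError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("input was empty")
    }
}

impl Error for EmptyInputError {}

/// Returns the first whitespace-separated word, or a named error instead of ().
pub fn first_word(input: &str) -> Result<&str, EmptyInputError> {
    input.split_whitespace().next().ok_or(EmptyInputError)
}

/// Because the error implements Error, callers can box it and use `?`.
pub fn first_word_len(input: &str) -> Result<usize, Box<dyn Error + Send + Sync>> {
    Ok(first_word(input)?.len())
}
```

Had first_word returned Result<&str, ()>, the `?` in first_word_len would not compile, because () cannot be converted into a boxed Error.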
Examples from the standard library
ParseBoolError is returned when failing to parse a bool from a string.
Macros
Input syntax is evocative of the output (C-EVOCATIVE)
Rust macros let you dream up practically whatever input syntax you want. Aim to keep input syntax familiar and cohesive with the rest of your users' code by mirroring existing Rust syntax where possible. Pay attention to the choice and placement of keywords and punctuation.
A good guide is to use syntax, especially keywords and punctuation, that is similar to what will be produced in the output of the macro.
For example, if your macro declares a struct with a particular name given in the input, preface the name with the keyword struct to signal to readers that a struct is being declared with the given name.
Another example is semicolons vs commas. Constants in Rust are followed by semicolons, so if your macro declares a chain of constants, they should likely be followed by semicolons even if the syntax is otherwise slightly different from Rust's.
Macros are so diverse that these specific examples may not be relevant to yours, but think about how to apply the same principles to your situation.
Item macros compose well with attributes (C-MACRO-ATTR)
Macros that produce more than one output item should support adding attributes to any one of those items. One common use case would be putting individual items behind a cfg.
Macros that produce a struct or enum as output should support attributes so that the output can be used with derive.
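One way to support attributes is to accept a repeated meta fragment in front of the item and forward it to the output. The unit_struct macro below is a hypothetical example of that technique, not taken from any particular crate.

```rust
/// Hypothetical macro: declares a unit struct and forwards any attributes
/// written in front of it to the generated item.
macro_rules! unit_struct {
    ($(#[$attr:meta])* struct $name:ident;) => {
        $(#[$attr])*
        struct $name;
    };
}

// The caller can attach derives (or a cfg) exactly as on a hand-written struct.
unit_struct! {
    #[derive(Debug, Clone, PartialEq)]
    struct Marker;
}
```

Because the input uses the struct keyword and a trailing semicolon, the invocation also reads like the item it produces.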
Item macros work anywhere that items are allowed (C-ANYWHERE)
Rust allows items to be placed at the module level or within a tighter scope like a function. Item macros should work equally well as ordinary items in all of these places. The test suite should include invocations of the macro in at least the module scope and function scope.
As a simple example of how things can go wrong, this macro works great in a module scope but fails in a function scope.
Item macros support visibility specifiers (C-MACRO-VIS)
Follow Rust syntax for visibility of items produced by a macro. Private by default, public if pub is specified.
Type fragments are flexible (C-MACRO-TY)
If your macro accepts a type fragment like $t:ty in the input, it should be usable with all of the following:
u8, &str
m::Data
::base::Data
super::Data
Vec<String>
As a simple example of how things can go wrong, this macro works great with primitives and absolute paths but fails with relative paths.
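Relative paths typically break when a macro places the type fragment inside a module it generates, since the path then resolves from a different scope. A sketch of a macro that avoids this by expanding in the caller's own scope is shown below; wrapper, m::Data, and the generated type names are all hypothetical.

```rust
/// Hypothetical macro: declares a newtype around the given type. The type
/// fragment is used in the caller's scope, so relative paths keep resolving.
macro_rules! wrapper {
    ($name:ident => $t:ty) => {
        pub struct $name(pub $t);
    };
}

mod m {
    pub struct Data(pub u8);
}

wrapper!(Byte => u8); // primitive
wrapper!(Local => m::Data); // relative path
wrapper!(Absolute => ::std::string::String); // absolute path
wrapper!(Names => Vec<String>); // generic
```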
Documentation
Crate level docs are thorough and include examples (C-CRATE-DOC)
See RFC 1687.
All items have a rustdoc example (C-EXAMPLE)
Every public module, trait, struct, enum, function, method, macro, and type definition should have an example that exercises the functionality.
The purpose of an example is not always to show how to use the item. For example, users can be expected to know how to instantiate and match on an enum like enum E { A, B }. Rather, an example is often intended to show why someone would want to use the item.
This guideline should be applied within reason.
A link to an applicable example on another item may be sufficient. For example if exactly one function uses a particular type, it may be appropriate to write a single example on either the function or the type and link to it from the other.
Examples use ?, not try!, not unwrap (C-QUESTION-MARK)
Like it or not, example code is often copied verbatim by users. Unwrapping an error should be a conscious decision that the user needs to make.
A common way of structuring fallible example code is to wrap it in a hidden function that returns a Result. Lines beginning with # are compiled by cargo test when building the example but do not appear in user-visible rustdoc.
Function docs include error conditions in "Errors" section (C-ERROR-DOC)
Per RFC 1574.
This applies to trait methods as well. Trait methods for which the implementation is allowed or expected to return an error should be documented with an "Errors" section.
Examples from the standard library
Some implementations of the std::io::Read::read trait method may return an error.
Function docs include panic conditions in "Panics" section (C-PANIC-DOC)
Per RFC 1574.
This applies to trait methods as well. Trait methods for which the implementation is allowed or expected to panic should be documented with a "Panics" section.
Examples from the standard library
The Vec::insert method may panic.
Prose contains hyperlinks to relevant things (C-LINK)
Links to methods within the same type usually look like this:
Links to other types usually look like this:
Links may also point to a parent or child module:
This guideline is officially recommended by RFC 1574 under the heading "Link all the things".
Cargo.toml publishes CI badges for tier 1 platforms (C-CI)
The Rust compiler regards tier 1 platforms as "guaranteed to work." Specifically they will each satisfy the following requirements:
Stable, high-profile crates should meet the same level of rigor when it comes to tier 1. To prove it, Cargo.toml should publish CI badges.
Cargo.toml includes all common metadata (C-METADATA)
authors
description
license
homepage (though see rust-api-guidelines#26)
documentation
repository
readme
keywords
categories
Crate sets html_root_url attribute (C-HTML-ROOT)
It should point to "https://docs.rs/$crate/$version".
Cargo.toml should contain a note next to the version to remember to bump the html_root_url when bumping the crate version.
Cargo.toml documentation key points to docs.rs (C-DOCS-RS)
It should point to "https://docs.rs/$crate".
Predictability
Smart pointers do not add inherent methods (C-SMART-PTR)
For example, this is why Box::into_raw is defined as an associated function that takes the box as an explicit argument, called as Box::into_raw(b), rather than as a method.
If this were defined as an inherent method instead, it would be confusing at the call site whether the method being called is a method on Box<T> or a method on T.
Conversions live on the most specific type involved (C-CONV-SPECIFIC)
When in doubt, prefer to_/as_/into_ to from_, because they are more ergonomic to use (and can be chained with other methods).
For many conversions between two types, one of the types is clearly more "specific": it provides some additional invariant or interpretation that is not present in the other type. For example, str is more specific than &[u8], since it is a UTF-8 encoded sequence of bytes.
Conversions should live with the more specific of the involved types. Thus, str provides both the as_bytes method and the from_utf8 constructor for converting to and from &[u8] values. Besides being intuitive, this convention avoids polluting concrete types like &[u8] with endless conversion methods.
Functions with a clear receiver are methods (C-METHOD)
Prefer a method over a free function for any operation that is clearly associated with a particular type.
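The contrast can be sketched with a hypothetical Counter type: the same operation is written once as a method and once as a free function.

```rust
/// Hypothetical counter type (illustrative only).
pub struct Counter {
    count: u32,
}

impl Counter {
    pub fn new() -> Self {
        Counter { count: 0 }
    }

    /// Preferred: the operation has a clear receiver, so it is a method.
    pub fn increment(&mut self) {
        self.count += 1;
    }

    pub fn count(&self) -> u32 {
        self.count
    }
}

/// Less idiomatic: the same operation as a free function. Callers must
/// import it and borrow the counter explicitly.
pub fn counter_increment(counter: &mut Counter) {
    counter.count += 1;
}
```

At the call site, c.increment() borrows automatically, while counter_increment(&mut c) needs the explicit &mut.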
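Methods have numerous advantages over functions:
They do not need to be imported or qualified to be used: all you need is a value of the appropriate type.
Their invocation performs autoborrowing (including mutable borrows).
They make it easy to answer the question "what can I do with a value of type T" (especially when using rustdoc).
They provide self notation, which is more concise and often more clearly conveys ownership distinctions.
Functions do not take out-parameters (C-NO-OUT)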
Prefer returning a tuple or struct over writing through &mut out-parameters when returning multiple Bar values.
Compound return types like tuples and structs are efficiently compiled and do not require heap allocation. If a function needs to return multiple values, it should do so via one of these types.
The primary exception: sometimes a function is meant to modify data that the caller already owns, for example to re-use a buffer:
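The sketch below shows both sides: a function that returns two results as a tuple, and one that fills a caller-owned buffer so the allocation can be reused. The functions min_max and read_line_into are hypothetical names chosen for this illustration.

```rust
/// Preferred for plain results: return both values in a tuple.
pub fn min_max(values: &[i32]) -> Option<(i32, i32)> {
    let min = *values.iter().min()?;
    let max = *values.iter().max()?;
    Some((min, max))
}

/// The exception: write into a buffer the caller already owns, so that its
/// allocation can be reused across calls. Returns the number of bytes written.
pub fn read_line_into(source: &str, buf: &mut String) -> usize {
    buf.clear();
    let line = source.lines().next().unwrap_or("");
    buf.push_str(line);
    line.len()
}
```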
Operator overloads are unsurprising (C-OVERLOAD)
Operators with built in syntax (*, |, and so on) can be provided for a type by implementing the traits in std::ops. These operators come with strong expectations: implement Mul only for an operation that bears some resemblance to multiplication (and shares the expected properties, e.g. associativity), and so on for the other traits.
Only smart pointers implement Deref and DerefMut (C-DEREF)
The Deref traits are used implicitly by the compiler in many circumstances, and interact with method resolution. The relevant rules are designed specifically to accommodate smart pointers, and so the traits should be used only for that purpose.
Examples from the standard library
Box<T>
String is a smart pointer to str
Rc<T>
Arc<T>
Cow<'a, T>
Deref and DerefMut never fail (C-DEREF-FAIL)
Because the Deref traits are invoked implicitly by the compiler in sometimes subtle ways, failure during dereferencing can be extremely confusing.
Constructors are static, inherent methods (C-CTOR)
In Rust, "constructors" are just a convention:
Constructors are static (no self) inherent methods for the type that they construct. Combined with the practice of fully importing type names, this convention leads to informative but concise construction:
This convention also applies to conversion constructors (prefix from rather than new).
).Constructors for structs with sensible defaults allow clients to concisely override using the struct update syntax.
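A minimal sketch of these conventions, using a hypothetical Config type with made-up fields: new is a static inherent method, from_retries is a conversion constructor, and Default enables struct update syntax.

```rust
/// Hypothetical configuration type with sensible defaults (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub name: String,
    pub verbose: bool,
    pub retries: u32,
}

impl Default for Config {
    fn default() -> Self {
        Config { name: String::new(), verbose: false, retries: 3 }
    }
}

impl Config {
    /// Constructor: a static inherent method with no self parameter.
    pub fn new(name: &str) -> Self {
        Config { name: name.to_owned(), ..Config::default() }
    }

    /// Conversion constructor, prefixed with from_ rather than new.
    pub fn from_retries(retries: u32) -> Self {
        Config { retries, ..Config::default() }
    }
}
```

A client can then override a single field concisely, for example Config { verbose: true, ..Config::new("app") }.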
Examples from the standard library
std::io::Error::new is the commonly used constructor for an IO error.
std::io::Error::from_raw_os_error is a constructor based on an error code received from the operating system.
Flexibility
Functions expose intermediate results to avoid duplicate work (C-INTERMEDIATE)
Many functions that answer a question also compute interesting related data. If this data is potentially of interest to the client, consider exposing it in the API.
Examples from the standard library
Vec::binary_search does not return a bool of whether the value was found, nor an Option<usize> of the index at which the value was maybe found. Instead it returns information about the index if found, and also the index at which the value would need to be inserted if not found.
String::from_utf8 may fail if the input bytes are not UTF-8. In the error case it returns an intermediate result that exposes the byte offset up to which the input was valid UTF-8, as well as handing back ownership of the input bytes.
Caller decides where to copy and place data (C-CALLER-CONTROL)
If a function requires ownership of an argument, it should take ownership of the argument rather than borrowing and cloning the argument.
If a function does not require ownership of an argument, it should take a shared or exclusive borrow of the argument rather than taking ownership and dropping the argument.
The Copy trait should only be used as a bound when absolutely needed, not as a way of signaling that copies should be cheap to make.
Functions minimize assumptions about parameters by using generics (C-GENERIC)
The fewer assumptions a function makes about its inputs, the more widely usable it becomes.
Prefer a parameter bounded by IntoIterator over a concrete type such as a slice or a Vec if the function only needs to iterate over the data.
More generally, consider using generics to pinpoint the assumptions a function needs to make about its arguments.
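For instance, a function that only iterates can accept any IntoIterator with the right item type. The function name sum_all is hypothetical.

```rust
/// Accepts anything that can be iterated to yield i64 values: a Vec,
/// an adapted slice iterator, a range, and so on.
pub fn sum_all<I: IntoIterator<Item = i64>>(iter: I) -> i64 {
    iter.into_iter().sum()
}
```

The same function works for sum_all(vec![1, 2, 3]), sum_all(1..=4), or an iterator adapter chain, without the caller first collecting into a particular collection.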
Advantages of generics
Reusability. Generic functions can be applied to an open-ended collection of types, while giving a clear contract for the functionality those types must provide.
Static dispatch and optimization. Each use of a generic function is specialized ("monomorphized") to the particular types implementing the trait bounds, which means that (1) invocations of trait methods are static, direct calls to the implementation and (2) the compiler can inline and otherwise optimize these calls.
Inline layout. If a struct or enum type is generic over some type parameter T, values of type T will be laid out inline in the struct/enum, without any indirection.
Inference. Since the type parameters to generic functions can usually be inferred, generic functions can help cut down on verbosity in code where explicit conversions or other method calls would usually be necessary.
Precise types. Because generics give a name to the specific type implementing a trait, it is possible to be precise about places where that exact type is required or produced. For example, a function whose parameters and return value all use the same type parameter T: Trait is guaranteed to consume and produce elements of exactly the same type T; it cannot be invoked with parameters of different types that both implement Trait.
Disadvantages of generics
Code size. Specializing generic functions means that the function body is duplicated. The increase in code size must be weighed against the performance benefits of static dispatch.
Homogeneous types. This is the other side of the "precise types" coin: if T is a type parameter, it stands for a single actual type. So for example a Vec<T> contains elements of a single concrete type (and, indeed, the vector representation is specialized to lay these out in line). Sometimes heterogeneous collections are useful; see trait objects.
Signature verbosity. Heavy use of generics can make it more difficult to read and understand a function's signature.
Examples from the standard library
std::fs::File::open takes an argument of generic type AsRef<Path>. This allows files to be opened conveniently from a string literal "f.txt", a Path, an OsString, and a few other types.
Traits are object-safe if they may be useful as a trait object (C-OBJECT)
Trait objects have some significant limitations: methods invoked through a trait object cannot use generics, and cannot use Self except in receiver position.
When designing a trait, decide early on whether the trait will be used as an object or as a bound on generics.
If a trait is meant to be used as an object, its methods should take and return trait objects rather than use generics.
A where clause of Self: Sized may be used to exclude specific methods from the trait's object. A trait with a generic method is not object-safe.
Adding a requirement of Self: Sized to the generic method excludes it from the trait object and makes the trait object-safe.
Advantages of trait objects
Disadvantages of trait objects
No Self. Except for the method receiver argument, methods on trait objects cannot use the Self type.
Examples from the standard library
The io::Read and io::Write traits are often used as objects.
The Iterator trait has several generic methods marked with where Self: Sized to retain the ability to use Iterator as an object.
Type safety
Newtypes provide static distinctions (C-NEWTYPE)
Newtypes can statically distinguish between different interpretations of an underlying type.
For example, an f64 value might be used to represent a quantity in miles or in kilometers. Using newtypes, we can keep track of the intended interpretation.
Once we have separated these two types, we can statically ensure that we do not confuse them. For example, a function that takes a Miles argument cannot accidentally be called with a Kilometers value. The compiler will remind us to perform the conversion, thus averting certain catastrophic bugs.
Arguments convey meaning through types, not bool or Option (C-CUSTOM-TYPE)
Prefer passing descriptive values such as Small and Round over passing bare true and false.
Core types like bool, u8 and Option have many possible interpretations.
Use custom types (whether enums, structs, or tuples) to convey interpretation and invariants. In the above example, it is not immediately clear what true and false are conveying without looking up the argument names, but Small and Round are more suggestive.
Using custom types makes it easier to expand the options later on, for example by adding an ExtraLarge variant.
See the newtype pattern for a no-cost way to wrap existing types with a distinguished name.
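A sketch of the custom-type approach: the Small and Round values come from the text above, while the Widget, Size, and Shape names and the remaining variants are assumptions made for this illustration.

```rust
/// Hypothetical argument types (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Size {
    Small,
    Medium,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Shape {
    Round,
    Square,
}

pub struct Widget {
    size: Size,
    shape: Shape,
}

impl Widget {
    /// The call site reads Widget::new(Size::Small, Shape::Round), which is
    /// self-explanatory in a way that Widget::new(true, false) is not.
    pub fn new(size: Size, shape: Shape) -> Self {
        Widget { size, shape }
    }

    pub fn describe(&self) -> String {
        format!("{:?} {:?} widget", self.size, self.shape)
    }
}
```

Adding a further size later only requires a new enum variant, whereas a bool parameter would have to change type.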
Types for a set of flags are bitflags, not enums (C-BITFLAG)
Rust supports enum types with explicitly specified discriminants:
Custom discriminants are useful when an enum type needs to be serialized to an integer value compatibly with some other system/language. They support "typesafe" APIs: by taking a Color, rather than an integer, a function is guaranteed to get well-formed inputs, even if it later views those inputs as integers.
An enum allows an API to request exactly one choice from among many. Sometimes an API's input is instead the presence or absence of a set of flags. In C code, this is often done by having each flag correspond to a particular bit, allowing a single integer to represent, say, 32 or 64 flags. Rust's bitflags crate provides a typesafe representation of this pattern.
Builders enable construction of complex values (C-BUILDER)
Some data structures are complicated to construct, for example because their construction needs many inputs or optional configuration data, which can easily lead to a large number of distinct constructors with many arguments each.
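If T is such a data structure, consider introducing a T builder:
Introduce a separate data type TBuilder for incrementally configuring a T value. When possible, choose a better name: e.g. Command is the builder for a child process, Url can be created from a ParseOptions.
The builder constructor should take as parameters only the data required to make a T.
The builder should offer a suite of convenient methods for configuration. These methods should return self to allow chaining.
The builder should provide one or more "terminal" methods for actually building a T.
The builder pattern is especially appropriate when building a T involves side effects, such as spawning a task or launching a process.
In Rust, there are two variants of the builder pattern, differing in the treatment of ownership, as described below.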
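The general shape of a builder can be sketched as follows. Request and RequestBuilder are hypothetical types; this sketch uses the borrowing style, in which configuration methods take and return &mut self and the terminal method only needs a shared reference.

```rust
/// Hypothetical value that is awkward to construct directly (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub url: String,
    pub headers: Vec<(String, String)>,
    pub timeout_secs: u64,
}

pub struct RequestBuilder {
    request: Request,
}

impl RequestBuilder {
    /// The builder constructor takes only the required data.
    pub fn new(url: &str) -> Self {
        RequestBuilder {
            request: Request { url: url.to_owned(), headers: Vec::new(), timeout_secs: 30 },
        }
    }

    /// Configuration methods return &mut Self to allow chaining.
    pub fn header(&mut self, name: &str, value: &str) -> &mut Self {
        self.request.headers.push((name.to_owned(), value.to_owned()));
        self
    }

    pub fn timeout_secs(&mut self, secs: u64) -> &mut Self {
        self.request.timeout_secs = secs;
        self
    }

    /// Terminal method: builds the final value without consuming the builder.
    pub fn build(&self) -> Request {
        self.request.clone()
    }
}
```

This supports both a one-liner such as RequestBuilder::new(url).timeout_secs(5).build() and step-by-step configuration through a mutable local variable.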
Non-consuming builders (preferred):
In some cases, constructing the final T does not require the builder itself to be consumed. The following variant on std::process::Command is one example:
Note that the spawn method, which actually uses the builder configuration to spawn a process, takes the builder by immutable reference. This is possible because spawning the process does not require ownership of the configuration data.
Because the terminal spawn method only needs a reference, the configuration methods take and return a mutable borrow of self.
The benefit
By using borrows throughout, Command can be used conveniently for both one-liner and more complex constructions:
Consuming builders:
Sometimes builders must transfer ownership when constructing the final type T, meaning that the terminal methods must take self rather than &self.
Here, the stdout configuration involves passing ownership of an io::Write, which must be transferred to the task upon construction (in spawn).
When the terminal methods of the builder require ownership, there is a basic tradeoff:
If the other builder methods take/return a mutable borrow, the complex configuration case will work well, but one-liner configuration becomes impossible.
If the other builder methods take/return an owned self, one-liners continue to work well but complex configuration is less convenient.
Under the rubric of making easy things easy and hard things possible, all builder methods for a consuming builder should take and return an owned self. Then client code works as follows:
One-liners work as before, because ownership is threaded through each of the builder methods until being consumed by spawn. Complex configuration, however, is more verbose: it requires re-assigning the builder at each step.
Dependability
Functions validate their arguments (C-VALIDATE)
Rust APIs do not generally follow the robustness principle: "be conservative in what you send; be liberal in what you accept".
Instead, Rust code should enforce the validity of input whenever practical.
Enforcement can be achieved through the following mechanisms (listed in order of preference).
Static enforcement:
Choose an argument type that rules out bad inputs.
For example, prefer a function that takes an Ascii argument over one that takes a plain u8, where Ascii is a wrapper around u8 that guarantees the highest bit is zero; see newtype patterns for more details on creating typesafe wrappers.
Static enforcement usually comes at little run-time cost: it pushes the costs to the boundaries (e.g. when a u8 is first converted into an Ascii). It also catches bugs early, during compilation, rather than through run-time failures.
On the other hand, some properties are difficult or impossible to express using types.
Dynamic enforcement:
Validate the input as it is processed (or ahead of time, if necessary). Dynamic checking is often easier to implement than static checking, but has several downsides:
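Runtime overhead (unless checking can be done as part of processing the input).
Delayed detection of bugs.
Introduces failure cases, either via `panic!` or `Result`/`Option` types, which must then be dealt with by client code.
Dynamic enforcement with `debug_assert!`:
Same as dynamic enforcement, but with the possibility of easily turning off expensive checks for production builds.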
Dynamic enforcement with opt-out:
Same as dynamic enforcement, but adds sibling functions that opt out of the checking.
The convention is to mark these opt-out functions with a suffix like `_unchecked` or by placing them in a `raw` submodule.
The unchecked functions can be used judiciously in cases where (1) performance dictates avoiding checks and (2) the client is otherwise confident that the inputs are valid.
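The four mechanisms above can be sketched on a single type. This is a minimal illustration, assuming an `Ascii` newtype like the one the text mentions; the method names `new`, `new_unchecked`, and `as_u8` and the function `to_upper` are hypothetical and chosen only for this example.

```rust
/// A byte that is guaranteed to be ASCII (highest bit zero).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ascii(u8);

impl Ascii {
    /// Dynamic enforcement at the boundary: the check happens once, here,
    /// and the failure case is surfaced to the caller as `None`.
    pub fn new(byte: u8) -> Option<Ascii> {
        if byte.is_ascii() {
            Some(Ascii(byte))
        } else {
            None
        }
    }

    /// Opt-out sibling: the caller promises that `byte` is ASCII.
    /// The `debug_assert!` still catches mistakes in debug builds and
    /// compiles away when debug assertions are disabled.
    pub fn new_unchecked(byte: u8) -> Ascii {
        debug_assert!(byte.is_ascii(), "byte {:#x} is not ASCII", byte);
        Ascii(byte)
    }

    pub fn as_u8(self) -> u8 {
        self.0
    }
}

/// Static enforcement: the signature rules out non-ASCII input,
/// so the body needs no validation of its own.
pub fn to_upper(a: Ascii) -> Ascii {
    Ascii(a.0.to_ascii_uppercase())
}
```

A caller would typically convert once with `Ascii::new(b'a')` and then pass the resulting value through functions like `to_upper` without further checks.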
Destructors never fail (C-DTOR-FAIL)
Destructors are executed while a thread is panicking, and in that context a failing destructor causes the program to abort.
Instead of failing in a destructor, provide a separate method for checking for clean teardown, e.g. a `close` method, that returns a `Result` to signal problems.
Destructors that may block have alternatives (C-DTOR-BLOCK)
Similarly, destructors should not invoke blocking operations, which can make debugging much more difficult. Again, consider providing a separate method for preparing for an infallible, nonblocking teardown.
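Both destructor guidelines can be sketched together. The example below assumes a hypothetical `BufferedLog` wrapper (not a standard library type) that buffers lines in memory: `close` is the explicit teardown that reports errors, while `Drop` only makes a best-effort attempt and never panics.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper that buffers log lines and writes them out on teardown.
pub struct BufferedLog<W: Write> {
    inner: Option<W>,
    pending: Vec<u8>,
}

impl<W: Write> BufferedLog<W> {
    pub fn new(inner: W) -> Self {
        BufferedLog { inner: Some(inner), pending: Vec::new() }
    }

    pub fn log(&mut self, line: &str) {
        self.pending.extend_from_slice(line.as_bytes());
        self.pending.push(b'\n');
    }

    /// Explicit teardown: writes the buffered data and reports any failure.
    /// On success the underlying writer is handed back to the caller.
    pub fn close(mut self) -> io::Result<W> {
        // Taking the writer out means the destructor has nothing left to do.
        let mut inner = self.inner.take().expect("writer is only taken in close");
        inner.write_all(&self.pending)?;
        inner.flush()?;
        Ok(inner)
    }
}

impl<W: Write> Drop for BufferedLog<W> {
    fn drop(&mut self) {
        // Best effort only: errors are deliberately ignored, never turned
        // into a panic. Callers who care about failures call `close`.
        if let Some(inner) = self.inner.as_mut() {
            let _ = inner.write_all(&self.pending);
            let _ = inner.flush();
        }
    }
}
```

A stricter reading of C-DTOR-BLOCK would skip the write in `drop` entirely when the underlying writer may block, leaving `close` as the only path that performs I/O.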
Debuggability
All public types implement `Debug` (C-DEBUG)
If there are exceptions, they are rare.
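In practice most types can simply derive `Debug`, and a type with sensitive contents can implement it by hand. The sketch below uses hypothetical types (`Settings`, `Secret`) and a helper `debug_strings` invented for this example; the last two entries show that the standard library prints something even for empty values.

```rust
use std::fmt;

/// Deriving `Debug` covers most public types.
#[derive(Debug, Default)]
pub struct Settings {
    name: String,
    retries: u32,
}

/// A type whose contents should not be printed can still implement `Debug`.
pub struct Secret(String);

impl Secret {
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Hide the value but keep the representation non-empty.
        f.write_str("Secret(<redacted>)")
    }
}

/// Collects a few `Debug` representations for inspection.
pub fn debug_strings() -> Vec<String> {
    vec![
        format!("{:?}", Settings::default()),
        format!("{:?}", Secret("hunter2".to_string())),
        format!("{:?}", ""),
        format!("{:?}", Vec::<bool>::new()),
    ]
}
```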
`Debug` representation is never empty (C-DEBUG-NONEMPTY)
Even for conceptually empty values, the `Debug` representation should never be empty.
Future proofing
Structs have private fields (C-STRUCT-PRIVATE)
Making a field public is a strong commitment: it pins down a representation choice, and prevents the type from providing any validation or maintaining any invariants on the contents of the field, since clients can mutate it arbitrarily.
Public fields are most appropriate for `struct` types in the C spirit: compound, passive data structures. Otherwise, consider providing getter/setter methods and hiding fields instead.
Newtypes encapsulate implementation details (C-NEWTYPE-HIDE)
A newtype can be used to hide representation details while making precise promises to the client.
For example, consider a function `my_transform` that returns a compound iterator type (for instance `Enumerate<Skip<I>>`).
We wish to hide this type from the client, so that the client's view of the return type is roughly `Iterator<Item = (usize, T)>`. We can do so using the newtype pattern.
Aside from simplifying the signature, this use of newtypes allows us to promise less to the client. The client does not know how the result iterator is constructed or represented, which means the representation can change in the future without breaking client code.
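A minimal sketch of this pattern follows. The wrapper name `MyTransformResult` and the concrete pipeline (`skip(3)` followed by `enumerate()`) are assumptions chosen for illustration; any adaptor chain yielding `(usize, I::Item)` would do.

```rust
use std::iter::{Enumerate, Skip};

/// Opaque wrapper: clients only learn that this is an iterator of
/// `(usize, I::Item)`, not how it is built.
pub struct MyTransformResult<I>(Enumerate<Skip<I>>);

impl<I: Iterator> Iterator for MyTransformResult<I> {
    type Item = (usize, I::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // Delegate to the hidden adaptor chain.
        self.0.next()
    }
}

/// The signature names the newtype instead of the compound adaptor type.
pub fn my_transform<I: Iterator>(input: I) -> MyTransformResult<I> {
    MyTransformResult(input.skip(3).enumerate())
}
```

Because the field is private, the body of `my_transform` could later switch to a different adaptor chain without changing the public signature.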
The same thing can be accomplished more concisely with the `impl Trait` feature, which was unstable when this guideline was written and has been stable in return position since Rust 1.26.
Necessities
Public dependencies of a stable crate are stable (C-STABLE)
A crate cannot be stable (>=1.0.0) without all of its public dependencies being stable.
Public dependencies are crates from which types are used in the public API of the current crate.
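A crate containing a public function that takes or returns a type from `other_crate` cannot be stable unless `other_crate` is also stable.
Be careful because public dependencies can sneak in at unexpected places, for example through a `From` impl on a public type.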
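The sketch below shows both the obvious case and a less obvious one. To keep it self-contained, `other_crate` is modeled as a local module standing in for an external, pre-1.0 dependency; the names `TheirThing`, `do_my_thing`, `ErrorImpl`, and `is_dependency_error` are hypothetical.

```rust
use std::io;

/// Stand-in for an external, pre-1.0 dependency (hypothetical).
pub mod other_crate {
    #[derive(Debug)]
    pub struct Error;

    pub struct TheirThing;
}

/// Obvious public dependency: the foreign type appears in the signature.
pub fn do_my_thing(_arg: other_crate::TheirThing) {}

/// Opaque error type whose representation is private.
#[derive(Debug)]
pub struct Error {
    private: ErrorImpl,
}

#[derive(Debug)]
enum ErrorImpl {
    Io(io::Error),
    // Fine even if `other_crate` is unstable, because `ErrorImpl` is private.
    Dep(other_crate::Error),
}

impl Error {
    pub fn is_dependency_error(&self) -> bool {
        matches!(self.private, ErrorImpl::Dep(_))
    }
}

impl From<io::Error> for Error {
    fn from(err: io::Error) -> Self {
        Error { private: ErrorImpl::Io(err) }
    }
}

// Less obvious: this impl puts `other_crate::Error` into the public API
// of the current crate, even though the field holding it is private.
impl From<other_crate::Error> for Error {
    fn from(err: other_crate::Error) -> Self {
        Error { private: ErrorImpl::Dep(err) }
    }
}
```

Dropping the second `From` impl and converting inside the crate instead would keep `other_crate` out of the public surface.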
Crate and its dependencies have a permissive license (C-PERMISSIVE)
The software produced by the Rust project is dual-licensed, under either the MIT or Apache 2.0 licenses. Crates that simply need the maximum compatibility with the Rust ecosystem are recommended to do the same, in the manner described herein. Other options are described below.
These API guidelines do not provide a detailed explanation of Rust's license, but there is a small amount said in the Rust FAQ. These guidelines are concerned with matters of interoperability with Rust, and are not comprehensive over licensing options.
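To apply the Rust license to your project, define the `license` field in your `Cargo.toml` as `license = "MIT OR Apache-2.0"`.
And toward the end of your README.md, add a license section stating that the project is licensed under either the Apache License, Version 2.0 or the MIT license, at your option.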
Besides the dual MIT/Apache-2.0 license, another common licensing approach used by Rust crate authors is to apply a single permissive license such as MIT or BSD. This license scheme is also entirely compatible with Rust's, because it imposes the minimal restrictions of Rust's MIT license.
Crates that desire perfect license compatibility with Rust are not recommended to choose only the Apache license. The Apache license, though it is a permissive license, imposes restrictions beyond the MIT and BSD licenses that can discourage or prevent their use in some scenarios, so Apache-only software cannot be used in some situations where most of the Rust runtime stack can.
The license of a crate's dependencies can affect the restrictions on distribution of the crate itself, so a permissively-licensed crate should generally only depend on permissively-licensed crates.
External Links
License
This guidelines document is licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this document by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.