Pinning Exact Versions of Dependencies

So far our policy for pinning dependencies is implicit? (I didn't find any documentation around this anywhere). So, it would be good to make this explicit and address a few concerns

Why pin dependencies?

From a security minded point of view, the biggest reason is to protect from supply chain attacks.
Reproducibility: If it builds now, it will build tomorrow. I have seen things break in practice, even with "semantically compatible" versions.

Current Situation

By default, Rust + Cargo rely on semantic versioning to choose a dependency version. Looking at our top level cargo.toml file:

rustls = "0.20"
rustls-pemfile = "1"
serde = "1"
serde_json = "1.0.85"

We should choose a consistent way to specify dependencies. Note even specifying serde_json = "1.0.85" does not guarantee Rust will use version 1.0.85 due to semantic versioning, Cargo could update us up to version < 2.0.0

What is our dependency version policy?

We should settle on a policy. Cargo's default policy is "use the latest and greatest version of a dependency". We are currently inheriting this model. That is, building the repository from scratch, cargo might choose different (but compatible) versions to the ones currently used by your local project.

Recommendation

I think we should strive to pin dependencies to specific version. We should explicitly update dependencies as needed manually.

Questions for us:

Assuming you agree on pinning dependencies. There are two ways of doing this:

We could strictly declare dependencies using the "caret requirement" e.g. ^1.2.3.
Use a cargo.lock file to pin dependencies for our binary crates (cargo.lock files are not used for libraries).

I think the cargo.lock file has some advantages:

It is auto generated by cargo. So no need for us to manually mess with it. It is easy to get into dependency hell if we end up trying to manually mess with dependencies too much.
We can easily update dependencies via cargo update
cargo.lock files are commonly used and standard practice. Unlike the more exotic "caret requirement"

Here is some background on specifying dependencies.

Note: Dependency Management is hard [CITATION NEEDED]. For brevity I left our a lot of context around, workspace and Cargo.lock. This issue assumes you're somewhat familiar with these topics. I'm happy to expand on any topic that is unclear to people.

Some historical context: I think we chose to not pin dependencies because we wanted to avoid the manual overhead of checking all our dependencies on a regular schedule to see if they've been updated. What do you think of a policy of pinning dependencies when we make releases to main so that our long-term products are less fragile for users, but our active development branches can stay up to date with the latest and greatest?

Thanks for the historical context.

I think we chose to not pin dependencies because we wanted to avoid the manual overhead of checking all our dependencies on a regular schedule to see if they've been updated

That makes sense and that is how I would choose to develop as well.

We don't have to come to a conclusion on this now, but it is good to start thinking about.

What do you think of a policy of pinning dependencies when we make releases to main so that our long-term products are less fragile for users, but our active development branches can stay up to date with the latest and greatest?

I think this has similar issues to our current way of doing things: We would have stable version on main for end users, but only until we choose to merge our develop branch onto main. So we are constantly updating, just at less frequent intervals, which as you said, would lead to better stability.

After giving this more thought, I realize I am proposing something more extreme:

Proposal

While pinning dependencies has some stability advantages, the main motivation for this proposal is security. I think there is two sides to the security of this project:

1) The cryptographic side. I feel good about this and we got plenty of smart people thinking about this! 2) The implementation/software side. I want to make sure we are thinking about this. Since a flaw in the engineering side could lead to a security issue.

For this proposal, I am mostly worried about a supply chain attack. (Note: I can add relevant links, background, or citation if needed here!).

Threat Model

(This section assume the most paranoid case). Assume we start with a set of crates (dependencies and any transitive dependencies) that are trusted. Any trusted crate which gets a version bump (i.e. the code changes in any way) should no longer be consider safe unless re-reviewed.

An example of a real world attack that has happened: A maintainer of a crate no longer has time to maintain their library. An attacker volunteers to take over maintenance of the crate. The attacker (purposely) introduces a vulnerability or malicious code into the codebase and bumps the minor version number. Anyone who isn't pinning dependencies may update to the latest version of the code and become compromised.

This is most likely with small crates that are maintained by a single person and may not have too many eyes on them.

Proposal

As an initial step, I propose checking in the cargo.lock file for every binary target of this repository. This will pin all our dependencies to the current versions we are using. A cargo update would be necessary to manually update a dependency.

So by default, we would freeze all dependencies and manually update them as needed. Note this is just an initial step, and there is future work to further nail down our security workflow. E.g. for now we can just update dependencies as needed without actually checking the changes to dependencies, a more strict workflow would require the engineer to review patches and the code if they want to update a dependency.

Issues with Using Old Versions

You may ask: Isn't it bad practice to use old, possibly outdated, version of libraries which may have bugs or security vulnerabilities?

The idea here is to rely on Github Dependabot to inform us of any security vulnerabilities that have been published for a dependency, when this happens we can update.

Similarly, if we identify a bug that is affecting us, which has been fixed on a newer version, we can update. This does mean we would no longer be getting bug fixes preemptively by always tracking the latest version of dependencies.

After giving this more thought, I realize I am proposing something more extreme:

Proposal

While pinning dependencies has some stability advantages, the main motivation for this proposal is security. I think there is two sides to the security of this project:

The cryptographic side. I feel good about this and we got plenty of smart people thinking about this!

The implementation/software side. I want to make sure we are thinking about this. Since a flaw in the engineering side could lead to a security issue.

For this proposal, I am mostly worried about a supply chain attack. (Note: I can add relevant links, background, or citation if needed here!).

Addressing your parenthetical note above: I don't think this is a necessary exercise. We are definitely concerned about supply chain attacks.

Threat Model

(This section assume the most paranoid case). Assume we start with a set of crates (dependencies and any transitive dependencies) that are trusted. Any trusted crate which gets a version bump (i.e. the code changes in any way) should no longer be consider safe unless re-reviewed.

An example of a real world attack that has happened: A maintainer of a crate no longer has time to maintain their library. An attacker volunteers to take over maintenance of the crate. The attacker (purposely) introduces a vulnerability or malicious code into the codebase and bumps the minor version number. Anyone who isn't pinning dependencies may update to the latest version of the code and become compromised.

This is most likely with small crates that are maintained by a single person and may not have too many eyes on them.

I agree with this threat model.

Proposal

As an initial step, I propose checking in the cargo.lock file for every binary target of this repository. This will pin all our dependencies to the current versions we are using. A cargo update would be necessary to manually update a dependency.

So by default, we would freeze all dependencies and manually update them as needed. Note this is just an initial step, and there is future work to further nail down our security workflow. E.g. for now we can just update dependencies as needed without actually checking the changes to dependencies, a more strict workflow would require the engineer to review patches and the code if they want to update a dependency.

I support this proposal.

Issues with Using Old Versions

You may ask: Isn't it bad practice to use old, possibly outdated, version of libraries which may have bugs or security vulnerabilities?

The idea here is to rely on Github Dependabot to inform us of any security vulnerabilities that have been published for a dependency, when this happens we can update.

Similarly, if we identify a bug that is affecting us, which has been fixed on a newer version, we can update. This does mean we would no longer be getting bug fixes preemptively by always tracking the latest version of dependencies.

Do we need to make addressing Dependabot issues/warnings an explicit, assigned responsibility? That is, is there a risk that we fail to follow through because everyone on the team is distracted by something else?

One downside of this approach is that the lock file is for the whole workspace. We'd have to put our binaries in other repos in order to commit their lock files.

Do we need to make addressing Dependabot issues/warnings an explicit, assigned responsibility? That is, is there a risk that we fail to follow through because everyone on the team is distracted by something else?

We should have a process for this, but some of them are unfixable without doing work on the dependencies ourselves. For example, the open vulnerability for the time crate comes from mongodb which hasn't been updated yet.

boltlabs-inc / key-mgmt