Repository update races with clients

znewman01 commented 2 years ago

Two related issues:

If you don't publish the repository careful (i.e., in the same order that the spec tells you to update), you can run into lots of race conditions.
Even if you do publish the repository in-order and have consistent snapshots turned on, there's still a possible race if this happens halfway through the client flow.

There are a number of race conditions which can happen if the repository update is not atomic. For instance, if you publish a repository over HTTP and scp -R the files to the repository host, they might get copied in a weird order. I think this is pretty common in real deployments: the repository manipulation is done one one machine, then gets pushed to mirrors. The point of consistent snapshots, as I understand them, was to prevent this from happening, but they require that the push be done in the correct order (i.e., files are deployed in the order described in the spec).

But even if you can deploy atomically, there's still a race when timestamp keys are updated:

Client initiates update, grabs all the root metadata
Repo publishes (atomically) an updating rotating the timestamp keys, and including a timestamp with the new key.
Client continues with update, grabs the timestamp file, and verifying signature fails.

In general, there's pretty limited guidance in the specification here. I don't know that it's worth trying to eliminate this one race. But maybe some client guidance about retries would be helpful.

EDIT: fix a typo

JustinCappos commented 2 years ago

Better guidance for clients makes sense to me. One would expect a single retry would resolve this. If a more pathological case is happening, it feels more like a repository trying to prevent an update than an actual race condition...

lukpueh commented 2 years ago

I agree with @JustinCappos that a single retry after a short wait should mitigate the problem with changing keys in the delegating metadata (root or targets) in the middle of an update, because it is unlikely that this happens multiple times in rapid succession.

For the timestamp-snapshot-targets relationship, on the other hand, this could be a real problem. In the worst case, a client could always see timestamp metadata, which references snapshot metadata that is not yet available (or snapshot, which references targets that are not yet available).

IMO it's a good idea to add a recommendation to publish a given consistent snapshot either in a transaction, or in bottom-up order (1. targets, 2. snapshot, 3. timestamp). #91 is probably a good fit for this.

joshuagl commented 2 years ago

Good discussion here, thank you all. I think we should indeed address both: a) adding guidance about retries to the client workflow b) capturing repository publishing recommendations (atomic/single transaction OR bottom-up)

Let's address b) in #91 as @lukpueh suggested and a) in this issue?

theupdateframework / specification

Repository update races with clients #223