notaryproject / tuf

The Update Framework for OCI Registries

Scaling timestamp and snapshot #5

Open mnm678 opened 3 years ago

mnm678 commented 3 years ago

Snapshot metadata needs to scale to the registry.

In addition, timestamp metadata must be frequently updated to remain valid. As it is frequently updated, users will need a mechanism to discover the most recent version of timestamp metadata on the registry and the registry will need to delete old versions.
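As a rough illustration of the discovery and cleanup problem described above, here is a minimal Python sketch. It assumes each stored timestamp is represented by a dict with a `version` number and an `expires` datetime. That layout and the helper names `latest_valid` and `prunable` are hypothetical and are not taken from the TUF specification or any registry API.

```python
from datetime import datetime, timezone


def latest_valid(timestamps, now):
    """Return the highest-version entry that has not expired, or None."""
    valid = [t for t in timestamps if t["expires"] > now]
    return max(valid, key=lambda t: t["version"], default=None)


def prunable(timestamps, keep=1):
    """Return the versions a registry could delete, keeping the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(timestamps, key=lambda t: t["version"])
    return [t["version"] for t in ordered[:-keep]]


# Hypothetical stored timestamp entries (assumed layout, for illustration only).
entries = [
    {"version": 1, "expires": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"version": 2, "expires": datetime(2024, 1, 3, tzinfo=timezone.utc)},
    {"version": 3, "expires": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
current = latest_valid(entries, now)
old_versions = prunable(entries, keep=1)
```

The sketch only covers selection and pruning once a list of versions is available. How a client obtains that list from a registry without an efficient listing operation is the open question in this issue.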

Relevant proposals include TAP 16 and Transparency Logs.

This issue is part of #2

justincormack commented 3 years ago

Registries are usually built on data stores like Amazon S3 that don't have efficient, consistent list operations, so snapshots are hard. Some implementations will use a database of content, but adding a requirement to keep generated documents that list the whole contents up to date is very burdensome.

sudo-bmitch commented 3 years ago

This is one of those areas where I feel an external server that handles the timestamps should also handle the concurrency issues that arise when different builds update the snapshot at the same time. To scale a large snapshot, we could shard the data, using an OCI index that points to a tree of OCI indices and manifests/artifacts. It then becomes a balance between shard size (bandwidth) and the number of shards (round trip time) so that it is efficient for both clients and registries.
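A minimal Python sketch of the sharding idea, under stated assumptions: targets are bucketed by a hex prefix of the SHA-256 of their name, each bucket becomes a simplified OCI-index-shaped document, and a root index points at the shard indices. The bucketing rule, the annotation key `example.shard.prefix`, and the function names are all hypothetical. Only the `schemaVersion`, `mediaType`, `manifests`, `digest`, `size`, and `annotations` fields follow the OCI image index and descriptor layout.

```python
import hashlib
import json

INDEX_MEDIA_TYPE = "application/vnd.oci.image.index.v1+json"
MANIFEST_MEDIA_TYPE = "application/vnd.oci.image.manifest.v1+json"


def shard_key(name, prefix_len=1):
    """Pick a shard from the first hex characters of the name's SHA-256."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()[:prefix_len]


def make_index(descriptors):
    """Build a simplified OCI-index-shaped document."""
    return {"schemaVersion": 2, "mediaType": INDEX_MEDIA_TYPE, "manifests": descriptors}


def describe(doc, media_type, annotations=None):
    """Build a descriptor (digest and size) for a JSON document."""
    blob = json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    desc = {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    if annotations:
        desc["annotations"] = annotations
    return desc


def build_sharded_snapshot(targets, prefix_len=1):
    """targets maps name -> descriptor; returns (root_index, shards by key)."""
    buckets = {}
    for name in sorted(targets):
        buckets.setdefault(shard_key(name, prefix_len), []).append(targets[name])
    shards = {key: make_index(descs) for key, descs in sorted(buckets.items())}
    root = make_index([
        # "example.shard.prefix" is a made-up annotation key for this sketch.
        describe(idx, INDEX_MEDIA_TYPE, {"example.shard.prefix": key})
        for key, idx in shards.items()
    ])
    return root, shards


# Fifty hypothetical targets, each with a placeholder manifest descriptor.
targets = {
    f"repo/image-{i}": describe({"name": f"repo/image-{i}"}, MANIFEST_MEDIA_TYPE)
    for i in range(50)
}
root, shards = build_sharded_snapshot(targets, prefix_len=1)
```

With `prefix_len=1` there are at most 16 shards, so a client looking up one target fetches the root index and a single shard. Raising `prefix_len` shrinks each shard but grows the root index, and adding further tree levels adds round trips, which is the size versus round trip balance mentioned above.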