Open woodruffw opened 2 months ago
I think the expected behaviour sounds reasonable.
There is a related question to consider -- in a scenario where you have "distributed workers", maybe what you really want is a bunch of "read-only" workers that operate without ever connecting to the repository (at least for metadata), and one writing tuf client that actually does the updates at regular intervals.
Previously we tried to make an offline mode that would be user friendly -- usable by CLI apps -- and that turned out complicated (compared to the potential advantages). The "offline mode" described above (where it's ok to just immediately fail if the local metadata is not up-to-date and someone promises to keep it updated) would be simple to add.
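To make the "immediately fail" behaviour concrete, here is a minimal sketch of a read-only check, using only the standard library. It assumes the cached metadata is stored as `<role>.json` in a local directory and relies on the `signed.expires` timestamp from the TUF metadata format. The names `read_fresh_metadata` and `StaleMetadataError` are hypothetical and not part of `tuf.ngclient`, and the sketch deliberately skips signature verification, which a real client must still perform.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


class StaleMetadataError(Exception):
    """Cached metadata is missing or expired (hypothetical error type)."""


def read_fresh_metadata(metadata_dir, role, now=None):
    """Return the 'signed' part of cached <role>.json, or fail immediately.

    This never contacts a repository and never writes to metadata_dir.
    NOTE: no signature verification happens here; this only illustrates
    the fail-fast expiry check.
    """
    now = now or datetime.now(timezone.utc)
    path = Path(metadata_dir) / f"{role}.json"
    try:
        signed = json.loads(path.read_text())["signed"]
    except FileNotFoundError as exc:
        raise StaleMetadataError(f"no cached {role} metadata") from exc

    # TUF metadata records expiry as a UTC timestamp, e.g. 2030-01-01T00:00:00Z
    expires = datetime.strptime(
        signed["expires"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    if expires <= now:
        raise StaleMetadataError(f"cached {role} metadata expired at {expires}")
    return signed
```

A read-only worker would call something like `read_fresh_metadata("/srv/tuf-metadata", "timestamp")` (the path is a placeholder) and treat `StaleMetadataError` as a signal that the single writing client has fallen behind.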
"dumb read-only mode" or IO abstraction (or both) sound like things that could be added as optional features to ngclient.
- (`StorageBackendInterface` or something)
- `find_cached_target` and `download_target` should work as well, we'll just need to make sure the optional filepath argument still makes sense -- likely that only makes sense with the default filesystem implementation

> There is a related question to consider -- in a scenario where you have "distributed workers", maybe what you really want is a bunch of "read-only" workers that operate without ever connecting to the repository (at least for metadata), and one writing tuf client that actually does the updates at regular intervals

Thanks for extrapolating this! This is indeed the underlying scenario, and probably is a more accurate encapsulation of what I actually need 🙂
Description of issue or feature request:
Right now, `tuf.ngclient` is heavily tied to local system I/O: it assumes a metadata directory on disk that can be read/written. For example: https://github.com/theupdateframework/python-tuf/blob/4d2ff8d37d30e94dbc0fe2cfa42bd46d2bb72414/tuf/ngclient/updater.py#L293-L312
This is problematic in distributed worker setups like Warehouse (PyPI), where each worker has its own container/entire VM and thus can't easily share on-disk TUF repos. In particular, this causes both reliability and security concerns:
This problem was noted a few years back, before `tuf.ngclient` was created: https://github.com/theupdateframework/python-tuf/issues/1009. The solution then was to add a filesystem abstraction to the `tuf.metadata` APIs, which was done via https://github.com/secure-systems-lab/securesystemslib/pull/232 and https://github.com/theupdateframework/python-tuf/issues/1009. However, this abstraction wasn't added to the `ngclient` APIs, only to the low-level `metadata` ones.
Current behavior:
`tuf.ngclient` currently assumes that it can perform persistent local I/O for its repository.
Expected behavior:
`tuf.ngclient` should support an I/O abstraction (such as the pre-existing `StorageBackendInterface`, if suitable) for persistent repo operations, enabling use in distributed deployments.
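For illustration, here is a rough sketch of the shape such an abstraction could take. Everything in it is an assumption: `MetadataStore`, `FilesystemStore`, `InMemoryStore` and `persist_if_changed` are hypothetical names, not existing `tuf.ngclient` or securesystemslib APIs, and the in-memory class only stands in for whatever shared backend (object storage, database, cache) a deployment would actually use.

```python
from pathlib import Path
from typing import Dict, Optional, Protocol


class MetadataStore(Protocol):
    """Hypothetical interface for where a client keeps trusted metadata."""

    def load(self, role: str) -> Optional[bytes]: ...

    def store(self, role: str, data: bytes) -> None: ...


class FilesystemStore:
    """Mirrors the current assumption: one <role>.json file per role on disk."""

    def __init__(self, metadata_dir: str) -> None:
        self._dir = Path(metadata_dir)

    def load(self, role: str) -> Optional[bytes]:
        path = self._dir / f"{role}.json"
        return path.read_bytes() if path.exists() else None

    def store(self, role: str, data: bytes) -> None:
        # Write to a temporary file, then rename over the final name,
        # so readers never observe a partially written file.
        tmp = self._dir / f"{role}.json.tmp"
        tmp.write_bytes(data)
        tmp.replace(self._dir / f"{role}.json")


class InMemoryStore:
    """Stand-in for a shared, non-filesystem backend."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def load(self, role: str) -> Optional[bytes]:
        return self._blobs.get(role)

    def store(self, role: str, data: bytes) -> None:
        self._blobs[role] = data


def persist_if_changed(store: MetadataStore, role: str, data: bytes) -> bool:
    """Store already-verified metadata bytes; return True if a write happened."""
    if store.load(role) == data:
        return False
    store.store(role, data)
    return True
```

With an interface along these lines, the code that persists metadata would depend only on `load` and `store`, so the same logic could run against a local directory or a shared backend, e.g. `persist_if_changed(InMemoryStore(), "root", b"...")`.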