Closed woodruffw closed 1 month ago
While we're at it, we should make sure that TUF refreshes fail gracefully without a network connection.
@emboman13 @omartounsi7 and I were thinking about a possible solution for this. We believe a workaround to this could be to check for the offline flag being set around: https://github.com/sigstore/sigstore-python/blob/db998648010e91624b9b15a36ab6f1ad1b36bf42/sigstore/_internal/tuf.py#L151-L153
In the case where we are offline and there is no cached data, we would make sure the data is still empty and have the other _get functions handle that exception. This would entail passing the offline flag to the other _get functions for ctfe / rekor keys and fulcio certs.
Are we thinking about this correctly? Where do the TUF refreshes come into play?
There's a bigger conceptual/design challenge here: TUF's security/threat model assumes that the TUF repository can always be refreshed, since the way TUF handles things like revocations is by deleting the relevant key from the repo entirely. We'll need to figure out if and how sigstore-python should compromise on that model, if we choose to support a "full" offline mode.
In the case where we are offline and there is no cached data, we would make sure the data is still empty and have the other _get functions handle that exception. This would entail passing the offline flag to the other _get functions for ctfe / rekor keys and fulcio certs.
If offline is passed and there's no cached TUF state, we should probably produce a hard error (since there's nothing meaningful we can do, verification wise, if we don't have any root of trust).
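To illustrate the hard-error behaviour described above, here is a minimal sketch. It is not sigstore-python's actual code: the function name, the exception class, and the assumption that the cache is a directory containing a `root.json` file are all hypothetical.

```python
from pathlib import Path


class NoCachedTrustRootError(Exception):
    """Hypothetical error: offline mode was requested but nothing is cached."""


def require_cached_state(metadata_dir: Path, offline: bool) -> bool:
    """Return True if a cached root.json exists in metadata_dir.

    In offline mode a missing cache is a hard error: there is no root of
    trust to verify against and no way to fetch one.
    """
    has_cache = (metadata_dir / "root.json").is_file()
    if offline and not has_cache:
        raise NoCachedTrustRootError(
            f"offline mode requested but no cached TUF metadata in {metadata_dir}"
        )
    return has_cache
```

When online, the caller could treat a `False` return as "initialize the cache from the network" rather than as an error.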
Would a possible solution be adding a time stamp to the cached state and require it be updated every so often to function? Otherwise this does seem to be quite an impasse
Would a possible solution be adding a time stamp to the cached state and require it be updated every so often to function?
This would be the technical solution, but we'd need to work out how sigstore-python signals that it's effectively verifying in a "degraded" capacity (similar to how offline Rekor verification is already weaker than online verification).
Some kind of warning on stderr would probably be sufficient, but the most immediate step here is to coordinate with TUF and figure out if this is an already known use case/if they already have best practices written down somewhere.
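As a rough sketch of the timestamp idea combined with a stderr warning: the snippet below assumes the cache records when it was last refreshed, and uses a made-up maximum age of one week. The thread does not settle on any number, and all names here are hypothetical.

```python
import sys
import time
from typing import Optional

# Hypothetical policy value; no actual limit is agreed on in this thread.
MAX_CACHE_AGE_SECONDS = 7 * 24 * 60 * 60


class StaleCacheError(Exception):
    """Hypothetical error: cached TUF state is too old to use offline."""


def check_cached_state(
    fetched_at: float,
    now: Optional[float] = None,
    max_age: float = MAX_CACHE_AGE_SECONDS,
) -> float:
    """Return the cache age in seconds, warning on stderr about degraded mode.

    Raises StaleCacheError if the cache is older than max_age.
    """
    now = time.time() if now is None else now
    age = now - fetched_at
    if age > max_age:
        raise StaleCacheError(f"cached TUF state is {age:.0f}s old (limit {max_age:.0f}s)")
    print(
        f"warning: verifying offline with TUF state cached {age:.0f}s ago; "
        "revocations published since then are not visible",
        file=sys.stderr,
    )
    return age
```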
This is maybe slightly off-topic (or too high level) for this specific issue but possibly relevant, so I'll write this down:
There seem to be two TUF pain points for sigstore-python:
I think both of these "issues" would be reasonable, but I wanted to see an agreement on the use cases, preferably with more details than I have above before we try to fix things.... It's so easy to "fix" the wrong thing. @woodruffw can you confirm if the above use cases are correct, if there are any others to take into account, and if they have a priority order for you?
Oh and also: I think the decisions here are also very much sigstore system level decisions:
I'm happy to figure out possible solutions with sigstore-python and python-tuf but in the end the answers probably should have wider sigstore ecosystem consensus
- No offline mode: It should be possible to make a user decision to stay offline (in other words "I'm running 'sigstore verify' without a network connection but have TUF metadata and targets caches: would like to use them even if they are expired"):
One of the things we were looking at changing from your previous branch of Python TUF was either using 2 Booleans in the config or just making it a multi-valued integer, such that offline mode would either hard fail if the cached data was expired or try to fetch new up-to-date data online, depending on how the config was set.
@woodruffw can you confirm if the above use cases are correct, if there are any others to take into account, and if they have a priority order for you?
Those look correct to me! In terms of priority, I'd say (2) is higher priority than (1) at the moment -- IMO reducing roundtrips in the "online" case would be good for us to do, but doesn't reflect a current user pain point (at least, not one that's been reported to us).
- Is the description here accurate?
I think so -- the way I'd frame it is "I have all of the local materials needed for a Sigstore root of trust, and I don't want to do any network connections at all." This precludes (initial) support for the signing case, only verifying.
- IIRC you can give the bundle as input?
I actually don't think we directly support this, yet 😅 -- we have a couple of flags that effectively allow the user to build up the root of trust piece-by-piece, but not a flag that just says "use the trust bundle at <path> for everything." I think that would be good for us to add, though!
- how long can clients use key material without checking for new key material?
Hmm, I'm of a few different minds on this:
I think so -- the way I'd frame it is "I have all of the local materials needed for a Sigstore root of trust, and I don't want to do any network connections at all." This precludes (initial) support for the signing case, only verifying.
Can you clarify this a bit: Do you mean you expect the user to provide all key material (the sigstore root of trust) as input if they want to be offline, or is the idea that _internal.tuf module should be able to provide cached key material even if the TUF metadata is invalid because of expiry, when it's told to work "offline"?
The former (user provides all key material) sounds like just sigstore-python UI work (if it's not possible already), latter needs at least modifying the _internal.tuf module but likely should be a python-tuf feature -- this might not be completely trivial but it is a development I'd be interested in seeing.
- No offline mode: It should be possible to make a user decision to stay offline (in other words "I'm running 'sigstore verify' without a network connection but have TUF metadata and targets caches: would like to use them even if they are expired"):
One of the things we were looking at changing from your previous branch of Python TUF was either using 2 Booleans in the config or just making it a multi-valued integer, such that offline mode would either hard fail if the cached data was expired or try to fetch new up-to-date data online, depending on how the config was set.
Note that my branch does not attempt to solve the offline mode case at all; it's only trying to make the root and timestamp requests a little less often. Implementation of "offline mode" (IOW serving cached targets even if metadata is expired) in python-tuf would likely look different (and as I mentioned in a previous comment, I don't yet know 100% if it is what sigstore-python wants).
I do agree the "fail fast if any network requests are absolutely needed" would make sense if an "offline mode" was added in python-tuf.
I do agree the "fail fast if any network requests are absolutely needed" would make sense if an "offline mode" was added in python-tuf.
I should've been more clear with my words; this is what we had meant. We wouldn't be making too large of a deviation from what you previously had, mostly just adding an additional flag that would make lazy refresh hard fail instead of grabbing new metadata if the metadata is expired. Then on the Sigstore side we would largely just be dealing with setting appropriate expiry times and passing different config files based on whether --offline (or even an additional --lazy-refresh flag) was set. That would make both a hard offline mode and your existing lazy refresh accessible to Sigstore users.
This seems like a reasonable potential solution to start work on while specifics on expiry standards are finalized.
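For illustration, the "multi-valued config" variant of this proposal might be sketched as below. This is not python-tuf's or sigstore-python's API: the enum, the exception, and both helper functions are hypothetical, and `--lazy-refresh` is only the tentative flag name floated above.

```python
from enum import Enum


class RefreshPolicy(Enum):
    """Hypothetical three-way setting replacing two separate booleans."""

    ALWAYS = 0   # refresh metadata on every run
    LAZY = 1     # use the cache while valid; fetch once it has expired
    OFFLINE = 2  # use the cache while valid; hard fail once it has expired


class ExpiredMetadataError(Exception):
    """Hypothetical error: cached metadata expired and fetching is not allowed."""


def policy_from_flags(offline: bool, lazy_refresh: bool) -> RefreshPolicy:
    """Map the CLI flags to a policy; --offline takes precedence."""
    if offline:
        return RefreshPolicy.OFFLINE
    if lazy_refresh:
        return RefreshPolicy.LAZY
    return RefreshPolicy.ALWAYS


def should_fetch(policy: RefreshPolicy, cache_expired: bool) -> bool:
    """Decide whether to hit the network, or fail fast in offline mode."""
    if policy is RefreshPolicy.ALWAYS:
        return True
    if not cache_expired:
        return False
    if policy is RefreshPolicy.LAZY:
        return True
    raise ExpiredMetadataError("cached TUF metadata is expired and --offline is set")
```

A single enum avoids the ambiguous fourth state that two independent booleans would allow (offline and lazy both set).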
Can you clarify this a bit: Do you mean you expect the user to provide all key material (the sigstore root of trust) as input if they want to be offline, or is the idea that _internal.tuf module should be able to provide cached key material even if the TUF metadata is invalid because of expiry, when it's told to work "offline"?
I was thinking of it as the former, but I could be (dis)convinced of either approach 🙂
I agree the former would primarily be UI work, rather than TUF work -- in effect it'd just be something like sigstore verify identity --offline --bundle /path/to/trust/bundle, which would cause us to read the specified trust bundle rather than attempting to update the TUF repository.
My thinking there was that the default value of --bundle would be whatever's already in the TUF repo, if it's been initialized. If it hasn't, then using --offline would produce an error. I think that should be fine, but I might have missed something!
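The defaulting rule described above could be sketched with argparse as follows. This is only an illustration of the proposal, not the real sigstore CLI: the parser, the `TrustBundleError` class, the helper name, and the idea that the cached trust bundle lives at a single file path are all assumptions.

```python
import argparse
from pathlib import Path


class TrustBundleError(Exception):
    """Hypothetical error: no trust bundle is available for offline use."""


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with only the two flags discussed here."""
    parser = argparse.ArgumentParser(prog="verify-sketch")
    parser.add_argument("--offline", action="store_true")
    parser.add_argument("--bundle", type=Path, default=None)
    return parser


def resolve_offline_bundle(args: argparse.Namespace, cached_bundle: Path) -> Path:
    """Pick the trust bundle to read when --offline is set.

    An explicit --bundle wins; otherwise fall back to whatever is already
    cached; if nothing is cached, fail instead of touching the network.
    """
    if args.bundle is not None:
        return args.bundle
    if cached_bundle.is_file():
        return cached_bundle
    raise TrustBundleError(
        "--offline was given, but no --bundle was passed and no cached trust bundle exists"
    )
```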
I think this sounds quite reasonable.
- --offline that works by side-stepping TUF altogether, just looking into the target cache to find the key material. This is likely doable in sigstore-python only
- --offline. There likely is no python-tuf issue for this yet but I think it's an interesting idea
I found one more (possibly different) requirement:
we should make sure that TUF refreshes fail gracefully without a network connection.
@di What does this mean exactly? This does not quite sound like the third point in my previous comment (avoiding requests in situations where we think it's safe)...
What does a graceful failure look like in detail? The common situations we might want to consider:
What does a graceful failure look like in detail?
Generally by this, I just mean "not raise an exception to the user in the CLI".
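One way to read "not raise an exception to the user in the CLI" is to catch network failures at the top level and turn them into a short message plus a non-zero exit code. The sketch below assumes the refresh step raises an `OSError` subclass such as `ConnectionError` when there is no network; the function and its callable parameters are hypothetical, not sigstore-python's actual entry point.

```python
import sys
from typing import Callable


def run_cli(refresh: Callable[[], None], verify: Callable[[], None]) -> int:
    """Run refresh then verify, reporting network failures without a traceback."""
    try:
        refresh()
    except OSError as exc:  # ConnectionError and TimeoutError are OSError subclasses
        print(f"error: could not refresh TUF metadata: {exc}", file=sys.stderr)
        return 1
    verify()
    return 0
```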
Opened a draft PR for Python tuf that, if implemented, should provide a clean way to get offline functionality within Sigstore again. The mention from Emile above is an implementation of this fix in a testing setting. https://github.com/theupdateframework/python-tuf/pull/2363
Looking forward to this. Any chance this might ship before the end of the year?
Thanks for the ping... We discussed the TUF aspects with @woodruffw a couple of weeks ago but it seems I did not update the issue (sorry):
"--offline", don't use the tuf client, instead assume that a) either there is a TUF cached trust root and it is the correct one or b) user provides the trust root material explicitly as arguments
The TUF workaround should be fairly easy to implement. I'm not sure if there are other aspects to --offline that need to be done.
The following is a hand wave design:
get_ctfe_keys()) will, when offline, just look into the target cache and return the cached target without verifying the target with the actual tuf client
I'm planning to add internal support for this while fixing #821: see https://github.com/sigstore/sigstore-python/issues/821#issuecomment-1855477728
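The hand-wave design above might look roughly like the following. This is a sketch under assumptions, not the code that was eventually merged: the class name, the `fetch_online` callable standing in for the real TUF client, and the `ctfe.pub` target file name are all illustrative.

```python
from pathlib import Path
from typing import Callable, List, Optional


class TrustUpdaterSketch:
    """Hypothetical stand-in for the _internal.tuf wrapper."""

    def __init__(
        self,
        targets_dir: Path,
        offline: bool,
        fetch_online: Optional[Callable[[str], bytes]] = None,
    ) -> None:
        self._targets_dir = targets_dir
        self._offline = offline
        self._fetch_online = fetch_online  # placeholder for the TUF client path

    def _get(self, name: str) -> bytes:
        if self._offline:
            # Offline: trust the target cache as-is, skipping TUF verification.
            path = self._targets_dir / name
            if not path.is_file():
                raise FileNotFoundError(f"no cached target {name!r} for offline use")
            return path.read_bytes()
        if self._fetch_online is None:
            raise RuntimeError("online mode needs a fetch_online callable")
        return self._fetch_online(name)

    def get_ctfe_keys(self) -> List[bytes]:
        # "ctfe.pub" is an assumed target name, used only for illustration.
        return [self._get("ctfe.pub")]
```

Skipping verification means the offline path is only as trustworthy as the cache directory itself, which matches the "degraded" framing earlier in the thread.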
All the pieces for this have been in place for a while, we just never plumbed it into the CLI it seems. I've opened #1143 to change --offline to also disable TUF repo updates, and confirmed that it works as expected.
Could you make a new release, please? With Python 3.14 alphas around the corner requiring it, we'd really prefer having some time to actually test and integrate it.
Could you make a new release, please? With Python 3.14 alphas around the corner requiring it, we'd really prefer having some time to actually test and integrate it.
Yep, I'll cut one in a moment. @jku will have to approve, assuming he's online.
Release cut: https://github.com/sigstore/sigstore-python/releases/tag/v3.4.0
Should be available on PyPI shortly!
Once #478 is merged, sigstore verify will have an --offline flag that disables online transparency log lookups.
This flag should also disable TUF refreshes, since those require network access. As such, this is a subset/sub-issue of #376.